Convert KVM_ARCH_WANT_MMU_NOTIFIER into a Kconfig and select it where
appropriate to effectively maintain existing behavior. Using a proper
Kconfig will simplify building more functionality on top of KVM's
mmu_notifier infrastructure.
Add a forward declaration of kvm_gfn_range to kvm_types.h so that
including arch/powerpc/include/asm/kvm_ppc.h's with CONFIG_KVM=n doesn't
generate warnings due to kvm_gfn_range being undeclared. PPC defines
hooks for PR vs. HV without guarding them via #ifdeffery, e.g.
bool (*unmap_gfn_range)(struct kvm *kvm, struct kvm_gfn_range *range);
bool (*age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
bool (*test_age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
bool (*set_spte_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
Alternatively, PPC could forward declare kvm_gfn_range, but there's no
good reason not to define it in common KVM.
Acked-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Message-Id: <20231027182217.3615211-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
1, Support PREEMPT_DYNAMIC with static keys;
2, Relax memory ordering for atomic operations;
3, Support BPF CPU v4 instructions for LoongArch;
4, Some build and runtime warning fixes.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmVQWXgWHGNoZW5odWFj
YWlAa2VybmVsLm9yZwAKCRAChivD8uImepDTEACS808EsgSNIM1+JwldhdqKOErt
XDWlLuIddVpenInx8F+9GnZJzKBU+wl+Ow5ejcVarjcecIJDv5UhoVrbhpeOHkfv
RszRXQR4p/ZNSFvdraYDjjJ9UX6bp5rq7vMUC2d9bLazMauAfwf7T/HJ5qj9OYZi
RLlcwaKo2UQHYsT7nJicjh0qpH1YpZQBYTaUUCwzilzB6vAIOTf6X12vFmhtM/i+
5RIPnesMA1IQSm2ywUODpDHCs7Pirvy8aJvx0CsYdi3xl1yg3pUS6u69Ms61uWlw
29yYhNbWmVnDikTVLTNISDb/jwto5SAVB2KQKBhF1trF4ZBNE6r7sP4m2tfllYo9
KXK9tm0U8McS5o46Qd5er6eEnxL7mEeAsc12tNKUYOMe3SIkmHJmj/rZQOtpsiBg
zqQsYkGUfO2VAwMWiGke8dxPZElOYwZ3UCOpbEpXEXy3NW71VJTIuQFGmsYKJhdy
3xaAtQxdffE5yUTt2j3Y8Mex2b2oSUBSF263imsZjzWOOxd480iaoejtamf1V779
bElevzZjMDmbiQ7kiVSf96TWc7iYcSv33jhP4DorKIqnPseYPfrXEeD1xY7JV+IU
kkvSlO0hAJzVMmQgu5n0PPT1wrVpuvwtbsfcRobIkr1vktZyLaKHRq7rh4R5HTRL
ZUUm6c0kUDywGT+J4A==
=bmFe
-----END PGP SIGNATURE-----
Merge tag 'loongarch-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- support PREEMPT_DYNAMIC with static keys
- relax memory ordering for atomic operations
- support BPF CPU v4 instructions for LoongArch
- some build and runtime warning fixes
* tag 'loongarch-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
selftests/bpf: Enable cpu v4 tests for LoongArch
LoongArch: BPF: Support signed mod instructions
LoongArch: BPF: Support signed div instructions
LoongArch: BPF: Support 32-bit offset jmp instructions
LoongArch: BPF: Support unconditional bswap instructions
LoongArch: BPF: Support sign-extension mov instructions
LoongArch: BPF: Support sign-extension load instructions
LoongArch: Add more instruction opcodes and emit_* helpers
LoongArch/smp: Call rcutree_report_cpu_starting() earlier
LoongArch: Relax memory ordering for atomic operations
LoongArch: Mark __percpu functions as always inline
LoongArch: Disable module from accessing external data directly
LoongArch: Support PREEMPT_DYNAMIC with static keys
Add support for 32-bit offset jmp instruction. Currently, we use b
instruction which supports range within ±128MB for such jumps. This
should be large enough for BPF progs.
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add support for unconditional bswap instruction. Since LoongArch is
always little-endian, just treat unconditional bswap the same as big-
endian conversion.
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This patch adds more instruction opcodes and their corresponding emit_*
helpers which will be used in later patches.
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
A recent change to the optimization pipeline in LLVM reveals some
fragility around the inlining of LoongArch's __percpu functions, which
manifests as a BUILD_BUG() failure:
In file included from kernel/sched/build_policy.c:17:
In file included from include/linux/sched/cputime.h:5:
In file included from include/linux/sched/signal.h:5:
In file included from include/linux/rculist.h:11:
In file included from include/linux/rcupdate.h:26:
In file included from include/linux/irqflags.h:18:
arch/loongarch/include/asm/percpu.h:97:3: error: call to '__compiletime_assert_51' declared with 'error' attribute: BUILD_BUG failed
97 | BUILD_BUG();
| ^
include/linux/build_bug.h:59:21: note: expanded from macro 'BUILD_BUG'
59 | #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed")
| ^
include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG'
39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
| ^
include/linux/compiler_types.h:425:2: note: expanded from macro 'compiletime_assert'
425 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:413:2: note: expanded from macro '_compiletime_assert'
413 | __compiletime_assert(condition, msg, prefix, suffix)
| ^
include/linux/compiler_types.h:406:4: note: expanded from macro '__compiletime_assert'
406 | prefix ## suffix(); \
| ^
<scratch space>:86:1: note: expanded from here
86 | __compiletime_assert_51
| ^
1 error generated.
If these functions are not inlined (which the compiler is free to do
even with functions marked with the standard 'inline' keyword), the
BUILD_BUG() in the default case cannot be eliminated since the compiler
cannot prove it is never used, resulting in a build failure due to the
error attribute.
Mark these functions as __always_inline to guarantee inlining so that
the BUILD_BUG() only triggers when the default case genuinely cannot be
eliminated due to an unexpected size.
Cc: <stable@vger.kernel.org>
Closes: https://github.com/ClangBuiltLinux/linux/issues/1955
Fixes: 46859ac8af ("LoongArch: Add multi-processor (SMP) support")
Link: 1a2e77cf9e
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The distance between vmlinux and the module is too far so that PC-REL
cannot be accessed directly, only GOT.
When compiling module with GCC, the option `-mdirect-extern-access` is
disabled by default. The Clang option `-fdirect-access-external-data` is
enabled by default, so it needs to be explicitly disabled.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Since commit 4e90d0522a ("riscv: support PREEMPT_DYNAMIC with
static keys"), the infrastructure is complete and we can simply select
HAVE_PREEMPT_DYNAMIC_KEY to enable PREEMPT_DYNAMIC on LoongArch because
we already support static keys.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
- Implement the binary search in modpost for faster symbol lookup
- Respect HOSTCC when linking host programs written in Rust
- Change the binrpm-pkg target to generate kernel-devel RPM package
- Fix endianness issues for tee and ishtp MODULE_DEVICE_TABLE
- Unify vdso_install rules
- Remove unused __memexit* annotations
- Eliminate stale whitelisting for __devinit/__devexit from modpost
- Enable dummy-tools to handle the -fpatchable-function-entry flag
- Add 'userldlibs' syntax
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmVFIZgVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGeKwP+wd2kCrxAgS4zPffOcO3cVHfZwJe
AXOrTp/v73gzxb9eHXH6TmEDf1Rv7EwW3fmmGJosopJGD6itBqzJa4bNDrbq40rY
XStmg0NRmTrIG20CHGgaGWxb8/7WMrYfu0rhFdUXJjmbny6XwJ3US9FvDPC0mZz7
w9VCq5CZOqMsJcQyGkAR7uCHDRzNWiZ/Vnfbz3aa6abFzp7dsjhOgDy5SQ6qZgQz
AwHHKNEN+G3HWmGDZqcbV9aDaCk4btnz64h843RAxjy2HNJF360Ohm2KOcdJr5lo
DSSStkogBkZNSRQPtqtfknDjzITjeF4JAnUw5ivOtt8ERaO3JRUcr5gHjfw5iV/n
o4pC1SXmFzdfoN4dogoYF9rz3j955mSFlT/DSbSbuQS/ELzQs0nsqERxhV4zNCsX
KvYPUqKzZLW3i8pHNuhh7z7t4Nbz1zXqUa19FvaLNtFTCtS8/IA868a59S0uqT9I
EAIqrNy9qAsk8UuQUxWVx0qf9f5wKGYxW62iMIF9F2lsFRWA8H588CFPUuSU9Bhk
KAsvzq249MUGJd0RAjF92EWJgNz/nYzZfFTEL5HKAVauYY5UCyR3AVjrak761I8z
ctVskA7eVkaW4eARfcp15Fna15FHVzxBJ3B26oKYIJBQfJLjzZcV8XeMtEcQjEGU
jzl+oRqB/Q3oD7Nx
=PeX7
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Implement the binary search in modpost for faster symbol lookup
- Respect HOSTCC when linking host programs written in Rust
- Change the binrpm-pkg target to generate kernel-devel RPM package
- Fix endianness issues for tee and ishtp MODULE_DEVICE_TABLE
- Unify vdso_install rules
- Remove unused __memexit* annotations
- Eliminate stale whitelisting for __devinit/__devexit from modpost
- Enable dummy-tools to handle the -fpatchable-function-entry flag
- Add 'userldlibs' syntax
* tag 'kbuild-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (30 commits)
kbuild: support 'userldlibs' syntax
kbuild: dummy-tools: pretend we understand -fpatchable-function-entry
kbuild: Correct missing architecture-specific hyphens
modpost: squash ALL_{INIT,EXIT}_TEXT_SECTIONS to ALL_TEXT_SECTIONS
modpost: merge sectioncheck table entries regarding init/exit sections
modpost: use ALL_INIT_SECTIONS for the section check from DATA_SECTIONS
modpost: disallow the combination of EXPORT_SYMBOL and __meminit*
modpost: remove EXIT_SECTIONS macro
modpost: remove MEM_INIT_SECTIONS macro
modpost: remove more symbol patterns from the section check whitelist
modpost: disallow *driver to reference .meminit* sections
linux/init: remove __memexit* annotations
modpost: remove ALL_EXIT_DATA_SECTIONS macro
kbuild: simplify cmd_ld_multi_m
kbuild: avoid too many execution of scripts/pahole-flags.sh
kbuild: remove ARCH_POSTLINK from module builds
kbuild: unify no-compiler-targets and no-sync-config-targets
kbuild: unify vdso_install rules
docs: kbuild: add INSTALL_DTBS_PATH
UML: remove unused cmd_vdso_install
...
Here is the big set of tty/serial driver changes for 6.7-rc1. Included
in here are:
- console/vgacon cleanups and removals from Arnd
- tty core and n_tty cleanups from Jiri
- lots of 8250 driver updates and cleanups
- sc16is7xx serial driver updates
- dt binding updates
- first set of port lock wrapers from Thomas for the printk fixes
coming in future releases
- other small serial and tty core cleanups and updates
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZUTbaw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yk9+gCeKdoRb8FDwGCO/GaoHwR4EzwQXhQAoKXZRmN5
LTtw9sbfGIiBdOTtgLPb
=6PJr
-----END PGP SIGNATURE-----
Merge tag 'tty-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty and serial updates from Greg KH:
"Here is the big set of tty/serial driver changes for 6.7-rc1. Included
in here are:
- console/vgacon cleanups and removals from Arnd
- tty core and n_tty cleanups from Jiri
- lots of 8250 driver updates and cleanups
- sc16is7xx serial driver updates
- dt binding updates
- first set of port lock wrapers from Thomas for the printk fixes
coming in future releases
- other small serial and tty core cleanups and updates
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (193 commits)
serdev: Replace custom code with device_match_acpi_handle()
serdev: Simplify devm_serdev_device_open() function
serdev: Make use of device_set_node()
tty: n_gsm: add copyright Siemens Mobility GmbH
tty: n_gsm: fix race condition in status line change on dead connections
serial: core: Fix runtime PM handling for pending tx
vgacon: fix mips/sibyte build regression
dt-bindings: serial: drop unsupported samsung bindings
tty: serial: samsung: drop earlycon support for unsupported platforms
tty: 8250: Add note for PX-835
tty: 8250: Fix IS-200 PCI ID comment
tty: 8250: Add Brainboxes Oxford Semiconductor-based quirks
tty: 8250: Add support for Intashield IX cards
tty: 8250: Add support for additional Brainboxes PX cards
tty: 8250: Fix up PX-803/PX-857
tty: 8250: Fix port count of PX-257
tty: 8250: Add support for Intashield IS-100
tty: 8250: Add support for Brainboxes UP cards
tty: 8250: Add support for additional Brainboxes UC cards
tty: 8250: Remove UC-257 and UC-431
...
there's little I can say which isn't in the individual changelogs.
The lengthier patch series are
- "kdump: use generic functions to simplify crashkernel reservation in
arch", from Baoquan He. This is mainly cleanups and consolidation of
the "crashkernel=" kernel parameter handling.
- After much discussion, David Laight's "minmax: Relax type checks in
min() and max()" is here. Hopefully reduces some typecasting and the
use of min_t() and max_t().
- A group of patches from Oleg Nesterov which clean up and slightly fix
our handling of reads from /proc/PID/task/... and which remove
task_struct.therad_group.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZUQP9wAKCRDdBJ7gKXxA
jmOAAQDh8sxagQYocoVsSm28ICqXFeaY9Co1jzBIDdNesAvYVwD/c2DHRqJHEiS4
63BNcG3+hM9nwGJHb5lyh5m79nBMRg0=
=On4u
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
"As usual, lots of singleton and doubleton patches all over the tree
and there's little I can say which isn't in the individual changelogs.
The lengthier patch series are
- 'kdump: use generic functions to simplify crashkernel reservation
in arch', from Baoquan He. This is mainly cleanups and
consolidation of the 'crashkernel=' kernel parameter handling
- After much discussion, David Laight's 'minmax: Relax type checks in
min() and max()' is here. Hopefully reduces some typecasting and
the use of min_t() and max_t()
- A group of patches from Oleg Nesterov which clean up and slightly
fix our handling of reads from /proc/PID/task/... and which remove
task_struct.thread_group"
* tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits)
scripts/gdb/vmalloc: disable on no-MMU
scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n
.mailmap: add address mapping for Tomeu Vizoso
mailmap: update email address for Claudiu Beznea
tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions
.mailmap: map Benjamin Poirier's address
scripts/gdb: add lx_current support for riscv
ocfs2: fix a spelling typo in comment
proc: test ProtectionKey in proc-empty-vm test
proc: fix proc-empty-vm test with vsyscall
fs/proc/base.c: remove unneeded semicolon
do_io_accounting: use sig->stats_lock
do_io_accounting: use __for_each_thread()
ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error()
ocfs2: fix a typo in a comment
scripts/show_delta: add __main__ judgement before main code
treewide: mark stuff as __ro_after_init
fs: ocfs2: check status values
proc: test /proc/${pid}/statm
compiler.h: move __is_constexpr() to compiler.h
...
included in this merge do the following:
- Kemeng Shi has contributed some compation maintenance work in the
series "Fixes and cleanups to compaction".
- Joel Fernandes has a patchset ("Optimize mremap during mutual
alignment within PMD") which fixes an obscure issue with mremap()'s
pagetable handling during a subsequent exec(), based upon an
implementation which Linus suggested.
- More DAMON/DAMOS maintenance and feature work from SeongJae Park i the
following patch series:
mm/damon: misc fixups for documents, comments and its tracepoint
mm/damon: add a tracepoint for damos apply target regions
mm/damon: provide pseudo-moving sum based access rate
mm/damon: implement DAMOS apply intervals
mm/damon/core-test: Fix memory leaks in core-test
mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval
- In the series "Do not try to access unaccepted memory" Adrian Hunter
provides some fixups for the recently-added "unaccepted memory' feature.
To increase the feature's checking coverage. "Plug a few gaps where
RAM is exposed without checking if it is unaccepted memory".
- In the series "cleanups for lockless slab shrink" Qi Zheng has done
some maintenance work which is preparation for the lockless slab
shrinking code.
- Qi Zheng has redone the earlier (and reverted) attempt to make slab
shrinking lockless in the series "use refcount+RCU method to implement
lockless slab shrink".
- David Hildenbrand contributes some maintenance work for the rmap code
in the series "Anon rmap cleanups".
- Kefeng Wang does more folio conversions and some maintenance work in
the migration code. Series "mm: migrate: more folio conversion and
unification".
- Matthew Wilcox has fixed an issue in the buffer_head code which was
causing long stalls under some heavy memory/IO loads. Some cleanups
were added on the way. Series "Add and use bdev_getblk()".
- In the series "Use nth_page() in place of direct struct page
manipulation" Zi Yan has fixed a potential issue with the direct
manipulation of hugetlb page frames.
- In the series "mm: hugetlb: Skip initialization of gigantic tail
struct pages if freed by HVO" has improved our handling of gigantic
pages in the hugetlb vmmemmep optimizaton code. This provides
significant boot time improvements when significant amounts of gigantic
pages are in use.
- Matthew Wilcox has sent the series "Small hugetlb cleanups" - code
rationalization and folio conversions in the hugetlb code.
- Yin Fengwei has improved mlock()'s handling of large folios in the
series "support large folio for mlock"
- In the series "Expose swapcache stat for memcg v1" Liu Shixin has
added statistics for memcg v1 users which are available (and useful)
under memcg v2.
- Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
prctl so that userspace may direct the kernel to not automatically
propagate the denial to child processes. The series is named "MDWE
without inheritance".
- Kefeng Wang has provided the series "mm: convert numa balancing
functions to use a folio" which does what it says.
- In the series "mm/ksm: add fork-exec support for prctl" Stefan Roesch
makes is possible for a process to propagate KSM treatment across
exec().
- Huang Ying has enhanced memory tiering's calculation of memory
distances. This is used to permit the dax/kmem driver to use "high
bandwidth memory" in addition to Optane Data Center Persistent Memory
Modules (DCPMM). The series is named "memory tiering: calculate
abstract distance based on ACPI HMAT"
- In the series "Smart scanning mode for KSM" Stefan Roesch has
optimized KSM by teaching it to retain and use some historical
information from previous scans.
- Yosry Ahmed has fixed some inconsistencies in memcg statistics in the
series "mm: memcg: fix tracking of pending stats updates values".
- In the series "Implement IOCTL to get and optionally clear info about
PTEs" Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits
us to atomically read-then-clear page softdirty state. This is mainly
used by CRIU.
- Hugh Dickins contributed the series "shmem,tmpfs: general maintenance"
- a bunch of relatively minor maintenance tweaks to this code.
- Matthew Wilcox has increased the use of the VMA lock over file-backed
page faults in the series "Handle more faults under the VMA lock". Some
rationalizations of the fault path became possible as a result.
- In the series "mm/rmap: convert page_move_anon_rmap() to
folio_move_anon_rmap()" David Hildenbrand has implemented some cleanups
and folio conversions.
- In the series "various improvements to the GUP interface" Lorenzo
Stoakes has simplified and improved the GUP interface with an eye to
providing groundwork for future improvements.
- Andrey Konovalov has sent along the series "kasan: assorted fixes and
improvements" which does those things.
- Some page allocator maintenance work from Kemeng Shi in the series
"Two minor cleanups to break_down_buddy_pages".
- In thes series "New selftest for mm" Breno Leitao has developed
another MM self test which tickles a race we had between madvise() and
page faults.
- In the series "Add folio_end_read" Matthew Wilcox provides cleanups
and an optimization to the core pagecache code.
- Nhat Pham has added memcg accounting for hugetlb memory in the series
"hugetlb memcg accounting".
- Cleanups and rationalizations to the pagemap code from Lorenzo
Stoakes, in the series "Abstract vma_merge() and split_vma()".
- Audra Mitchell has fixed issues in the procfs page_owner code's new
timestamping feature which was causing some misbehaviours. In the
series "Fix page_owner's use of free timestamps".
- Lorenzo Stoakes has fixed the handling of new mappings of sealed files
in the series "permit write-sealed memfd read-only shared mappings".
- Mike Kravetz has optimized the hugetlb vmemmap optimization in the
series "Batch hugetlb vmemmap modification operations".
- Some buffer_head folio conversions and cleanups from Matthew Wilcox in
the series "Finish the create_empty_buffers() transition".
- As a page allocator performance optimization Huang Ying has added
automatic tuning to the allocator's per-cpu-pages feature, in the series
"mm: PCP high auto-tuning".
- Roman Gushchin has contributed the patchset "mm: improve performance
of accounted kernel memory allocations" which improves their performance
by ~30% as measured by a micro-benchmark.
- folio conversions from Kefeng Wang in the series "mm: convert page
cpupid functions to folios".
- Some kmemleak fixups in Liu Shixin's series "Some bugfix about
kmemleak".
- Qi Zheng has improved our handling of memoryless nodes by keeping them
off the allocation fallback list. This is done in the series "handle
memoryless nodes more appropriately".
- khugepaged conversions from Vishal Moola in the series "Some
khugepaged folio conversions".
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZULEMwAKCRDdBJ7gKXxA
jhQHAQCYpD3g849x69DmHnHWHm/EHQLvQmRMDeYZI+nx/sCJOwEAw4AKg0Oemv9y
FgeUPAD1oasg6CP+INZvCj34waNxwAc=
=E+Y4
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Many singleton patches against the MM code. The patch series which are
included in this merge do the following:
- Kemeng Shi has contributed some compation maintenance work in the
series 'Fixes and cleanups to compaction'
- Joel Fernandes has a patchset ('Optimize mremap during mutual
alignment within PMD') which fixes an obscure issue with mremap()'s
pagetable handling during a subsequent exec(), based upon an
implementation which Linus suggested
- More DAMON/DAMOS maintenance and feature work from SeongJae Park i
the following patch series:
mm/damon: misc fixups for documents, comments and its tracepoint
mm/damon: add a tracepoint for damos apply target regions
mm/damon: provide pseudo-moving sum based access rate
mm/damon: implement DAMOS apply intervals
mm/damon/core-test: Fix memory leaks in core-test
mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval
- In the series 'Do not try to access unaccepted memory' Adrian
Hunter provides some fixups for the recently-added 'unaccepted
memory' feature. To increase the feature's checking coverage. 'Plug
a few gaps where RAM is exposed without checking if it is
unaccepted memory'
- In the series 'cleanups for lockless slab shrink' Qi Zheng has done
some maintenance work which is preparation for the lockless slab
shrinking code
- Qi Zheng has redone the earlier (and reverted) attempt to make slab
shrinking lockless in the series 'use refcount+RCU method to
implement lockless slab shrink'
- David Hildenbrand contributes some maintenance work for the rmap
code in the series 'Anon rmap cleanups'
- Kefeng Wang does more folio conversions and some maintenance work
in the migration code. Series 'mm: migrate: more folio conversion
and unification'
- Matthew Wilcox has fixed an issue in the buffer_head code which was
causing long stalls under some heavy memory/IO loads. Some cleanups
were added on the way. Series 'Add and use bdev_getblk()'
- In the series 'Use nth_page() in place of direct struct page
manipulation' Zi Yan has fixed a potential issue with the direct
manipulation of hugetlb page frames
- In the series 'mm: hugetlb: Skip initialization of gigantic tail
struct pages if freed by HVO' has improved our handling of gigantic
pages in the hugetlb vmmemmep optimizaton code. This provides
significant boot time improvements when significant amounts of
gigantic pages are in use
- Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code
rationalization and folio conversions in the hugetlb code
- Yin Fengwei has improved mlock()'s handling of large folios in the
series 'support large folio for mlock'
- In the series 'Expose swapcache stat for memcg v1' Liu Shixin has
added statistics for memcg v1 users which are available (and
useful) under memcg v2
- Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
prctl so that userspace may direct the kernel to not automatically
propagate the denial to child processes. The series is named 'MDWE
without inheritance'
- Kefeng Wang has provided the series 'mm: convert numa balancing
functions to use a folio' which does what it says
- In the series 'mm/ksm: add fork-exec support for prctl' Stefan
Roesch makes is possible for a process to propagate KSM treatment
across exec()
- Huang Ying has enhanced memory tiering's calculation of memory
distances. This is used to permit the dax/kmem driver to use 'high
bandwidth memory' in addition to Optane Data Center Persistent
Memory Modules (DCPMM). The series is named 'memory tiering:
calculate abstract distance based on ACPI HMAT'
- In the series 'Smart scanning mode for KSM' Stefan Roesch has
optimized KSM by teaching it to retain and use some historical
information from previous scans
- Yosry Ahmed has fixed some inconsistencies in memcg statistics in
the series 'mm: memcg: fix tracking of pending stats updates
values'
- In the series 'Implement IOCTL to get and optionally clear info
about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap
which permits us to atomically read-then-clear page softdirty
state. This is mainly used by CRIU
- Hugh Dickins contributed the series 'shmem,tmpfs: general
maintenance', a bunch of relatively minor maintenance tweaks to
this code
- Matthew Wilcox has increased the use of the VMA lock over
file-backed page faults in the series 'Handle more faults under the
VMA lock'. Some rationalizations of the fault path became possible
as a result
- In the series 'mm/rmap: convert page_move_anon_rmap() to
folio_move_anon_rmap()' David Hildenbrand has implemented some
cleanups and folio conversions
- In the series 'various improvements to the GUP interface' Lorenzo
Stoakes has simplified and improved the GUP interface with an eye
to providing groundwork for future improvements
- Andrey Konovalov has sent along the series 'kasan: assorted fixes
and improvements' which does those things
- Some page allocator maintenance work from Kemeng Shi in the series
'Two minor cleanups to break_down_buddy_pages'
- In thes series 'New selftest for mm' Breno Leitao has developed
another MM self test which tickles a race we had between madvise()
and page faults
- In the series 'Add folio_end_read' Matthew Wilcox provides cleanups
and an optimization to the core pagecache code
- Nhat Pham has added memcg accounting for hugetlb memory in the
series 'hugetlb memcg accounting'
- Cleanups and rationalizations to the pagemap code from Lorenzo
Stoakes, in the series 'Abstract vma_merge() and split_vma()'
- Audra Mitchell has fixed issues in the procfs page_owner code's new
timestamping feature which was causing some misbehaviours. In the
series 'Fix page_owner's use of free timestamps'
- Lorenzo Stoakes has fixed the handling of new mappings of sealed
files in the series 'permit write-sealed memfd read-only shared
mappings'
- Mike Kravetz has optimized the hugetlb vmemmap optimization in the
series 'Batch hugetlb vmemmap modification operations'
- Some buffer_head folio conversions and cleanups from Matthew Wilcox
in the series 'Finish the create_empty_buffers() transition'
- As a page allocator performance optimization Huang Ying has added
automatic tuning to the allocator's per-cpu-pages feature, in the
series 'mm: PCP high auto-tuning'
- Roman Gushchin has contributed the patchset 'mm: improve
performance of accounted kernel memory allocations' which improves
their performance by ~30% as measured by a micro-benchmark
- folio conversions from Kefeng Wang in the series 'mm: convert page
cpupid functions to folios'
- Some kmemleak fixups in Liu Shixin's series 'Some bugfix about
kmemleak'
- Qi Zheng has improved our handling of memoryless nodes by keeping
them off the allocation fallback list. This is done in the series
'handle memoryless nodes more appropriately'
- khugepaged conversions from Vishal Moola in the series 'Some
khugepaged folio conversions'"
[ bcachefs conflicts with the dynamically allocated shrinkers have been
resolved as per Stephen Rothwell in
https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/
with help from Qi Zheng.
The clone3 test filtering conflict was half-arsed by yours truly ]
* tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits)
mm/damon/sysfs: update monitoring target regions for online input commit
mm/damon/sysfs: remove requested targets when online-commit inputs
selftests: add a sanity check for zswap
Documentation: maple_tree: fix word spelling error
mm/vmalloc: fix the unchecked dereference warning in vread_iter()
zswap: export compression failure stats
Documentation: ubsan: drop "the" from article title
mempolicy: migration attempt to match interleave nodes
mempolicy: mmap_lock is not needed while migrating folios
mempolicy: alloc_pages_mpol() for NUMA policy without vma
mm: add page_rmappable_folio() wrapper
mempolicy: remove confusing MPOL_MF_LAZY dead code
mempolicy: mpol_shared_policy_init() without pseudo-vma
mempolicy trivia: use pgoff_t in shared mempolicy tree
mempolicy trivia: slightly more consistent naming
mempolicy trivia: delete those ancient pr_debug()s
mempolicy: fix migrate_pages(2) syscall return nr_failed
kernfs: drop shared NUMA mempolicy hooks
hugetlbfs: drop shared NUMA mempolicy pretence
mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets()
...
API:
- Add virtual-address based lskcipher interface.
- Optimise ahash/shash performance in light of costly indirect calls.
- Remove ahash alignmask attribute.
Algorithms:
- Improve AES/XTS performance of 6-way unrolling for ppc.
- Remove some uses of obsolete algorithms (md4, md5, sha1).
- Add FIPS 202 SHA-3 support in pkcs1pad.
- Add fast path for single-page messages in adiantum.
- Remove zlib-deflate.
Drivers:
- Add support for S4 in meson RNG driver.
- Add STM32MP13x support in stm32.
- Add hwrng interface support in qcom-rng.
- Add support for deflate algorithm in hisilicon/zip.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmVB3vgACgkQxycdCkmx
i6dsOBAAykbnX8BpnpnOXYywE9ZWrl98rAk51MK0N9olZNfg78zRPIv7fFxFdC20
SDJrDSNPmn0Qvaa5e0EfoAdklsm0k2GkXL/BwPKMKWUsyIoJVYI3WrBMnjBy9xMp
yfME+h0bKoXJCZKnYkIUSGUejmUPSyRlEylrXoFlH/VWYwAaii/x9zwreQoF+0LR
KI24A1q8AYs6Dw9HSfndaAub9GOzrqKYs6fSaMG+77Y4UC5aoi5J9Bp2G3uVyHay
x/0bZtIxKXS9wn+LeG/3GspX23x/I5VwBOdAoMigrYmAIaIg5qgyMszudltTAs4R
zF1Kh7WsnM5+vpnBSeigzo+/GGOU3QTz8y3tBTg+3ZR7GWGOwQLiizhOYqCyOfAH
pIm6c++sZw/OOHiL69Nt4HeLKzGNYYWk3s4X/B/6cqoouPfOsfBaQobZNx9zfy7q
ZNEvSVBjrFX/L6wDSotny1LTWLUNjHbmLaMV5uQZ/SQKEtv19fp2Dl7SsLkHH+3v
ldOAwfoJR6QcSwz3Ez02TUAvQhtP172Hnxi7u44eiZu2aUboLhCFr7aEU6kVdBCx
1rIRVHD1oqlOEDRwPRXzhF3I8R4QDORJIxZ6UUhg7yueuI+XCGDsBNC+LqBrBmSR
IbdjqmSDUBhJyM5yMnt1VFYhqKQ/ZzwZ3JQviwW76Es9pwEIolM=
=IZmR
-----END PGP SIGNATURE-----
Merge tag 'v6.7-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Add virtual-address based lskcipher interface
- Optimise ahash/shash performance in light of costly indirect calls
- Remove ahash alignmask attribute
Algorithms:
- Improve AES/XTS performance of 6-way unrolling for ppc
- Remove some uses of obsolete algorithms (md4, md5, sha1)
- Add FIPS 202 SHA-3 support in pkcs1pad
- Add fast path for single-page messages in adiantum
- Remove zlib-deflate
Drivers:
- Add support for S4 in meson RNG driver
- Add STM32MP13x support in stm32
- Add hwrng interface support in qcom-rng
- Add support for deflate algorithm in hisilicon/zip"
* tag 'v6.7-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (283 commits)
crypto: adiantum - flush destination page before unmapping
crypto: testmgr - move pkcs1pad(rsa,sha3-*) to correct place
Documentation/module-signing.txt: bring up to date
module: enable automatic module signing with FIPS 202 SHA-3
crypto: asymmetric_keys - allow FIPS 202 SHA-3 signatures
crypto: rsa-pkcs1pad - Add FIPS 202 SHA-3 support
crypto: FIPS 202 SHA-3 register in hash info for IMA
x509: Add OIDs for FIPS 202 SHA-3 hash and signatures
crypto: ahash - optimize performance when wrapping shash
crypto: ahash - check for shash type instead of not ahash type
crypto: hash - move "ahash wrapping shash" functions to ahash.c
crypto: talitos - stop using crypto_ahash::init
crypto: chelsio - stop using crypto_ahash::init
crypto: ahash - improve file comment
crypto: ahash - remove struct ahash_request_priv
crypto: ahash - remove crypto_ahash_alignmask
crypto: gcm - stop using alignmask of ahash
crypto: chacha20poly1305 - stop using alignmask of ahash
crypto: ccm - stop using alignmask of ahash
net: ipv6: stop checking crypto_ahash_alignmask
...
* Generalized infrastructure for 'writable' ID registers, effectively
allowing userspace to opt-out of certain vCPU features for its guest
* Optimization for vSGI injection, opportunistically compressing MPIDR
to vCPU mapping into a table
* Improvements to KVM's PMU emulation, allowing userspace to select
the number of PMCs available to a VM
* Guest support for memory operation instructions (FEAT_MOPS)
* Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
bugs and getting rid of useless code
* Changes to the way the SMCCC filter is constructed, avoiding wasted
memory allocations when not in use
* Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
the overhead of errata mitigations
* Miscellaneous kernel and selftest fixes
LoongArch:
* New architecture. The hardware uses the same model as x86, s390
and RISC-V, where guest/host mode is orthogonal to supervisor/user
mode. The virtualization extensions are very similar to MIPS,
therefore the code also has some similarities but it's been cleaned
up to avoid some of the historical bogosities that are found in
arch/mips. The kernel emulates MMU, timer and CSR accesses, while
interrupt controllers are only emulated in userspace, at least for
now.
RISC-V:
* Support for the Smstateen and Zicond extensions
* Support for virtualizing senvcfg
* Support for virtualized SBI debug console (DBCN)
S390:
* Nested page table management can be monitored through tracepoints
and statistics
x86:
* Fix incorrect handling of VMX posted interrupt descriptor in KVM_SET_LAPIC,
which could result in a dropped timer IRQ
* Avoid WARN on systems with Intel IPI virtualization
* Add CONFIG_KVM_MAX_NR_VCPUS, to allow supporting up to 4096 vCPUs without
forcing more common use cases to eat the extra memory overhead.
* Add virtualization support for AMD SRSO mitigation (IBPB_BRTYPE and
SBPB, aka Selective Branch Predictor Barrier).
* Fix a bug where restoring a vCPU snapshot that was taken within 1 second of
creating the original vCPU would cause KVM to try to synchronize the vCPU's
TSC and thus clobber the correct TSC being set by userspace.
* Compute guest wall clock using a single TSC read to avoid generating an
inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads.
* "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain
about a "Firmware Bug" if the bit isn't set for select F/M/S combos.
Likewise "virtualize" (ignore) MSR_AMD64_TW_CFG to appease Windows Server
2022.
* Don't apply side effects to Hyper-V's synthetic timer on writes from
userspace to fix an issue where the auto-enable behavior can trigger
spurious interrupts, i.e. do auto-enabling only for guest writes.
* Remove an unnecessary kick of all vCPUs when synchronizing the dirty log
without PML enabled.
* Advertise "support" for non-serializing FS/GS base MSR writes as appropriate.
* Harden the fast page fault path to guard against encountering an invalid
root when walking SPTEs.
* Omit "struct kvm_vcpu_xen" entirely when CONFIG_KVM_XEN=n.
* Use the fast path directly from the timer callback when delivering Xen
timer events, instead of waiting for the next iteration of the run loop.
This was not done so far because previously proposed code had races,
but now care is taken to stop the hrtimer at critical points such as
restarting the timer or saving the timer information for userspace.
* Follow the lead of upstream Xen and ignore the VCPU_SSHOTTMR_future flag.
* Optimize injection of PMU interrupts that are simultaneous with NMIs.
* Usual handful of fixes for typos and other warts.
x86 - MTRR/PAT fixes and optimizations:
* Clean up code that deals with honoring guest MTRRs when the VM has
non-coherent DMA and host MTRRs are ignored, i.e. EPT is enabled.
* Zap EPT entries when non-coherent DMA assignment stops/start to prevent
using stale entries with the wrong memtype.
* Don't ignore guest PAT for CR0.CD=1 && KVM_X86_QUIRK_CD_NW_CLEARED=y.
This was done as a workaround for virtual machine BIOSes that did not
bother to clear CR0.CD (because ancient KVM/QEMU did not bother to
set it, in turn), and there's zero reason to extend the quirk to
also ignore guest PAT.
x86 - SEV fixes:
* Report KVM_EXIT_SHUTDOWN instead of EINVAL if KVM intercepts SHUTDOWN while
running an SEV-ES guest.
* Clean up the recognition of emulation failures on SEV guests, when KVM would
like to "skip" the instruction but it had already been partially emulated.
This makes it possible to drop a hack that second guessed the (insufficient)
information provided by the emulator, and just do the right thing.
Documentation:
* Various updates and fixes, mostly for x86
* MTRR and PAT fixes and optimizations:
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmVBZc0UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroP1LQf+NgsmZ1lkGQlKdSdijoQ856w+k0or
l2SV1wUwiEdFPSGK+RTUlHV5Y1ni1dn/CqCVIJZKEI3ZtZ1m9/4HKIRXvbMwFHIH
hx+E4Lnf8YUjsGjKTLd531UKcpphztZavQ6pXLEwazkSkDEra+JIKtooI8uU+9/p
bd/eF1V+13a8CHQf1iNztFJVxqBJbVlnPx4cZDRQQvewskIDGnVDtwbrwCUKGtzD
eNSzhY7si6O2kdQNkuA8xPhg29dYX9XLaCK2K1l8xOUm8WipLdtF86GAKJ5BVuOL
6ek/2QCYjZ7a+coAZNfgSEUi8JmFHEqCo7cnKmWzPJp+2zyXsdudqAhT1g==
=UIxm
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Generalized infrastructure for 'writable' ID registers, effectively
allowing userspace to opt-out of certain vCPU features for its
guest
- Optimization for vSGI injection, opportunistically compressing
MPIDR to vCPU mapping into a table
- Improvements to KVM's PMU emulation, allowing userspace to select
the number of PMCs available to a VM
- Guest support for memory operation instructions (FEAT_MOPS)
- Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
bugs and getting rid of useless code
- Changes to the way the SMCCC filter is constructed, avoiding wasted
memory allocations when not in use
- Load the stage-2 MMU context at vcpu_load() for VHE systems,
reducing the overhead of errata mitigations
- Miscellaneous kernel and selftest fixes
LoongArch:
- New architecture for kvm.
The hardware uses the same model as x86, s390 and RISC-V, where
guest/host mode is orthogonal to supervisor/user mode. The
virtualization extensions are very similar to MIPS, therefore the
code also has some similarities but it's been cleaned up to avoid
some of the historical bogosities that are found in arch/mips. The
kernel emulates MMU, timer and CSR accesses, while interrupt
controllers are only emulated in userspace, at least for now.
RISC-V:
- Support for the Smstateen and Zicond extensions
- Support for virtualizing senvcfg
- Support for virtualized SBI debug console (DBCN)
S390:
- Nested page table management can be monitored through tracepoints
and statistics
x86:
- Fix incorrect handling of VMX posted interrupt descriptor in
KVM_SET_LAPIC, which could result in a dropped timer IRQ
- Avoid WARN on systems with Intel IPI virtualization
- Add CONFIG_KVM_MAX_NR_VCPUS, to allow supporting up to 4096 vCPUs
without forcing more common use cases to eat the extra memory
overhead.
- Add virtualization support for AMD SRSO mitigation (IBPB_BRTYPE and
SBPB, aka Selective Branch Predictor Barrier).
- Fix a bug where restoring a vCPU snapshot that was taken within 1
second of creating the original vCPU would cause KVM to try to
synchronize the vCPU's TSC and thus clobber the correct TSC being
set by userspace.
- Compute guest wall clock using a single TSC read to avoid
generating an inaccurate time, e.g. if the vCPU is preempted
between multiple TSC reads.
- "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which
complain about a "Firmware Bug" if the bit isn't set for select
F/M/S combos. Likewise "virtualize" (ignore) MSR_AMD64_TW_CFG to
appease Windows Server 2022.
- Don't apply side effects to Hyper-V's synthetic timer on writes
from userspace to fix an issue where the auto-enable behavior can
trigger spurious interrupts, i.e. do auto-enabling only for guest
writes.
- Remove an unnecessary kick of all vCPUs when synchronizing the
dirty log without PML enabled.
- Advertise "support" for non-serializing FS/GS base MSR writes as
appropriate.
- Harden the fast page fault path to guard against encountering an
invalid root when walking SPTEs.
- Omit "struct kvm_vcpu_xen" entirely when CONFIG_KVM_XEN=n.
- Use the fast path directly from the timer callback when delivering
Xen timer events, instead of waiting for the next iteration of the
run loop. This was not done so far because previously proposed code
had races, but now care is taken to stop the hrtimer at critical
points such as restarting the timer or saving the timer information
for userspace.
- Follow the lead of upstream Xen and ignore the VCPU_SSHOTTMR_future
flag.
- Optimize injection of PMU interrupts that are simultaneous with
NMIs.
- Usual handful of fixes for typos and other warts.
x86 - MTRR/PAT fixes and optimizations:
- Clean up code that deals with honoring guest MTRRs when the VM has
non-coherent DMA and host MTRRs are ignored, i.e. EPT is enabled.
- Zap EPT entries when non-coherent DMA assignment stops/start to
prevent using stale entries with the wrong memtype.
- Don't ignore guest PAT for CR0.CD=1 && KVM_X86_QUIRK_CD_NW_CLEARED=y
This was done as a workaround for virtual machine BIOSes that did
not bother to clear CR0.CD (because ancient KVM/QEMU did not bother
to set it, in turn), and there's zero reason to extend the quirk to
also ignore guest PAT.
x86 - SEV fixes:
- Report KVM_EXIT_SHUTDOWN instead of EINVAL if KVM intercepts
SHUTDOWN while running an SEV-ES guest.
- Clean up the recognition of emulation failures on SEV guests, when
KVM would like to "skip" the instruction but it had already been
partially emulated. This makes it possible to drop a hack that
second guessed the (insufficient) information provided by the
emulator, and just do the right thing.
Documentation:
- Various updates and fixes, mostly for x86
- MTRR and PAT fixes and optimizations"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (164 commits)
KVM: selftests: Avoid using forced target for generating arm64 headers
tools headers arm64: Fix references to top srcdir in Makefile
KVM: arm64: Add tracepoint for MMIO accesses where ISV==0
KVM: arm64: selftest: Perform ISB before reading PAR_EL1
KVM: arm64: selftest: Add the missing .guest_prepare()
KVM: arm64: Always invalidate TLB for stage-2 permission faults
KVM: x86: Service NMI requests after PMI requests in VM-Enter path
KVM: arm64: Handle AArch32 SPSR_{irq,abt,und,fiq} as RAZ/WI
KVM: arm64: Do not let a L1 hypervisor access the *32_EL2 sysregs
KVM: arm64: Refine _EL2 system register list that require trap reinjection
arm64: Add missing _EL2 encodings
arm64: Add missing _EL12 encodings
KVM: selftests: aarch64: vPMU test for validating user accesses
KVM: selftests: aarch64: vPMU register test for unimplemented counters
KVM: selftests: aarch64: vPMU register test for implemented counters
KVM: selftests: aarch64: Introduce vpmu_counter_access test
tools: Import arm_pmuv3.h
KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
...
Add LoongArch's KVM support. Loongson 3A5000/3A6000 supports hardware
assisted virtualization. With cpu virtualization, there are separate
hw-supported user mode and kernel mode in guest mode. With memory
virtualization, there are two-level hw mmu table for guest mode and host
mode. Also there is separate hw cpu timer with consant frequency in
guest mode, so that vm can migrate between hosts with different freq.
Currently, we are able to boot LoongArch Linux Guests.
Few key aspects of KVM LoongArch added by this series are:
1. Enable kvm hardware function when kvm module is loaded.
2. Implement VM and vcpu related ioctl interface such as vcpu create,
vcpu run etc. GET_ONE_REG/SET_ONE_REG ioctl commands are use to
get general registers one by one.
3. Hardware access about MMU, timer and csr are emulated in kernel.
4. Hardwares such as mmio and iocsr device are emulated in user space
such as IPI, irqchips, pci devices etc.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmUaKb4WHGNoZW5odWFj
YWlAa2VybmVsLm9yZwAKCRAChivD8uImeuj+EACFIsWmHb8I3q4feviWBIdRcve7
xzpseO/r2Xfx5IdU6/iAKIW1YNVpHU3c8+nYS20K73uGJnMJnAZ/hmPFf+0pJ4AB
qZL4aBClRynH4YsdGUeYwBfU7VMKg66ijiaZkqFEa7lSePA81gwYY2MYao58J7Gw
qMe9IPnxcU/a+UqxTrFs2/G4aSb4SR0cp69GFSIGXZ7uKBc4LCMZ0ujzo/NY/V4G
NkoJi8I85frF1OQbJaeJ/lGkC1Dx3Qbv7AbXYObA8K/ka1c5lr9RKraha2BOFPif
mHYDa7lE6qj6e5LX9QVcpkuDnxs2yaA0Zny/p2I2o7AS0xUU/eFGGdnu1bkMqrsw
42ZuKL5Dtp/Wl5W2EEIkCVmZar+3UOeiWwGXPTk4A4fQHEGayXzJvrdUa/7f+O6Y
JWYKSVjo6ZHg0ArKFtA13N54KximsrOKTpaB4fuUIP5SXMB/+JVn77/93Oyq6XSY
9+7yUTzjeMqz+4o8DLPhBIyJPvZ6cxAfx2i4htHiQg7e9ir0rZE3jsOtvJ2pgo3K
1QtGSJWH9dQMeOFnBfpDIvx3aDBcXPayXHNdaL3t/gCwJCpRuH1g6HeDTYXCJsQc
qu6FDoeQn5+5IeEzoZawapQIJ2y7iXTk5SoysybKcKmnp8yYKB6olCqW8cxbWeMs
HSqTUusPMsnJGlJMTw==
=jaPm
-----END PGP SIGNATURE-----
Merge tag 'loongarch-kvm-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD
LoongArch KVM changes for v6.7
Add LoongArch's KVM support. Loongson 3A5000/3A6000 supports hardware
assisted virtualization. With cpu virtualization, there are separate
hw-supported user mode and kernel mode in guest mode. With memory
virtualization, there are two-level hw mmu table for guest mode and host
mode. Also there is separate hw cpu timer with consant frequency in
guest mode, so that vm can migrate between hosts with different freq.
Currently, we are able to boot LoongArch Linux Guests.
Few key aspects of KVM LoongArch added by this series are:
1. Enable kvm hardware function when kvm module is loaded.
2. Implement VM and vcpu related ioctl interface such as vcpu create,
vcpu run etc. GET_ONE_REG/SET_ONE_REG ioctl commands are use to
get general registers one by one.
3. Hardware access about MMU, timer and csr are emulated in kernel.
4. Hardwares such as mmio and iocsr device are emulated in user space
such as IPI, irqchips, pci devices etc.
- Futex improvements:
- Add the 'futex2' syscall ABI, which is an attempt to get away from the
multiplex syscall and adds a little room for extentions, while lifting
some limitations.
- Fix futex PI recursive rt_mutex waiter state bug
- Fix inter-process shared futexes on no-MMU systems
- Use folios instead of pages
- Micro-optimizations of locking primitives:
- Improve arch_spin_value_unlocked() on asm-generic ticket spinlock
architectures, to improve lockref code generation.
- Improve the x86-32 lockref_get_not_zero() main loop by adding
build-time CMPXCHG8B support detection for the relevant lockref code,
and by better interfacing the CMPXCHG8B assembly code with the compiler.
- Introduce arch_sync_try_cmpxchg() on x86 to improve sync_try_cmpxchg()
code generation. Convert some sync_cmpxchg() users to sync_try_cmpxchg().
- Micro-optimize rcuref_put_slowpath()
- Locking debuggability improvements:
- Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well
- Enforce atomicity of sched_submit_work(), which is de-facto atomic but
was un-enforced previously.
- Extend <linux/cleanup.h>'s no_free_ptr() with __must_check semantics
- Fix ww_mutex self-tests
- Clean up const-propagation in <linux/seqlock.h> and simplify
the API-instantiation macros a bit.
- RT locking improvements:
- Provide the rt_mutex_*_schedule() primitives/helpers and use them
in the rtmutex code to avoid recursion vs. rtlock on the PI state.
- Add nested blocking lockdep asserts to rt_mutex_lock(), rtlock_lock()
and rwbase_read_lock().
- Plus misc fixes & cleanups
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU877IRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1g9jw/+N7rxQ78dmFCYh4UWnLCYvuKP0/ivHErG
493JcB8MupuA2tfJHIkDdr4aM2mNq2E61w69/WlZAQWWD6pdOhwgF5Xf5eoEcJm0
vsAhWBGLxihXdtevPuMAx0dEpg3AMp2wc6i5PkN831KdPUgCNsrKq9Bfnfef7/G8
MQTSHjmtba6jxleyxfEa4tE2xe5PJX825nRfkX2e1cf+stkYua+uJFxVxUfxFWGE
4pBy70D9OC7MsJ44WWOA1gwkVtMMiBTmRPNjlP8Gz2GQ0f3ERHRwYk3jDHOPHZI6
0GNt7pE3IMXQn2UuDtfkvv9IFTd+U5qD+APnWIn2ntWXqzGLFqOlmovMrobVn7El
olYDCyweWPG71m1Qblsb1VK2QjRPQVJ9NAEg8RlDHIu2ThxHbMysDVGPVOYnPFq4
S8QFpmldzbNoPU4rDJyT1fAmoUIrusBHkl+Us3yGfC74iM+fHnDEvaSoMZbzEdY1
x/Nocj9XgKEgfXdYzrCWFmZ9xXqHkO25/wDL6yKqBdQtvaEalXuHTT6mQcYxrUPm
Xx1BPan2Jg7p4u2oOFcVtKewUtRH9KBx8qytr5S+JK4PJbrBsixMnr84HLd/3X2V
ykYkO+367T5MTYv4TnJDE5vdurzUqekKSCFPY3skPujPJfdLj1vsPzYf9iMkCLdo
hU2f/R+Wpdk=
=36Ff
-----END PGP SIGNATURE-----
Merge tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Info Molnar:
"Futex improvements:
- Add the 'futex2' syscall ABI, which is an attempt to get away from
the multiplex syscall and adds a little room for extentions, while
lifting some limitations.
- Fix futex PI recursive rt_mutex waiter state bug
- Fix inter-process shared futexes on no-MMU systems
- Use folios instead of pages
Micro-optimizations of locking primitives:
- Improve arch_spin_value_unlocked() on asm-generic ticket spinlock
architectures, to improve lockref code generation
- Improve the x86-32 lockref_get_not_zero() main loop by adding
build-time CMPXCHG8B support detection for the relevant lockref
code, and by better interfacing the CMPXCHG8B assembly code with
the compiler
- Introduce arch_sync_try_cmpxchg() on x86 to improve
sync_try_cmpxchg() code generation. Convert some sync_cmpxchg()
users to sync_try_cmpxchg().
- Micro-optimize rcuref_put_slowpath()
Locking debuggability improvements:
- Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well
- Enforce atomicity of sched_submit_work(), which is de-facto atomic
but was un-enforced previously.
- Extend <linux/cleanup.h>'s no_free_ptr() with __must_check
semantics
- Fix ww_mutex self-tests
- Clean up const-propagation in <linux/seqlock.h> and simplify the
API-instantiation macros a bit
RT locking improvements:
- Provide the rt_mutex_*_schedule() primitives/helpers and use them
in the rtmutex code to avoid recursion vs. rtlock on the PI state.
- Add nested blocking lockdep asserts to rt_mutex_lock(),
rtlock_lock() and rwbase_read_lock()
.. plus misc fixes & cleanups"
* tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
futex: Don't include process MM in futex key on no-MMU
locking/seqlock: Fix grammar in comment
alpha: Fix up new futex syscall numbers
locking/seqlock: Propagate 'const' pointers within read-only methods, remove forced type casts
locking/lockdep: Fix string sizing bug that triggers a format-truncation compiler-warning
locking/seqlock: Change __seqprop() to return the function pointer
locking/seqlock: Simplify SEQCOUNT_LOCKNAME()
locking/atomics: Use atomic_try_cmpxchg_release() to micro-optimize rcuref_put_slowpath()
locking/atomic, xen: Use sync_try_cmpxchg() instead of sync_cmpxchg()
locking/atomic/x86: Introduce arch_sync_try_cmpxchg()
locking/atomic: Add generic support for sync_try_cmpxchg() and its fallback
locking/seqlock: Fix typo in comment
futex/requeue: Remove unnecessary ‘NULL’ initialization from futex_proxy_trylock_atomic()
locking/local, arch: Rewrite local_add_unless() as a static inline function
locking/debug: Fix debugfs API return value checks to use IS_ERR()
locking/ww_mutex/test: Make sure we bail out instead of livelock
locking/ww_mutex/test: Fix potential workqueue corruption
locking/ww_mutex/test: Use prng instead of rng to avoid hangs at bootup
futex: Add sys_futex_requeue()
futex: Add flags2 argument to futex_requeue()
...
Currently, there is no standard implementation for vdso_install,
leading to various issues:
1. Code duplication
Many architectures duplicate similar code just for copying files
to the install destination.
Some architectures (arm, sparc, x86) create build-id symlinks,
introducing more code duplication.
2. Unintended updates of in-tree build artifacts
The vdso_install rule depends on the vdso files to install.
It may update in-tree build artifacts. This can be problematic,
as explained in commit 19514fc665 ("arm, kbuild: make
"make install" not depend on vmlinux").
3. Broken code in some architectures
Makefile code is often copied from one architecture to another
without proper adaptation.
'make vdso_install' for parisc does not work.
'make vdso_install' for s390 installs vdso64, but not vdso32.
To address these problems, this commit introduces a generic vdso_install
rule.
Architectures that support vdso_install need to define vdso-install-y
in arch/*/Makefile. vdso-install-y lists the files to install.
For example, arch/x86/Makefile looks like this:
vdso-install-$(CONFIG_X86_64) += arch/x86/entry/vdso/vdso64.so.dbg
vdso-install-$(CONFIG_X86_X32_ABI) += arch/x86/entry/vdso/vdsox32.so.dbg
vdso-install-$(CONFIG_X86_32) += arch/x86/entry/vdso/vdso32.so.dbg
vdso-install-$(CONFIG_IA32_EMULATION) += arch/x86/entry/vdso/vdso32.so.dbg
These files will be installed to $(MODLIB)/vdso/ with the .dbg suffix,
if exists, stripped away.
vdso-install-y can optionally take the second field after the colon
separator. This is needed because some architectures install a vdso
file as a different base name.
The following is a snippet from arch/arm64/Makefile.
vdso-install-$(CONFIG_COMPAT_VDSO) += arch/arm64/kernel/vdso32/vdso.so.dbg:vdso32.so
This will rename vdso.so.dbg to vdso32.so during installation. If such
architectures change their implementation so that the base names match,
this workaround will go away.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Sven Schnelle <svens@linux.ibm.com> # s390
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Reviewed-by: Guo Ren <guoren@kernel.org>
Acked-by: Helge Deller <deller@gmx.de> # parisc
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
This unnecessary explicit setting of cra_alignmask to 0 shows up when
grepping for shash algorithms that set an alignmask. Remove it. No
change in behavior.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the code disables WUC only disables it for ioremap_wc(), which
is only used when mapping writecombine pages like ioremap() (mapped to
the kernel space). But for VRAM mapped in TTM/GEM, it is mapped with a
crafted pgprot by the pgprot_writecombine() function, in which case WUC
isn't disabled now.
Disable WUC for pgprot_writecombine() (fallback to SUC) if needed, like
ioremap_wc().
This improves the AMDGPU driver's stability (solves some misrendering)
on Loongson-3A5000/3A6000 machines.
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Replace kmap_atomic()/kunmap_atomic() calls with kmap_local_page()/
kunmap_local() in copy_user_highpage() which can be invoked from both
preemptible and atomic context [1].
[1] https://lore.kernel.org/all/20201029222652.302358281@linutronix.de/
Suggested-by: Deepak R Varma <drv@mailo.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Export symbol invalid_pud_table for modules building (such as the KVM
module) if 4-level page tables enabled. Otherwise we get:
ERROR: modpost: "invalid_pud_table" [arch/loongarch/kvm/kvm.ko] undefined!
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
As described in include/linux/linkage.h,
FUNC -- C-like functions (proper stack frame etc.)
CODE -- non-C code (e.g. irq handlers with different, special stack etc.)
SYM_FUNC_{START, END} -- use for global functions
SYM_CODE_{START, END} -- use for non-C (special) functions
So use SYM_CODE_* to annotate exception handlers.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
After the vga console no longer relies on global screen_info, there are
only two remaining use cases:
- on the x86 architecture, it is used for multiple boot methods
(bzImage, EFI, Xen, kexec) to commucate the initial VGA or framebuffer
settings to a number of device drivers.
- on other architectures, it is only used as part of the EFI stub,
and only for the three sysfb framebuffers (simpledrm, simplefb, efifb).
Remove the duplicate data structure definitions by moving it into the
efi-init.c file that sets it up initially for the EFI case, leaving x86
as an exception that retains its own definition for non-EFI boots.
The added #ifdefs here are optional, I added them to further limit the
reach of screen_info to configurations that have at least one of the
users enabled.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231017093947.3627976-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On non-x86 architectures, the screen_info variable is generally only
used for the VGA console where supported, and in some cases the EFI
framebuffer or vga16fb.
Now that we have a definite list of which architectures actually use it
for what, use consistent #ifdef checks so the global variable is only
defined when it is actually used on those architectures.
Loongarch and riscv have no support for vgacon or vga16fb, but
they support EFI firmware, so only that needs to be checked, and the
initialization can be removed because that is handled by EFI.
IA64 has both vgacon and EFI, though EFI apparently never uses
a framebuffer here.
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Khalid Aziz <khalid@gonehiking.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231009211845.3136536-3-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Recently, we found that cross-die access to pagetable pages on ARM64
machines can cause performance fluctuations in our business. Currently,
there are no PMU events available to track this situation on our ARM64
machines, so accurate pagetable accounting can help to analyze this issue,
but now the PUD level pagetable accounting is missed.
So introduce pagetable_pud_ctor/dtor() to help to get accurate PUD
pagetable accounting, as well as converting the architectures which use
generic PUD pagetable allocation to add corresponding PUD pagetable
accounting. Moreover this patch will mark the PUD level pagetable with
PG_table flag, which will help to do sanity validation in
unpoison_memory().
On my testing machine, I can see more pagetables statistics after the patch
with page-types tool:
Before patch:
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000004000000 27326 106 __________________________g_________________ pgtable
After patch:
0x0000000004000000 27541 107 __________________________g_________________ pgtable
Link: https://lkml.kernel.org/r/876c71c03a7e69c17722a690e3225a4f7b172fb2.1695017383.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add two parameters 'low_size' and 'high' to function parse_crashkernel(),
later crashkernel=,high|low parsing will be added. Make adjustments in
all call sites of parse_crashkernel() in arch.
Link: https://lkml.kernel.org/r/20230914033142.676708-3-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Jiahao <chenjiahao16@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Rewrite local_add_unless() as a static inline function with boolean
return value, similar to the arch_atomic_add_unless() arch fallbacks.
The function is currently unused.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230731084458.28096-1-ubizjak@gmail.com
Implement LoongArch vcpu world switch, including vcpu enter guest and
vcpu exit from guest, both operations need to save or restore the host
and guest registers.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement kvm exception vectors, using kvm_fault_tables array to save
the handle function pointers and it is used when vcpu handle guest exit.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement handle mmio exception, setting the mmio info into vcpu_run and
return to user space to handle it.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement kvm handle vcpu iocsr exception, setting the iocsr info into
vcpu_run and return to user space to handle it.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement LoongArch kvm mmu, it is used to switch gpa to hpa when guest
exit because of address translation exception.
This patch implement: allocating gpa page table, searching gpa from it,
and flushing guest gpa in the table.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement LoongArch virtual machine tlb operations such as flush tlb by
specific gpa parameter and flush all of the virtual machine's tlbs.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement LoongArch vcpu timer operations such as init kvm timer,
acquire kvm timer, save kvm timer and restore kvm timer. When vcpu
exit, we use kvm soft timer to emulate hardware timer. If timeout
happens, the vcpu timer interrupt will be set and it is going to be
handled at vcpu next entrance.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
1, Implement LoongArch vcpu status description such as idle exits
counter, signal exits counter, cpucfg exits counter, etc.
2, Implement some misc vcpu relaterd interfaces, such as vcpu runnable,
vcpu should kick, vcpu dump regs, etc.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement LoongArch vcpu load and vcpu put operations, including load
csr value into hardware and save csr value into vcpu structure.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement vcpu interrupt operations such as vcpu set irq and vcpu
clear irq, using set_gcsr_estat() to set irq which is parsed by the
irq bitmap.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement LoongArch fpu related interface for vcpu, such as get fpu, set
fpu, own fpu and lose fpu, etc.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement basic vcpu ioctl interfaces, including:
1, vcpu KVM_ENABLE_CAP ioctl interface.
2, vcpu get registers and set registers operations, it is called when
user space use the ioctl interface to get or set regs.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement basic vcpu interfaces, including:
1, vcpu create and destroy interface, saving info into vcpu arch
structure such as vcpu exception entrance, vcpu enter guest pointer,
etc. Init vcpu timer and set address translation mode when vcpu create.
2, vcpu run interface, handling mmio, iocsr reading fault and deliver
interrupt, lose fpu before vcpu enter guest.
3, vcpu handle exit interface, getting the exit code by ESTAT register
and using kvm exception vector to handle it.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement LoongArch VM operations: Init and destroy vm interface,
allocating memory page to save the vm pgd when init vm. Implement
vm check extension, such as getting vcpu number info, memory slots
info, and fpu info. And implement vm status description.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement kvm hardware enable, disable interface, setting the
guest config register to enable virtualization features when called
the interface.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement LoongArch kvm module init, module exit interface, using kvm
context to save the vpid info and vcpu world switch interface pointer.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add LoongArch KVM related header files, including kvm.h, kvm_host.h and
kvm_types.h. All of those are about LoongArch virtualization features
and kvm interfaces.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Tested-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
When build and update kernel with the latest upstream binutils and
loongson3_defconfig, module loader fails with:
kmod: zsmalloc: Unknown relocation type 109
kmod: fuse: Unknown relocation type 109
kmod: fuse: Unknown relocation type 109
kmod: radeon: Unknown relocation type 109
kmod: nf_tables: Unknown relocation type 109
kmod: nf_tables: Unknown relocation type 109
This is because the latest upstream binutils replaces a pair of ADD64
and SUB64 with 64_PCREL, so add support for 64_PCREL relocation type.
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb
Cc: <stable@vger.kernel.org>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
When build and update kernel with the latest upstream binutils and
loongson3_defconfig, module loader fails with:
kmod: zsmalloc: Unsupport relocation type 99, please add its support.
kmod: fuse: Unsupport relocation type 99, please add its support.
kmod: ipmi_msghandler: Unsupport relocation type 99, please add its support.
kmod: ipmi_msghandler: Unsupport relocation type 99, please add its support.
kmod: pstore: Unsupport relocation type 99, please add its support.
kmod: drm_display_helper: Unsupport relocation type 99, please add its support.
kmod: drm_display_helper: Unsupport relocation type 99, please add its support.
kmod: drm_display_helper: Unsupport relocation type 99, please add its support.
kmod: fuse: Unsupport relocation type 99, please add its support.
kmod: fat: Unsupport relocation type 99, please add its support.
This is because the latest upstream binutils replaces a pair of ADD32
and SUB32 with 32_PCREL, so add support for 32_PCREL relocation type.
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb
Cc: <stable@vger.kernel.org>
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The relocation types from 101 to 109 are used by GNU binutils >= 2.41,
add their definitions to use them in later patches.
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=include/elf/loongarch.h#l230
Cc: <stable@vger.kernel.org>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
For 64bit kernel without HIGHMEM, high_memory is the virtual address of
the highest physical address in the system. But __va(get_num_physpages()
<< PAGE_SHIFT) is not what we want for high_memory because there may be
holes in the physical address space. On the other hand, max_low_pfn is
calculated from memblock_end_of_DRAM(), which is exactly corresponding
to the highest physical address, so use it for high_memory calculation.
Cc: <stable@vger.kernel.org>
Fixes: d4b6f1562a ("LoongArch: Add Non-Uniform Memory Access (NUMA) support")
Signed-off-by: Chong Qiao <qiaochong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
As Linus suggested, kasan_mem_to_shadow()/kasan_shadow_to_mem() are not
performance-critical and too big to inline. This is simply wrong so just
define them out-of-line.
If they really need to be inlined in future, such as the objtool / SMAP
issue for X86, we should mark them __always_inline.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
As Linus suggested, __HAVE_ARCH_XYZ is "stupid" and "having historical
uses of it doesn't make it good". So migrate __HAVE_ARCH_SHADOW_MAP to
separate macros named after the respective functions.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The initial aim is to silence the following objtool warning:
arch/loongarch/kernel/relocate_kernel.o: warning: objtool: relocate_new_kernel+0x74: unreachable instruction
There are two adjacent "b" instructions, the second one is unreachable,
it is dead code, just remove it.
Co-developed-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Use _UL() and _ULL() that are provided by const.h.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Since commit 0a6b58c5cd ("lockdep: fix static memory detection even
more") the lockdep code uses is_kernel_core_data(), is_kernel_rodata()
and init_section_contains() to verify if a lock is located inside a
kernel static data section.
This change triggers a failure on LoongArch, for which the vmlinux.lds.S
script misses to put the locks (as part of in the .data.rel symbols)
into the Linux data section.
This patch fixes the lockdep problem by moving *(.data.rel*) symbols
into the kernel data section (from _sdata to _edata).
Additionally, move other wrongly assigned symbols too:
- altinstructions into the _initdata section,
- PLT symbols behind the read-only section, and
- *(.la_abs) into the data section.
Cc: stable <stable@kernel.org> # v6.4+
Fixes: 0a6b58c5cd ("lockdep: fix static memory detection even more")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
1, Allow usage of LSX/LASX in the kernel;
2, Add SIMD-optimized RAID5/RAID6 routines;
3, Add Loongson Binary Translation (LBT) extension support;
4, Add basic KGDB & KDB support;
5, Add building with kcov coverage;
6, Add KFENCE (Kernel Electric-Fence) support;
7, Add KASAN (Kernel Address Sanitizer) support;
8, Some bug fixes and other small changes;
9, Update the default config file.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmT5TfMWHGNoZW5odWFj
YWlAa2VybmVsLm9yZwAKCRAChivD8uImeqd3EACjqCaHNlp33kwufSPpGuQw9a8I
F7JW1KzBOoWELch5nFRjfQClROBWRmM4jN5YnxENBQ5K2F1K6gfxdkfjew+KV2mn
ki9ByamCfFVJDZXo9wavUD2LBrVakEFmLT+SyXBxdWwJ3fDivHjF6A0qs9ltp7dq
Bttq4bkw1mZsU6MnViRwPKVROtNUVrd9mwYSTq0iXviVEbWhPHQQTxRizNra9Z6X
7XWxO0ODHl0WVvdOJU+F16mBRS3Bs1g/HHAIDc41yrYEHFFOeFCEUAQSF/4Nj5wj
BAfAB8WOa9+vPH8fTnrpCt2RtGJmkz71TM49DdXB7jpGaWIyc4WDi9MXeeBiJ0wE
vQg8IECc9POC1sH4/6BMwq2qkrWRj2PYFYof0fP66iWNjmodtNUf7GOVHy8MTQan
xHWizJFAdY/u/bwbF9tRQ+EVeot/844CkjtZxkgTfV8shN6kCMEVAamwBItZ7TXN
g/oc1ORM6nsKHBDQF3r2LSY0Gbf3OSfMJVL8SLEQ9hAhgGhotmJ36B4bdvyO7T0Q
gNn//U+p4IIMFRKRxreEz9P0KjTOJrHAAxNzu1oZebhGZd5WI+i0PHYkkBDKZTXc
7qaEdM2cX8Wd0ePIXOHQnSItwYO7ilrviHyeCM8wd/g2/W/00jvnpF3J+2rk7eJO
rcfAr8+V5ylYBQzp6Q==
=NXy2
-----END PGP SIGNATURE-----
Merge tag 'loongarch-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Allow usage of LSX/LASX in the kernel, and use them for
SIMD-optimized RAID5/RAID6 routines
- Add Loongson Binary Translation (LBT) extension support
- Add basic KGDB & KDB support
- Add building with kcov coverage
- Add KFENCE (Kernel Electric-Fence) support
- Add KASAN (Kernel Address Sanitizer) support
- Some bug fixes and other small changes
- Update the default config file
* tag 'loongarch-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (25 commits)
LoongArch: Update Loongson-3 default config file
LoongArch: Add KASAN (Kernel Address Sanitizer) support
LoongArch: Simplify the processing of jumping new kernel for KASLR
kasan: Add (pmd|pud)_init for LoongArch zero_(pud|p4d)_populate process
kasan: Add __HAVE_ARCH_SHADOW_MAP to support arch specific mapping
LoongArch: Add KFENCE (Kernel Electric-Fence) support
LoongArch: Get partial stack information when providing regs parameter
LoongArch: mm: Add page table mapped mode support for virt_to_page()
kfence: Defer the assignment of the local variable addr
LoongArch: Allow building with kcov coverage
LoongArch: Provide kaslr_offset() to get kernel offset
LoongArch: Add basic KGDB & KDB support
LoongArch: Add Loongson Binary Translation (LBT) extension support
raid6: Add LoongArch SIMD recovery implementation
raid6: Add LoongArch SIMD syndrome calculation
LoongArch: Add SIMD-optimized XOR routines
LoongArch: Allow usage of LSX/LASX in the kernel
LoongArch: Define symbol 'fault' as a local label in fpu.S
LoongArch: Adjust {copy, clear}_user exception handler behavior
LoongArch: Use static defined zero page rather than allocated
...
1, Enable LSX and LASX.
2, Enable KASLR (CONFIG_RANDOMIZE_BASE).
3, Enable jump label (patching mechanism for static key).
4, Enable LoongArch CRC32(c) Acceleration.
5, Enable Loongson-specific drivers: I2C/RTC/DRM/SOC/CLK/PINCTRL/GPIO/SPI.
6, Enable EXFAT/NTFS3/JFS/GFS2/OCFS2/UBIFS/EROFS/CEPH file systems.
7, Enable WangXun NGBE/TXGBE NIC drivers.
8, Enable some IPVS options.
9, Remove CONFIG_SYSFS_DEPRECATED since it is removed in Kconfig.
10, Remove CONFIG_IP_NF_TARGET_CLUSTERIP since it is removed in Kconfig.
11, Remove CONFIG_NFT_OBJREF since it is removed in Kconfig.
12, Remove CONFIG_R8188EU since it is replaced by CONFIG_RTL8XXXU.
Signed-off-by: Trevor Woerner <twoerner@gmail.com>
Signed-off-by: Xuewen Wang <wangxuewen@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
1/8 of kernel addresses reserved for shadow memory. But for LoongArch,
There are a lot of holes between different segments and valid address
space (256T available) is insufficient to map all these segments to kasan
shadow memory with the common formula provided by kasan core, saying
(addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET
So LoongArch has a arch-specific mapping formula, different segments are
mapped individually, and only limited space lengths of these specific
segments are mapped to shadow.
At early boot stage the whole shadow region populated with just one
physical page (kasan_early_shadow_page). Later, this page is reused as
readonly zero shadow for some memory that kasan currently don't track.
After mapping the physical memory, pages for shadow memory are allocated
and mapped.
Functions like memset()/memcpy()/memmove() do a lot of memory accesses.
If bad pointer passed to one of these function it is important to be
caught. Compiler's instrumentation cannot do this since these functions
are written in assembly.
KASan replaces memory functions with manually instrumented variants.
Original functions declared as weak symbols so strong definitions in
mm/kasan/kasan.c could replace them. Original functions have aliases
with '__' prefix in names, so we could call non-instrumented variant
if needed.
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Modified relocate_kernel() doesn't return new kernel's entry point but
the random_offset. In this way we share the start_kernel() processing
with the normal kernel, which avoids calling 'jr a0' directly and allows
some other operations (e.g, kasan_early_init) before start_kernel() when
KASLR (CONFIG_RANDOMIZE_BASE) is turned on.
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The LoongArch architecture is quite different from other architectures.
When the allocating of KFENCE itself is done, it is mapped to the direct
mapping configuration window [1] by default on LoongArch. It means that
it is not possible to use the page table mapped mode which required by
the KFENCE system and therefore it should be remapped to the appropriate
region.
This patch adds architecture specific implementation details for KFENCE.
In particular, this implements the required interface in <asm/kfence.h>.
Tested this patch by running the testcases and all passed.
[1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode
Signed-off-by: Enze Li <lienze@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Currently, arch_stack_walk() can only get the full stack information
including NMI. This is because the implementation of arch_stack_walk()
is forced to ignore the information passed by the regs parameter and use
the current stack information instead.
For some detection systems like KFENCE, only partial stack information
is needed. In particular, the stack frame where the interrupt occurred.
To support KFENCE, this patch modifies the implementation of the
arch_stack_walk() function so that if this function is called with the
regs argument passed, it retains all the stack information in regs and
uses it to provide accurate information.
Before this patch:
[ 1.531195 ] ==================================================================
[ 1.531442 ] BUG: KFENCE: out-of-bounds read in stack_trace_save_regs+0x48/0x6c
[ 1.531442 ]
[ 1.531900 ] Out-of-bounds read at 0xffff800012267fff (1B left of kfence-#12):
[ 1.532046 ] stack_trace_save_regs+0x48/0x6c
[ 1.532169 ] kfence_report_error+0xa4/0x528
[ 1.532276 ] kfence_handle_page_fault+0x124/0x270
[ 1.532388 ] no_context+0x50/0x94
[ 1.532453 ] do_page_fault+0x1a8/0x36c
[ 1.532524 ] tlb_do_page_fault_0+0x118/0x1b4
[ 1.532623 ] test_out_of_bounds_read+0xa0/0x1d8
[ 1.532745 ] kunit_generic_run_threadfn_adapter+0x1c/0x28
[ 1.532854 ] kthread+0x124/0x130
[ 1.532922 ] ret_from_kernel_thread+0xc/0xa4
<snip>
After this patch:
[ 1.320220 ] ==================================================================
[ 1.320401 ] BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa8/0x1d8
[ 1.320401 ]
[ 1.320898 ] Out-of-bounds read at 0xffff800012257fff (1B left of kfence-#10):
[ 1.321134 ] test_out_of_bounds_read+0xa8/0x1d8
[ 1.321264 ] kunit_generic_run_threadfn_adapter+0x1c/0x28
[ 1.321392 ] kthread+0x124/0x130
[ 1.321459 ] ret_from_kernel_thread+0xc/0xa4
<snip>
Suggested-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Enze Li <lienze@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
According to LoongArch documentations, there are two types of address
translation modes: direct mapped address translation mode (DMW mode) and
page table mapped address translation mode (TLB mode).
Currently, virt_to_page() only supports direct mapped mode. This patch
determines which mode is used, and adds corresponding handling functions
for both modes.
For more details on the two modes, see [1].
[1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#virtual-address-space-and-address-translation-mode
Signed-off-by: Enze Li <lienze@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add ARCH_HAS_KCOV and HAVE_GCC_PLUGINS to the LoongArch Kconfig. And
also disable instrumentation of vdso.
Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Provide kaslr_offset() to get the kernel offset when KASLR is enabled.
Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
KGDB is intended to be used as a source level debugger for the Linux
kernel. It is used along with gdb to debug a Linux kernel. GDB can be
used to "break in" to the kernel to inspect memory, variables and regs
similar to the way an application developer would use GDB to debug an
application. KDB is a frontend of KGDB which is similar to GDB.
By now, in addition to the generic KGDB features, the LoongArch KGDB
implements the following features:
- Hardware breakpoints/watchpoints;
- Software single-step support for KDB.
Signed-off-by: Qing Zhang <zhangqing@loongson.cn> # Framework & CoreFeature
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> # BreakPoint & SingleStep
Signed-off-by: Hui Li <lihui@loongson.cn> # Some Minor Improvements
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> # Some Build Error Fixes
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Loongson Binary Translation (LBT) is used to accelerate binary translation,
which contains 4 scratch registers (scr0 to scr3), x86/ARM eflags (eflags)
and x87 fpu stack pointer (ftop).
This patch support kernel to save/restore these registers, handle the LBT
exception and maintain sigcontext.
Signed-off-by: Qi Hu <huqi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add LSX and LASX implementations of xor operations, operating on 64
bytes (one L1 cache line) at a time, for a balance between memory
utilization and instruction mix. Huacai confirmed that all future
LoongArch implementations by Loongson (that we care) will likely also
feature 64-byte cache lines, and experiments show no throughput
improvement with further unrolling.
Performance numbers measured during system boot on a 3A5000 @ 2.5GHz:
> 8regs : 12702 MB/sec
> 8regs_prefetch : 10920 MB/sec
> 32regs : 12686 MB/sec
> 32regs_prefetch : 10918 MB/sec
> lsx : 17589 MB/sec
> lasx : 26116 MB/sec
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Allow usage of LSX/LASX in the kernel by extending kernel_fpu_begin()
and kernel_fpu_end().
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The initial aim is to silence the following objtool warnings:
arch/loongarch/kernel/fpu.o: warning: objtool: _save_fp_context() falls through to next function fault()
arch/loongarch/kernel/fpu.o: warning: objtool: _restore_fp_context() falls through to next function fault()
arch/loongarch/kernel/fpu.o: warning: objtool: _save_lsx_context() falls through to next function fault()
arch/loongarch/kernel/fpu.o: warning: objtool: _restore_lsx_context() falls through to next function fault()
arch/loongarch/kernel/fpu.o: warning: objtool: _save_lasx_context() falls through to next function fault()
arch/loongarch/kernel/fpu.o: warning: objtool: _restore_lasx_context() falls through to next function fault()
Currently, SYM_FUNC_START()/SYM_FUNC_END() defines the symbol 'fault' as
SYM_T_FUNC which is STT_FUNC, the objtool warnings are generated through
the following code:
tools/objtool/include/objtool/check.h:
static inline struct symbol *insn_func(struct instruction *insn)
{
struct symbol *sym = insn->sym;
if (sym && sym->type != STT_FUNC)
sym = NULL;
return sym;
}
tools/objtool/check.c:
static int validate_branch(struct objtool_file *file, struct symbol *func,
struct instruction *insn, struct insn_state state)
{
...
if (func && insn_func(insn) && func != insn_func(insn)->pfunc) {
...
WARN("%s() falls through to next function %s()",
func->name, insn_func(insn)->name);
return 1;
}
...
}
We can see that the fixup can be a local label in the following code:
arch/loongarch/include/asm/asm-extable.h:
.pushsection __ex_table, "a"; \
.balign 4; \
.long ((insn) - .); \
.long ((fixup) - .); \
.short (type); \
.short (data); \
.popsection;
.macro _asm_extable, insn, fixup
__ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_FIXUP, 0)
.endm
Like arch/loongarch/lib/*.S, just define the symbol 'fault' as a local
label in fpu.S.
Before:
$ readelf -s arch/loongarch/kernel/fpu.o | awk -F: /fault/'{print $2}'
000000000000053c 8 FUNC GLOBAL DEFAULT 1 fault
After:
$ readelf -s arch/loongarch/kernel/fpu.o | awk -F: /fault/'{print $2}'
000000000000053c 0 NOTYPE LOCAL DEFAULT 1 .L_fpu_fault
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The {copy, clear}_user function should returns number of bytes that
could not be {copied, cleared}. So, try to {copy, clear} byte by byte
when ld.{d,w,h} and st.{d,w,h} trapped into an exception.
Reviewed-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Weihao Li <liweihao@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
On LoongArch system, there is only one page needed for zero page (no
cache synonyms), and there is no COLOR_ZERO_PAGE, so zero_page_mask is
useless and the macro __HAVE_COLOR_ZERO_PAGE is not necessary.
Like other popular architectures, It is simpler to define the zero page
in kernel BSS code segment rather than dynamically allocate.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Function pcpu_populate_pte() and fixmap_pte() are similar, they populate
one page from kernel address space. And there is confusion between pgd
and p4d in the function fixmap_pte(), such as pgd_none() always returns
zero. This patch introduces a unified function populate_kernel_pte() and
then replaces pcpu_populate_pte() and fixmap_pte().
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Do some code improvements in function pcpu_populate_pte():
1. Add memory allocation failure handling;
2. Replace pgd_populate() with p4d_populate(), it will be useful if
there are four-level page tables.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Both shm_align_mask and SHMLBA want to avoid cache alias. But they are
inconsistent: shm_align_mask is (PAGE_SIZE - 1) while SHMLBA is SZ_64K,
but PAGE_SIZE is not always equal to SZ_64K.
This may cause problems when shmat() twice. Fix this problem by removing
shm_align_mask and using SHMLBA (strictly SHMLBA - 1) instead.
Reported-by: Jiantao Shan <shanjiantao@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
When I do LTP test, LTP test case ksm06 caused panic at
break_ksm_pmd_entry
-> pmd_leaf (Huge page table but False)
-> pte_present (panic)
The reason is pmd_leaf() is not defined, So like commit 501b810467
("mips: mm: add p?d_leaf() definitions") add p?d_leaf() definition for
LoongArch.
Fixes: 09cfefb7fa ("LoongArch: Add memory management")
Cc: stable@vger.kernel.org
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
When building with CONFIG_LTO_CLANG_FULL, there are several errors due
to the way that parse_r is defined with an __asm__ statement in a
header:
ld.lld: error: ld-temp.o <inline asm>:105:1: macro 'parse_r' is already defined
.macro parse_r var r
^
This was an issue for arch/mips as well, which was resolved by commit
67512a8cf5 ("MIPS: Avoid macro redefinitions").
However, parse_r is unused in arch/loongarch after commit 83d8b38967
("LoongArch: Simplify the invtlb wrappers"), so doing the same change
does not make much sense now. Just remove parse_r (and parse_v, which
is also unused) to resolve the redefinition error. If it needs to be
brought back due to an actual use, it should be brought back with the
same changes as the aforementioned arch/mips commit.
Closes: https://github.com/ClangBuiltLinux/linux/issues/1924
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Here is the big set of tty and serial driver changes for 6.6-rc1.
Lots of cleanups in here this cycle, and some driver updates. Short
summary is:
- Jiri's continued work to make the tty code and apis be a bit more
sane with regards to modern kernel coding style and types
- cpm_uart driver updates
- n_gsm updates and fixes
- meson driver updates
- sc16is7xx driver updates
- 8250 driver updates for different hardware types
- qcom-geni driver fixes
- tegra serial driver change
- stm32 driver updates
- synclink_gt driver cleanups
- tty structure size reduction
All of these have been in linux-next this week with no reported issues.
The last bit of cleanups from Jiri and the tty structure size reduction
came in last week, a bit late but as they were just style changes and
size reductions, I figured they should get into this merge cycle so that
others can work on top of them with no merge conflicts.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPH+jA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykKyACgldt6QeenTN+6dXIHS/eQHtTKZwMAn3arSeXI
QrUUnLFjOWyoX87tbMBQ
=LVw0
-----END PGP SIGNATURE-----
Merge tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
"Here is the big set of tty and serial driver changes for 6.6-rc1.
Lots of cleanups in here this cycle, and some driver updates. Short
summary is:
- Jiri's continued work to make the tty code and apis be a bit more
sane with regards to modern kernel coding style and types
- cpm_uart driver updates
- n_gsm updates and fixes
- meson driver updates
- sc16is7xx driver updates
- 8250 driver updates for different hardware types
- qcom-geni driver fixes
- tegra serial driver change
- stm32 driver updates
- synclink_gt driver cleanups
- tty structure size reduction
All of these have been in linux-next this week with no reported
issues. The last bit of cleanups from Jiri and the tty structure size
reduction came in last week, a bit late but as they were just style
changes and size reductions, I figured they should get into this merge
cycle so that others can work on top of them with no merge conflicts"
* tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (199 commits)
tty: shrink the size of struct tty_struct by 40 bytes
tty: n_tty: deduplicate copy code in n_tty_receive_buf_real_raw()
tty: n_tty: extract ECHO_OP processing to a separate function
tty: n_tty: unify counts to size_t
tty: n_tty: use u8 for chars and flags
tty: n_tty: simplify chars_in_buffer()
tty: n_tty: remove unsigned char casts from character constants
tty: n_tty: move newline handling to a separate function
tty: n_tty: move canon handling to a separate function
tty: n_tty: use MASK() for masking out size bits
tty: n_tty: make n_tty_data::num_overrun unsigned
tty: n_tty: use time_is_before_jiffies() in n_tty_receive_overrun()
tty: n_tty: use 'num' for writes' counts
tty: n_tty: use output character directly
tty: n_tty: make flow of n_tty_receive_buf_common() a bool
Revert "tty: serial: meson: Add a earlycon for the T7 SoC"
Documentation: devices.txt: Fix minors for ttyCPM*
Documentation: devices.txt: Remove ttySIOC*
Documentation: devices.txt: Remove ttyIOC*
serial: 8250_bcm7271: improve bcm7271 8250 port
...
Convert IBT selftest to asm to fix objtool warning
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmTv1QQACgkQaDWVMHDJ
krAUwhAAn6TOwHJK8BSkHeiQhON1nrlP3c5cv0AyZ2NP8RYDrZrSZvhpYBJ6wgKC
Cx5CGq5nn9twYsYS3KsktLKDfR3lRdsQ7K9qtyFtYiaeaVKo+7gEKl/K+klwai8/
gninQWHk0zmSCja8Vi77q52WOMkQKapT8+vaON9EVDO8dVEi+CvhAIfPwMafuiwO
Rk4X86SzoZu9FP79LcCg9XyGC/XbM2OG9eNUTSCKT40qTTKm5y4gix687NvAlaHR
ko5MTsdl0Wfp6Qk0ohT74LnoA2c1g/FluvZIM33ci/2rFpkf9Hw7ip3lUXqn6CPx
rKiZ+pVRc0xikVWkraMfIGMJfUd2rhelp8OyoozD7DB7UZw40Q4RW4N5tgq9Fhe9
MQs3p1v9N8xHdRKl365UcOczUxNAmv4u0nV5gY/4FMC6VjldCl2V9fmqYXyzFS4/
Ogg4FSd7c2JyGFKPs+5uXyi+RY2qOX4+nzHOoKD7SY616IYqtgKoz5usxETLwZ6s
VtJOmJL0h//z0A7tBliB0zd+SQ5UQQBDC2XouQH2fNX2isJMn0UDmWJGjaHgK6Hh
8jVp6LNqf+CEQS387UxckOyj7fu438hDky1Ggaw4YqowEOhQeqLVO4++x+HITrbp
AupXfbJw9h9cMN63Yc0gVxXQ9IMZ+M7UxLtZ3Cd8/PVztNy/clA=
=3UUm
-----END PGP SIGNATURE-----
Merge tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 shadow stack support from Dave Hansen:
"This is the long awaited x86 shadow stack support, part of Intel's
Control-flow Enforcement Technology (CET).
CET consists of two related security features: shadow stacks and
indirect branch tracking. This series implements just the shadow stack
part of this feature, and just for userspace.
The main use case for shadow stack is providing protection against
return oriented programming attacks. It works by maintaining a
secondary (shadow) stack using a special memory type that has
protections against modification. When executing a CALL instruction,
the processor pushes the return address to both the normal stack and
to the special permission shadow stack. Upon RET, the processor pops
the shadow stack copy and compares it to the normal stack copy.
For more information, refer to the links below for the earlier
versions of this patch set"
Link: https://lore.kernel.org/lkml/20220130211838.8382-1-rick.p.edgecombe@intel.com/
Link: https://lore.kernel.org/lkml/20230613001108.3040476-1-rick.p.edgecombe@intel.com/
* tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
x86/shstk: Change order of __user in type
x86/ibt: Convert IBT selftest to asm
x86/shstk: Don't retry vm_munmap() on -EINTR
x86/kbuild: Fix Documentation/ reference
x86/shstk: Move arch detail comment out of core mm
x86/shstk: Add ARCH_SHSTK_STATUS
x86/shstk: Add ARCH_SHSTK_UNLOCK
x86: Add PTRACE interface for shadow stack
selftests/x86: Add shadow stack test
x86/cpufeatures: Enable CET CR4 bit for shadow stack
x86/shstk: Wire in shadow stack interface
x86: Expose thread features in /proc/$PID/status
x86/shstk: Support WRSS for userspace
x86/shstk: Introduce map_shadow_stack syscall
x86/shstk: Check that signal frame is shadow stack mem
x86/shstk: Check that SSP is aligned on sigreturn
x86/shstk: Handle signals for shadow stack
x86/shstk: Introduce routines modifying shstk
x86/shstk: Handle thread shadow stack
x86/shstk: Add user-mode shadow stack support
...
core:
- fix gfp flags in drmm_kmalloc
gpuva:
- add new generic GPU VA manager (for nouveau initially)
syncobj:
- add new DRM_IOCTL_SYNCOBJ_EVENTFD ioctl
dma-buf:
- acquire resv lock for mmap() in exporters
- support dma-buf self import automatically
- docs fixes
backlight:
- fix fbdev interactions
atomic:
- improve logging
prime:
- remove struct gem_prim_mmap plus driver updates
gem:
- drm_exec: add locking over multiple GEM objects
- fix lockdep checking
fbdev:
- make fbdev userspace interfaces optional
- use linux device instead of fbdev device
- use deferred i/o helper macros in various drivers
- Make FB core selectable without drivers
- Remove obsolete flags FBINFO_DEFAULT and FBINFO_FLAG_DEFAULT
- Add helper macros and Kconfig tokens for DMA-allocated framebuffer
ttm:
- support init_on_free
- swapout fixes
panel:
- panel-edp: Support AUO B116XAB01.4
- Support Visionox R66451 plus DT bindings
- ld9040: Backlight support, magic improved,
Kconfig fix
- Convert to of_device_get_match_data()
- Fix Kconfig dependencies
- simple: Set bpc value to fix warning; Set connector type for AUO T215HVN01;
Support Innolux G156HCE-L01 plus DT bindings
- ili9881: Support TDO TL050HDV35 LCD panel plus DT bindings
- startek: Support KD070FHFID015 MIPI-DSI panel plus DT bindings
- sitronix-st7789v: Support Inanbo T28CP45TN89 plus DT bindings;
Support EDT ET028013DMA plus DT bindings; Various cleanups
- edp: Add timings for N140HCA-EAC
- Allow panels and touchscreens to power sequence together
- Fix Innolux G156HCE-L01 LVDS clock
bridge:
- debugfs for chains support
- dw-hdmi: Improve support for YUV420 bus format
CEC suspend/resume, update EDID on HDMI detect
- dw-mipi-dsi: Fix enable/disable of DSI controller
- lt9611uxc: Use MODULE_FIRMWARE()
- ps8640: Remove broken EDID code
- samsung-dsim: Fix command transfer
- tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups
- adv7511: Fix low refresh rate
- anx7625: Switch to macros instead of hardcoded values
locking fixes
- tc358767: fix hardware delays
- sitronix-st7789v: Support panel orientation; Support rotation
property; Add support for Jasonic
JT240MHQS-HWT-EK-E3 plus DT bindings
amdgpu:
- SDMA 6.1.0 support
- HDP 6.1 support
- SMUIO 14.0 support
- PSP 14.0 support
- IH 6.1 support
- Lots of checkpatch cleanups
- GFX 9.4.3 updates
- Add USB PD and IFWI flashing documentation
- GPUVM updates
- RAS fixes
- DRR fixes
- FAMS fixes
- Virtual display fixes
- Soft IH fixes
- SMU13 fixes
- Rework PSP firmware loading for other IPs
- Kernel doc fixes
- DCN 3.0.1 fixes
- LTTPR fixes
- DP MST fixes
- DCN 3.1.6 fixes
- SMU 13.x fixes
- PSP 13.x fixes
- SubVP fixes
- GC 9.4.3 fixes
- Display bandwidth calculation fixes
- VCN4 secure submission fixes
- Allow building DC on RISC-V
- Add visible FB info to bo_print_info
- HBR3 fixes
- GFX9 MCBP fix
- GMC10 vmhub index fix
- GMC11 vmhub index fix
- Create a new doorbell manager
- SR-IOV fixes
- initial freesync panel replay support
- revert zpos properly until igt regression is fixeed
- use TTM to manage doorbell BAR
- Expose both current and average power via hwmon if supported
amdkfd:
- Cleanup CRIU dma-buf handling
- Use KIQ to unmap HIQ
- GFX 9.4.3 debugger updates
- GFX 9.4.2 debugger fixes
- Enable cooperative groups fof gfx11
- SVM fixes
- Convert older APUs to use dGPU path like newer APUs
- Drop IOMMUv2 path as it is no longer used
- TBA fix for aldebaran
i915:
- ICL+ DSI modeset sequence
- HDCP improvements
- MTL display fixes and cleanups
- HSW/BDW PSR1 restored
- Init DDI ports in VBT order
- General display refactors
- Start using plane scale factor for relative data rate
- Use shmem for dpt objects
- Expose RPS thresholds in sysfs
- Apply GuC SLPC min frequency softlimit correctly
- Extend Wa_14015795083 to TGL, RKL, DG1 and ADL
- Fix a VMA UAF for multi-gt platform
- Do not use stolen on MTL due to HW bug
- Check HuC and GuC version compatibility on MTL
- avoid infinite GPU waits due to premature release
of request memory
- Fixes and updates for GSC memory allocation
- Display SDVO fixes
- Take stolen handling out of FBC code
- Make i915_coherent_map_type GT-centric
- Simplify shmem_create_from_object map_type
msm:
- SM6125 MDSS support
- DPU: SM6125 DPU support
- DSI: runtime PM support, burst mode support
- DSI PHY: SM6125 support in 14nm DSI PHY driver
- GPU: prepare for a7xx
- fix a690 firmware
- disable relocs on a6xx and newer
radeon:
- Lots of checkpatch cleanups
ast:
- improve device-model detection
- Represent BMV as virtual connector
- Report DP connection status
nouveau:
- add new exec/bind interface to support Vulkan
- document some getparam ioctls
- improve VRAM detection
- various fixes/cleanups
- workraound DPCD issues
ivpu:
- MMU updates
- debugfs support
- Support vpu4
virtio:
- add sync object support
atmel-hlcdc:
- Support inverted pixclock polarity
etnaviv:
- runtime PM cleanups
- hang handling fixes
exynos:
- use fbdev DMA helpers
- fix possible NULL ptr dereference
komeda:
- always attach encoder
omapdrm:
- use fbdev DMA helpers
ingenic:
- kconfig regmap fixes
loongson:
- support display controller
mediatek:
- Small mtk-dpi cleanups
- DisplayPort: support eDP and aux-bus
- Fix coverity issues
- Fix potential memory leak if vmap() fail
mgag200:
- minor fixes
mxsfb:
- support disabling overlay planes
panfrost:
- fix sync in IRQ handling
ssd130x:
- Support per-controller default resolution plus DT bindings
- Reduce memory-allocation overhead
- Improve intermediate buffer size computation
- Fix allocation of temporary buffers
- Fix pitch computation
- Fix shadow plane allocation
tegra:
- use fbdev DMA helpers
- Convert to devm_platform_ioremap_resource()
- support bridge/connector
- enable PM
tidss:
- Support TI AM625 plus DT bindings
- Implement new connector model plus driver updates
vkms:
- improve write back support
- docs fixes
- support gamma LUT
zynqmp-dpsub:
- misc fixes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmTukSYACgkQDHTzWXnE
hr6vnQ/+J7vBVkBr8JsaEV/twcZwzbNdpivsIagd8U83GQB50nDReVXbNx+Wo0/C
WiGlrC6Sw3NVOGbkigd5IQ7fb5C/7RnBmzMi/iS7Qnk2uEqLqgV00VxfGwdm6wgr
0gNB8zuu2xYphHz2K8LzwnmeQRdN+YUQpUa2wNzLO88IEkTvq5vx2rJEn5p9/3hp
OxbbPBzpDRRPlkNFfVQCN8todbKdsPc4am81Eqgv7BJf21RFgQodPGW5koCYuv0w
3m+PJh1KkfYAL974EsLr/pkY7yhhiZ6SlFLX8ssg4FyZl/Vthmc9bl14jRq/pqt4
GBp8yrPq1XjrwXR8wv3MiwNEdANQ+KD9IoGlzLxqVgmEFRE+g4VzZZXeC3AIrTVP
FPg4iLUrDrmj9RpJmbVqhq9X2jZs+EtRAFkJPrPbq2fItAD2a2dW4X3ISSnnTqDI
6O2dVwuLCU6OfWnvN4bPW9p8CqRgR8Itqv1SI8qXooDy307YZu1eTUf5JAVwG/SW
xbDEFVFlMPyFLm+KN5dv1csJKK21vWi9gLg8phK8mTWYWnqMEtJqbxbRzmdBEFmE
pXKVu01P6ZqgBbaETpCljlOaEDdJnvO4W+o70MgBtpR2IWFMbMNO+iS0EmLZ6Vgj
9zYZctpL+dMuHV0Of1GMkHFRHTMYEzW4tuctLIQfG13y4WzyczY=
=CwV9
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"The drm core grew a new generic gpu virtual address manager, and new
execution locking helpers. These are used by nouveau now to provide
uAPI support for the userspace Vulkan driver. AMD had a bunch of new
IP core support, loads of refactoring around fbdev, but mostly just
the usual amount of stuff across the board.
core:
- fix gfp flags in drmm_kmalloc
gpuva:
- add new generic GPU VA manager (for nouveau initially)
syncobj:
- add new DRM_IOCTL_SYNCOBJ_EVENTFD ioctl
dma-buf:
- acquire resv lock for mmap() in exporters
- support dma-buf self import automatically
- docs fixes
backlight:
- fix fbdev interactions
atomic:
- improve logging
prime:
- remove struct gem_prim_mmap plus driver updates
gem:
- drm_exec: add locking over multiple GEM objects
- fix lockdep checking
fbdev:
- make fbdev userspace interfaces optional
- use linux device instead of fbdev device
- use deferred i/o helper macros in various drivers
- Make FB core selectable without drivers
- Remove obsolete flags FBINFO_DEFAULT and FBINFO_FLAG_DEFAULT
- Add helper macros and Kconfig tokens for DMA-allocated framebuffer
ttm:
- support init_on_free
- swapout fixes
panel:
- panel-edp: Support AUO B116XAB01.4
- Support Visionox R66451 plus DT bindings
- ld9040:
- Backlight support
- magic improved
- Kconfig fix
- Convert to of_device_get_match_data()
- Fix Kconfig dependencies
- simple:
- Set bpc value to fix warning
- Set connector type for AUO T215HVN01
- Support Innolux G156HCE-L01 plus DT bindings
- ili9881: Support TDO TL050HDV35 LCD panel plus DT bindings
- startek: Support KD070FHFID015 MIPI-DSI panel plus DT bindings
- sitronix-st7789v:
- Support Inanbo T28CP45TN89 plus DT bindings
- Support EDT ET028013DMA plus DT bindings
- Various cleanups
- edp: Add timings for N140HCA-EAC
- Allow panels and touchscreens to power sequence together
- Fix Innolux G156HCE-L01 LVDS clock
bridge:
- debugfs for chains support
- dw-hdmi:
- Improve support for YUV420 bus format
- CEC suspend/resume
- update EDID on HDMI detect
- dw-mipi-dsi: Fix enable/disable of DSI controller
- lt9611uxc: Use MODULE_FIRMWARE()
- ps8640: Remove broken EDID code
- samsung-dsim: Fix command transfer
- tc358764:
- Handle HS/VS polarity
- Use BIT() macro
- Various cleanups
- adv7511: Fix low refresh rate
- anx7625:
- Switch to macros instead of hardcoded values
- locking fixes
- tc358767: fix hardware delays
- sitronix-st7789v:
- Support panel orientation
- Support rotation property
- Add support for Jasonic JT240MHQS-HWT-EK-E3 plus DT bindings
amdgpu:
- SDMA 6.1.0 support
- HDP 6.1 support
- SMUIO 14.0 support
- PSP 14.0 support
- IH 6.1 support
- Lots of checkpatch cleanups
- GFX 9.4.3 updates
- Add USB PD and IFWI flashing documentation
- GPUVM updates
- RAS fixes
- DRR fixes
- FAMS fixes
- Virtual display fixes
- Soft IH fixes
- SMU13 fixes
- Rework PSP firmware loading for other IPs
- Kernel doc fixes
- DCN 3.0.1 fixes
- LTTPR fixes
- DP MST fixes
- DCN 3.1.6 fixes
- SMU 13.x fixes
- PSP 13.x fixes
- SubVP fixes
- GC 9.4.3 fixes
- Display bandwidth calculation fixes
- VCN4 secure submission fixes
- Allow building DC on RISC-V
- Add visible FB info to bo_print_info
- HBR3 fixes
- GFX9 MCBP fix
- GMC10 vmhub index fix
- GMC11 vmhub index fix
- Create a new doorbell manager
- SR-IOV fixes
- initial freesync panel replay support
- revert zpos properly until igt regression is fixeed
- use TTM to manage doorbell BAR
- Expose both current and average power via hwmon if supported
amdkfd:
- Cleanup CRIU dma-buf handling
- Use KIQ to unmap HIQ
- GFX 9.4.3 debugger updates
- GFX 9.4.2 debugger fixes
- Enable cooperative groups fof gfx11
- SVM fixes
- Convert older APUs to use dGPU path like newer APUs
- Drop IOMMUv2 path as it is no longer used
- TBA fix for aldebaran
i915:
- ICL+ DSI modeset sequence
- HDCP improvements
- MTL display fixes and cleanups
- HSW/BDW PSR1 restored
- Init DDI ports in VBT order
- General display refactors
- Start using plane scale factor for relative data rate
- Use shmem for dpt objects
- Expose RPS thresholds in sysfs
- Apply GuC SLPC min frequency softlimit correctly
- Extend Wa_14015795083 to TGL, RKL, DG1 and ADL
- Fix a VMA UAF for multi-gt platform
- Do not use stolen on MTL due to HW bug
- Check HuC and GuC version compatibility on MTL
- avoid infinite GPU waits due to premature release of request memory
- Fixes and updates for GSC memory allocation
- Display SDVO fixes
- Take stolen handling out of FBC code
- Make i915_coherent_map_type GT-centric
- Simplify shmem_create_from_object map_type
msm:
- SM6125 MDSS support
- DPU: SM6125 DPU support
- DSI: runtime PM support, burst mode support
- DSI PHY: SM6125 support in 14nm DSI PHY driver
- GPU: prepare for a7xx
- fix a690 firmware
- disable relocs on a6xx and newer
radeon:
- Lots of checkpatch cleanups
ast:
- improve device-model detection
- Represent BMV as virtual connector
- Report DP connection status
nouveau:
- add new exec/bind interface to support Vulkan
- document some getparam ioctls
- improve VRAM detection
- various fixes/cleanups
- workraound DPCD issues
ivpu:
- MMU updates
- debugfs support
- Support vpu4
virtio:
- add sync object support
atmel-hlcdc:
- Support inverted pixclock polarity
etnaviv:
- runtime PM cleanups
- hang handling fixes
exynos:
- use fbdev DMA helpers
- fix possible NULL ptr dereference
komeda:
- always attach encoder
omapdrm:
- use fbdev DMA helpers
ingenic:
- kconfig regmap fixes
loongson:
- support display controller
mediatek:
- Small mtk-dpi cleanups
- DisplayPort: support eDP and aux-bus
- Fix coverity issues
- Fix potential memory leak if vmap() fail
mgag200:
- minor fixes
mxsfb:
- support disabling overlay planes
panfrost:
- fix sync in IRQ handling
ssd130x:
- Support per-controller default resolution plus DT bindings
- Reduce memory-allocation overhead
- Improve intermediate buffer size computation
- Fix allocation of temporary buffers
- Fix pitch computation
- Fix shadow plane allocation
tegra:
- use fbdev DMA helpers
- Convert to devm_platform_ioremap_resource()
- support bridge/connector
- enable PM
tidss:
- Support TI AM625 plus DT bindings
- Implement new connector model plus driver updates
vkms:
- improve write back support
- docs fixes
- support gamma LUT
zynqmp-dpsub:
- misc fixes"
* tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm: (1327 commits)
drm/gpuva_mgr: remove unused prev pointer in __drm_gpuva_sm_map()
drm/tests/drm_kunit_helpers: Place correct function name in the comment header
drm/nouveau: uapi: don't pass NO_PREFETCH flag implicitly
drm/nouveau: uvmm: fix unset region pointer on remap
drm/nouveau: sched: avoid job races between entities
drm/i915: Fix HPD polling, reenabling the output poll work as needed
drm: Add an HPD poll helper to reschedule the poll work
drm/i915: Fix TLB-Invalidation seqno store
drm/ttm/tests: Fix type conversion in ttm_pool_test
drm/msm/a6xx: Bail out early if setting GPU OOB fails
drm/msm/a6xx: Move LLC accessors to the common header
drm/msm/a6xx: Introduce a6xx_llc_read
drm/ttm/tests: Require MMU when testing
drm/panel: simple: Fix Innolux G156HCE-L01 LVDS clock
Revert "Revert "drm/amdgpu/display: change pipe policy for DCN 2.0""
drm/amdgpu: Add memory vendor information
drm/amd: flush any delayed gfxoff on suspend entry
drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
drm/amdgpu: Remove gfxoff check in GFX v9.4.3
drm/amd/pm: Update pci link speed for smu v13.0.6
...
("refactor Kconfig to consolidate KEXEC and CRASH options").
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h").
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands").
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions").
- Switch the handling of kdump from a udev scheme to in-kernel handling,
by Eric DeVolder ("crash: Kernel handling of CPU and memory hot
un/plug").
- Many singleton patches to various parts of the tree
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO2GpAAKCRDdBJ7gKXxA
juW3AQD1moHzlSN6x9I3tjm5TWWNYFoFL8af7wXDJspp/DWH/AD/TO0XlWWhhbYy
QHy7lL0Syha38kKLMXTM+bN6YQHi9AU=
=WJQa
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- An extensive rework of kexec and crash Kconfig from Eric DeVolder
("refactor Kconfig to consolidate KEXEC and CRASH options")
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h")
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands")
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions")
- Switch the handling of kdump from a udev scheme to in-kernel
handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory
hot un/plug")
- Many singleton patches to various parts of the tree
* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits)
document while_each_thread(), change first_tid() to use for_each_thread()
drivers/char/mem.c: shrink character device's devlist[] array
x86/crash: optimize CPU changes
crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
crash: hotplug support for kexec_load()
x86/crash: add x86 crash hotplug support
crash: memory and CPU hotplug sysfs attributes
kexec: exclude elfcorehdr from the segment digest
crash: add generic infrastructure for crash hotplug support
crash: move a few code bits to setup support of crash hotplug
kstrtox: consistently use _tolower()
kill do_each_thread()
nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
scripts/bloat-o-meter: count weak symbol sizes
treewide: drop CONFIG_EMBEDDED
lockdep: fix static memory detection even more
lib/vsprintf: declare no_hash_pointers in sprintf.h
lib/vsprintf: split out sprintf() and friends
kernel/fork: stop playing lockless games for exe_file replacement
adfs: delete unused "union adfs_dirtail" definition
...
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
reduces the special-case code for handling hugetlb pages in GUP. It
also speeds up GUP handling of transparent hugepages.
- Peng Zhang provides some maple tree speedups ("Optimize the fast path
of mas_store()").
- Sergey Senozhatsky has improved te performance of zsmalloc during
compaction (zsmalloc: small compaction improvements").
- Domenico Cerasuolo has developed additional selftest code for zswap
("selftests: cgroup: add zswap test program").
- xu xin has doe some work on KSM's handling of zero pages. These
changes are mainly to enable the user to better understand the
effectiveness of KSM's treatment of zero pages ("ksm: support tracking
KSM-placed zero-pages").
- Jeff Xu has fixes the behaviour of memfd's
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").
- David Howells has fixed an fscache optimization ("mm, netfs, fscache:
Stop read optimisation when folio removed from pagecache").
- Axel Rasmussen has given userfaultfd the ability to simulate memory
poisoning ("add UFFDIO_POISON to simulate memory poisoning with UFFD").
- Miaohe Lin has contributed some routine maintenance work on the
memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
check").
- Peng Zhang has contributed some maintenance work on the maple tree
code ("Improve the validation for maple tree and some cleanup").
- Hugh Dickins has optimized the collapsing of shmem or file pages into
THPs ("mm: free retracted page table by RCU").
- Jiaqi Yan has a patch series which permits us to use the healthy
subpages within a hardware poisoned huge page for general purposes
("Improve hugetlbfs read on HWPOISON hugepages").
- Kemeng Shi has done some maintenance work on the pagetable-check code
("Remove unused parameters in page_table_check").
- More folioification work from Matthew Wilcox ("More filesystem folio
conversions for 6.6"), ("Followup folio conversions for zswap"). And
from ZhangPeng ("Convert several functions in page_io.c to use a
folio").
- page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").
- Baoquan He has converted some architectures to use the GENERIC_IOREMAP
ioremap()/iounmap() code ("mm: ioremap: Convert architectures to take
GENERIC_IOREMAP way").
- Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration").
- Better maple tree lockdep checking from Liam Howlett ("More strict
maple tree lockdep"). Liam also developed some efficiency improvements
("Reduce preallocations for maple tree").
- Cleanup and optimization to the secondary IOMMU TLB invalidation, from
Alistair Popple ("Invalidate secondary IOMMU TLB on permission
upgrade").
- Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
for arm64").
- Kemeng Shi provides some maintenance work on the compaction code ("Two
minor cleanups for compaction").
- Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle most
file-backed faults under the VMA lock").
- Aneesh Kumar contributes code to use the vmemmap optimization for DAX
on ppc64, under some circumstances ("Add support for DAX vmemmap
optimization for ppc64").
- page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
data in page_ext"), ("minor cleanups to page_ext header").
- Some zswap cleanups from Johannes Weiner ("mm: zswap: three
cleanups").
- kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").
- VMA handling cleanups from Kefeng Wang ("mm: convert to
vma_is_initial_heap/stack()").
- DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
address ranges and DAMON monitoring targets").
- Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").
- Liam Howlett has improved the maple tree node replacement code
("maple_tree: Change replacement strategy").
- ZhangPeng has a general code cleanup - use the K() macro more widely
("cleanup with helper macro K()").
- Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for memmap
on memory feature on ppc64").
- pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
in page_alloc"), ("Two minor cleanups for get pageblock migratetype").
- Vishal Moola introduces a memory descriptor for page table tracking,
"struct ptdesc" ("Split ptdesc from struct page").
- memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
for vm.memfd_noexec").
- MM include file rationalization from Hugh Dickins ("arch: include
asm/cacheflush.h in asm/hugetlb.h").
- THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
output").
- kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
object_cache instead of kmemleak_initialized").
- More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
and _folio_order").
- A VMA locking scalability improvement from Suren Baghdasaryan
("Per-VMA lock support for swap and userfaults").
- pagetable handling cleanups from Matthew Wilcox ("New page table range
API").
- A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
using page->private on tail pages for THP_SWAP + cleanups").
- Cleanups and speedups to the hugetlb fault handling from Matthew
Wilcox ("Change calling convention for ->huge_fault").
- Matthew Wilcox has also done some maintenance work on the MM subsystem
documentation ("Improve mm documentation").
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO1JUQAKCRDdBJ7gKXxA
jrMwAP47r/fS8vAVT3zp/7fXmxaJYTK27CTAM881Gw1SDhFM/wEAv8o84mDenCg6
Nfio7afS1ncD+hPYT8947UnLxTgn+ww=
=Afws
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Some swap cleanups from Ma Wupeng ("fix WARN_ON in
add_to_avail_list")
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
reduces the special-case code for handling hugetlb pages in GUP. It
also speeds up GUP handling of transparent hugepages.
- Peng Zhang provides some maple tree speedups ("Optimize the fast path
of mas_store()").
- Sergey Senozhatsky has improved te performance of zsmalloc during
compaction (zsmalloc: small compaction improvements").
- Domenico Cerasuolo has developed additional selftest code for zswap
("selftests: cgroup: add zswap test program").
- xu xin has doe some work on KSM's handling of zero pages. These
changes are mainly to enable the user to better understand the
effectiveness of KSM's treatment of zero pages ("ksm: support
tracking KSM-placed zero-pages").
- Jeff Xu has fixes the behaviour of memfd's
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").
- David Howells has fixed an fscache optimization ("mm, netfs, fscache:
Stop read optimisation when folio removed from pagecache").
- Axel Rasmussen has given userfaultfd the ability to simulate memory
poisoning ("add UFFDIO_POISON to simulate memory poisoning with
UFFD").
- Miaohe Lin has contributed some routine maintenance work on the
memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
check").
- Peng Zhang has contributed some maintenance work on the maple tree
code ("Improve the validation for maple tree and some cleanup").
- Hugh Dickins has optimized the collapsing of shmem or file pages into
THPs ("mm: free retracted page table by RCU").
- Jiaqi Yan has a patch series which permits us to use the healthy
subpages within a hardware poisoned huge page for general purposes
("Improve hugetlbfs read on HWPOISON hugepages").
- Kemeng Shi has done some maintenance work on the pagetable-check code
("Remove unused parameters in page_table_check").
- More folioification work from Matthew Wilcox ("More filesystem folio
conversions for 6.6"), ("Followup folio conversions for zswap"). And
from ZhangPeng ("Convert several functions in page_io.c to use a
folio").
- page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").
- Baoquan He has converted some architectures to use the
GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert
architectures to take GENERIC_IOREMAP way").
- Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration").
- Better maple tree lockdep checking from Liam Howlett ("More strict
maple tree lockdep"). Liam also developed some efficiency
improvements ("Reduce preallocations for maple tree").
- Cleanup and optimization to the secondary IOMMU TLB invalidation,
from Alistair Popple ("Invalidate secondary IOMMU TLB on permission
upgrade").
- Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
for arm64").
- Kemeng Shi provides some maintenance work on the compaction code
("Two minor cleanups for compaction").
- Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle
most file-backed faults under the VMA lock").
- Aneesh Kumar contributes code to use the vmemmap optimization for DAX
on ppc64, under some circumstances ("Add support for DAX vmemmap
optimization for ppc64").
- page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
data in page_ext"), ("minor cleanups to page_ext header").
- Some zswap cleanups from Johannes Weiner ("mm: zswap: three
cleanups").
- kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").
- VMA handling cleanups from Kefeng Wang ("mm: convert to
vma_is_initial_heap/stack()").
- DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
address ranges and DAMON monitoring targets").
- Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").
- Liam Howlett has improved the maple tree node replacement code
("maple_tree: Change replacement strategy").
- ZhangPeng has a general code cleanup - use the K() macro more widely
("cleanup with helper macro K()").
- Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for
memmap on memory feature on ppc64").
- pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
in page_alloc"), ("Two minor cleanups for get pageblock
migratetype").
- Vishal Moola introduces a memory descriptor for page table tracking,
"struct ptdesc" ("Split ptdesc from struct page").
- memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
for vm.memfd_noexec").
- MM include file rationalization from Hugh Dickins ("arch: include
asm/cacheflush.h in asm/hugetlb.h").
- THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
output").
- kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
object_cache instead of kmemleak_initialized").
- More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
and _folio_order").
- A VMA locking scalability improvement from Suren Baghdasaryan
("Per-VMA lock support for swap and userfaults").
- pagetable handling cleanups from Matthew Wilcox ("New page table
range API").
- A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
using page->private on tail pages for THP_SWAP + cleanups").
- Cleanups and speedups to the hugetlb fault handling from Matthew
Wilcox ("Change calling convention for ->huge_fault").
- Matthew Wilcox has also done some maintenance work on the MM
subsystem documentation ("Improve mm documentation").
* tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (489 commits)
maple_tree: shrink struct maple_tree
maple_tree: clean up mas_wr_append()
secretmem: convert page_is_secretmem() to folio_is_secretmem()
nios2: fix flush_dcache_page() for usage from irq context
hugetlb: add documentation for vma_kernel_pagesize()
mm: add orphaned kernel-doc to the rst files.
mm: fix clean_record_shared_mapping_range kernel-doc
mm: fix get_mctgt_type() kernel-doc
mm: fix kernel-doc warning from tlb_flush_rmaps()
mm: remove enum page_entry_size
mm: allow ->huge_fault() to be called without the mmap_lock held
mm: move PMD_ORDER to pgtable.h
mm: remove checks for pte_index
memcg: remove duplication detection for mem_cgroup_uncharge_swap
mm/huge_memory: work on folio->swap instead of page->private when splitting folio
mm/swap: inline folio_set_swap_entry() and folio_swap_entry()
mm/swap: use dedicated entry for swap in folio
mm/swap: stop using page->private on tail pages for THP_SWAP
selftests/mm: fix WARNING comparing pointer to 0
selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check
...
In hw_breakpoint_control(), encode_ctrl_reg() has already encoded the
MWPnCFG3_LoadEn/MWPnCFG3_StoreEn bits in info->ctrl. We don't need to
add (1 << MWPnCFG3_LoadEn | 1 << MWPnCFG3_StoreEn) unconditionally.
Otherwise we can't set read watchpoint and write watchpoint separately.
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This is a port of commit 379eb01c21 ("riscv: Ensure the value
of FP registers in the core dump file is up to date").
The values of FP/SIMD registers in the core dump file come from the
thread.fpu. However, kernel saves the FP/SIMD registers only before
scheduling out the process. If no process switch happens during the
exception handling, kernel will not have a chance to save the latest
values of FP/SIMD registers. So it may cause their values in the core
dump file incorrect. To solve this problem, force fpr_get()/simd_get()
to save the FP/SIMD registers into the thread.fpu if the target task
equals the current task.
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The initial aim is to silence the following objtool warning:
arch/loongarch/kernel/process.o: warning: objtool: arch_cpu_idle_dead() falls through to next function start_thread()
According to tools/objtool/Documentation/objtool.txt, this is because
the last instruction of arch_cpu_idle_dead() is a call to a noreturn
function play_dead(). In order to silence the warning, one simple way
is to add the noreturn function play_dead() to objtool's hard-coded
global_noreturns array, that is to say, just put "NORETURN(play_dead)"
into tools/objtool/noreturns.h, it works well.
But I noticed that play_dead() is only defined once and only called by
arch_cpu_idle_dead(), so put the body of play_dead() into the caller
arch_cpu_idle_dead(), then remove the noreturn function play_dead() is
an alternative way which can reduce the overhead of the function call
at the same time.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add identifier names to arguments of die() declaration in ptrace.h
to fix the following checkpatch warnings:
WARNING: function definition argument 'const char *' should also have an identifier name
WARNING: function definition argument 'struct pt_regs *' should also have an identifier name
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
After the call to oops_exit(), it should not panic or execute
the crash kernel if the oops is to be suppressed.
Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
If notify_die() returns NOTIFY_STOP, honor the return value from the
handler chain invocation in die() and return without killing the task
as, through a debugger, the fault may have been fixed. It makes sense
even if ignoring the event will make the system unstable: by allowing
access through a debugger it has been compromised already anyway. It
makes our port consistent with x86, arm64, riscv and csky.
Commit 20c0d2d440 ("[PATCH] i386: pass proper trap numbers to die
chain handlers") may be the earliest of similar changes.
Link: https://lore.kernel.org/r/43DDF02E.76F0.0078.0@novell.com/
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
All *.S files under arch/loongarch/ have been converted to include
<linux/export.h> instead of <asm/export.h>.
Remove <asm/export.h>.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Commit ddb5cdbafa ("kbuild: generate KSYMTAB entries by modpost")
deprecated <asm/export.h>, which is now a wrapper of <linux/export.h>.
Replace #include <asm/export.h> with #include <linux/export.h>.
After all the <asm/export.h> lines are converted, <asm/export.h> and
<asm-generic/export.h> will be removed.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
There is no EXPORT_SYMBOL() line there, hence #include <asm/export.h>
is unneeded.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
As explained by Nick in the original issue: the kernel usually does a
good job of providing library helpers that have similar semantics as
their ordinary userspace libc equivalents, but -ffreestanding disables
such libcall optimization and other related features in the compiler,
which can lead to unexpected things such as CONFIG_FORTIFY_SOURCE not
working (!).
However, due to the desire for better control over unaligned accesses
with respect to CONFIG_ARCH_STRICT_ALIGN, and also for avoiding the
GCC bug https://gcc.gnu.org/PR109465, we do want to still disable
optimizations for the memory libcalls (memcpy, memmove and memset for
now). Use finer-grained -fno-builtin-* toggles to achieve this without
losing source fortification and other libcall optimizations.
Closes: https://github.com/ClangBuiltLinux/linux/issues/1897
Reported-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
In drivers/Kconfig, drivers/firmware/Kconfig is sourced for all ports so
there is no need to source it in the port-specific Kconfig file. And
sourcing it here also caused the "Firmware Drivers" menu appeared two
times: one in the "Device Drivers" menu, another in the toplevel menu.
This is really puzzling so remove it.
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Move the default (no-op) implementation of flush_icache_pages() to
<linux/cacheflush.h> from <asm-generic/cacheflush.h>. Remove the
flush_icache_page() wrapper from each architecture into
<linux/cacheflush.h>.
Link: https://lkml.kernel.org/r/20230802151406.3735276-32-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add update_mmu_cache_range() and change _PFN_SHIFT to PFN_PTE_SHIFT. It
would probably be more efficient to implement __update_tlb() by flushing
the entire folio instead of calling __update_tlb() N times, but I'll leave
that for someone who understands the architecture better.
Link: https://lkml.kernel.org/r/20230802151406.3735276-15-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmTiDvweHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG7doH/2Poj73npPKPVfT3
RF8AsgvAj2pLby67rdvwTdFX+exS63SsuwtGGRfAHfGabiwmNN+oT2dLb0aY15bp
nskHnFpcqQ/pfZ2i2rUCenzBX8S9QULvPidLKRaf1FcSdOzqd97Bw5oDPMtzqy/R
Pm+Dzs//7fYvtm69nt6hKW4d6wXxNcg7Fk/QgoJ5Ax9vGvDuZmWXH0ZgBf/5kH04
TTPQNtVX57lf+FHugkhFEn4JbYXvN168b+LuX2PHwOeG/8AIS69Hc0vgvhHNAycT
mmpUI1gWA2jfrJ2RCyyezF/6wy9Ocsp+CbPjfwjuRUxOk0XIm1+cp9Mlz/cRbMsZ
f0tOTpk=
=xrJp
-----END PGP SIGNATURE-----
BackMerge tag 'v6.5-rc7' into drm-next
Linux 6.5-rc7
This is needed for the CI stuff and the msm pull has fixes in it.
Signed-off-by: Dave Airlie <airlied@redhat.com>
As part of the conversions to replace pgtable constructor/destructors with
ptdesc equivalents, convert various page table functions to use ptdescs.
Some of the functions use the *get*page*() helper functions. Convert
these to use pagetable_alloc() and ptdesc_address() instead to help
standardize page tables further.
Link: https://lkml.kernel.org/r/20230807230513.102486-22-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Guo Ren <guoren@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The APIs that allow backtracing across CPUs have always had a way to
exclude the current CPU. This convenience means callers didn't need to
find a place to allocate a CPU mask just to handle the common case.
Let's extend the API to take a CPU ID to exclude instead of just a
boolean. This isn't any more complex for the API to handle and allows the
hardlockup detector to exclude a different CPU (the one it already did a
trace for) without needing to find space for a CPU mask.
Arguably, this new API also encourages safer behavior. Specifically if
the caller wants to avoid tracing the current CPU (maybe because they
already traced the current CPU) this makes it more obvious to the caller
that they need to make sure that the current CPU ID can't change.
[akpm@linux-foundation.org: fix trigger_allbutcpu_cpu_backtrace() stub]
Link: https://lkml.kernel.org/r/20230804065935.v4.1.Ia35521b91fc781368945161d7b28538f9996c182@changeid
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The kexec and crash kernel options are provided in the common
kernel/Kconfig.kexec. Utilize the common options and provide
the ARCH_SUPPORTS_ and ARCH_SELECTS_ entries to recreate the
equivalent set of KEXEC and CRASH options.
Link: https://lkml.kernel.org/r/20230712161545.87870-7-eric.devolder@oracle.com
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Arm disabled hugetlb vmemmap optimization [1] because hugetlb vmemmap
optimization includes an update of both the permissions (writeable to
read-only) and the output address (pfn) of the vmemmap ptes. That is not
supported without unmapping of pte(marking it invalid) by some
architectures.
With DAX vmemmap optimization we don't require such pte updates and
architectures can enable DAX vmemmap optimization while having hugetlb
vmemmap optimization disabled. Hence split DAX optimization support into
a different config.
s390, loongarch and riscv don't have devdax support. So the DAX config is
not enabled for them. With this change, arm64 should be able to select
DAX optimization
[1] commit 060a2c92d1 ("arm64: mm: hugetlb: Disable HUGETLB_PAGE_OPTIMIZE_VMEMMAP")
Link: https://lkml.kernel.org/r/20230724190759.483013-8-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm: ioremap: Convert architectures to take GENERIC_IOREMAP
way", v8.
Motivation and implementation:
==============================
Currently, many architecutres have't taken the standard GENERIC_IOREMAP
way to implement ioremap_prot(), iounmap(), and ioremap_xx(), but make
these functions specifically under each arch's folder. Those cause many
duplicated code of ioremap() and iounmap().
In this patchset, firstly introduce generic_ioremap_prot() and
generic_iounmap() to extract the generic code for GENERIC_IOREMAP. By
taking GENERIC_IOREMAP method, the generic generic_ioremap_prot(),
generic_iounmap(), and their generic wrapper ioremap_prot(), ioremap() and
iounmap() are all visible and available to arch. Arch needs to provide
wrapper functions to override the generic version if there's arch specific
handling in its corresponding ioremap_prot(), ioremap() or iounmap().
With these changes, duplicated ioremap/iounmap() code uder ARCH-es are
removed, and the equivalent functioality is kept as before.
Background info:
================
1) The converting more architectures to take GENERIC_IOREMAP way is
suggested by Christoph in below discussion:
https://lore.kernel.org/all/Yp7h0Jv6vpgt6xdZ@infradead.org/T/#u
2) In the previous v1 to v3, it's basically further action after arm64
has converted to GENERIC_IOREMAP way in below patchset. It's done by
adding hook ioremap_allowed() and iounmap_allowed() in ARCH to add ARCH
specific handling the middle of ioremap_prot() and iounmap().
[PATCH v5 0/6] arm64: Cleanup ioremap() and support ioremap_prot()
https://lore.kernel.org/all/20220607125027.44946-1-wangkefeng.wang@huawei.com/T/#u
Later, during v3 reviewing, Christophe Leroy suggested to introduce
generic_ioremap_prot() and generic_iounmap() to generic codes, and ARCH
can provide wrapper function ioremap_prot(), ioremap() or iounmap() if
needed. Christophe made a RFC patchset as below to specially demonstrate
his idea. This is what v4 and now v5 is doing.
[RFC PATCH 0/8] mm: ioremap: Convert architectures to take GENERIC_IOREMAP way
https://lore.kernel.org/all/cover.1665568707.git.christophe.leroy@csgroup.eu/T/#u
Testing:
========
In v8, I only applied this patchset onto the latest linus's tree to build
and run on arm64 and s390.
This patch (of 19):
Let's use '#define ioremap_xx' and "#ifdef ioremap_xx" instead.
To remove defined ARCH_HAS_IOREMAP_xx macros in <asm/io.h> of each ARCH,
the ARCH's own ioremap_wc|wt|np definition need be above "#include
<asm-generic/iomap.h>. Otherwise the redefinition error would be seen
during compiling. So the relevant adjustments are made to avoid compiling
error:
loongarch:
- doesn't include <asm-generic/iomap.h>, defining ARCH_HAS_IOREMAP_WC
is redundant, so simply remove it.
m68k:
- selected GENERIC_IOMAP, <asm-generic/iomap.h> has been added in
<asm-generic/io.h>, and <asm/kmap.h> is included above
<asm-generic/iomap.h>, so simply remove ARCH_HAS_IOREMAP_WT defining.
mips:
- move "#include <asm-generic/iomap.h>" below ioremap_wc definition
in <asm/io.h>
powerpc:
- remove "#include <asm-generic/iomap.h>" in <asm/io.h> because it's
duplicated with the one in <asm-generic/io.h>, let's rely on the
latter.
x86:
- selected GENERIC_IOMAP, remove #include <asm-generic/iomap.h> in
the middle of <asm/io.h>. Let's rely on <asm-generic/io.h>.
Link: https://lkml.kernel.org/r/20230706154520.11257-2-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Helge Deller <deller@gmx.de>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit a2225d931f ("autofs: remove left-over autofs4 stubs")
promised the removal of the fs/autofs/Kconfig fragment for AUTOFS4_FS
within a couple of releases, but five years later this still has not
happened yet, and AUTOFS4_FS is still enabled in 63 defconfigs.
Get rid of it mechanically:
git grep -l CONFIG_AUTOFS4_FS -- '*defconfig' |
xargs sed -i 's/AUTOFS4_FS/AUTOFS_FS/'
Also just remove the AUTOFS4_FS config option stub. Anybody who hasn't
regenerated their config file in the last five years will need to just
get the new name right when they do.
Signed-off-by: Sven Joachim <svenjoac@gmx.de>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the current configuration, cpu_has_lsx and cpu_has_lasx cannot be
constants. So cleanup the __builtin_constant_p() checking to reduce the
complexity.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
As the code comment says, the initial aim is to reduce one instruction
in some corner cases, if bit[51:31] is all 0 or all 1, no need to call
lu32id. That is to say, it should call lu32id only if bit[51:31] is not
all 0 and not all 1. The current code always call lu32id, the result is
right but the logic is unexpected and wrong, fix it.
Cc: stable@vger.kernel.org # 6.1
Fixes: 5dc615520c ("LoongArch: Add BPF JIT support")
Reported-by: Colin King (gmail) <colin.i.king@gmail.com>
Closes: https://lore.kernel.org/all/bcf97046-e336-712a-ac68-7fd194f2953e@gmail.com/
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Currently nettrace does not work on LoongArch due to missing
bpf_probe_read{,str}() support, with the error message:
ERROR: failed to load kprobe-based eBPF
ERROR: failed to load kprobe-based bpf
According to commit 0ebeea8ca8 ("bpf: Restrict bpf_probe_read{,
str}() only to archs where they work"), we only need to select
CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE to add said support,
because LoongArch does have non-overlapping address ranges for kernel
and userspace.
Cc: stable@vger.kernel.org # 6.1
Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This patch fixes an underflow issue in the return value within the
exception path, specifically at .Llt8 when the remaining length is less
than 8 bytes.
Cc: stable@vger.kernel.org
Fixes: 8941e93ca5 ("LoongArch: Optimize memory ops (memset/memcpy/memmove)")
Reported-by: Weihao Li <liweihao@loongson.cn>
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
On FDT systems these command line processing are already taken care of
by early_init_dt_scan_chosen(). Add similar handling to the ACPI (non-
FDT) code path to allow these config options to work for ACPI (non-FDT)
systems too.
Signed-off-by: Zhihong Dong <donmor3000@hotmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Binutils 2.41 enables linker relaxation by default, but the kernel
module loader doesn't support that, so just disable it. Otherwise we
get such an error when loading modules:
"Unknown relocation type 102"
As an alternative, we could add linker relaxation support in the kernel
module loader. But it is relatively large complexity that may or may not
bring a similar gain, and we don't really want to include this linker
pass in the kernel.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This is a port of commit 4fe4a6374c ("MIPS: Only fiddle with
CHECKFLAGS if `need-compiler'") to LoongArch.
We have originally guarded fiddling with CHECKFLAGS in our arch Makefile
by checking for the CONFIG_LOONGARCH variable, not set for targets such
as `distclean', etc. that neither include `.config' nor use the compiler.
Starting from commit 805b2e1d42 ("kbuild: include Makefile.compiler
only when compiler is needed") we have had a generic `need-compiler'
variable explicitly telling us if the compiler will be used and thus its
capabilities need to be checked and expressed in the form of compilation
flags. If this variable is not set, then `make' functions such as
`cc-option' are undefined, causing all kinds of weirdness to happen if
we expect specific results to be returned.
It doesn't cause problems on LoongArch now. But as a guard we replace
the check for CONFIG_LOONGARCH with one for `need-compiler' instead, so
as to prevent the compiler from being ever called for CHECKFLAGS when
not needed.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The passed parameter to sysrq handlers is a key (a character). So change
the type from 'int' to 'u8'. Let it specifically be 'u8' for two
reasons:
* unsigned: unsigned values come from the upper layers (devices) and the
tty layer assumes unsigned on most places, and
* 8-bit: as that what's supposed to be one day in all the layers built
on the top of tty. (Currently, we use mostly 'unsigned char' and
somewhere still only 'char'. (But that also translates to the former
thanks to -funsigned-char.))
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Zqiang <qiang.zhang1211@gmail.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de> # DRM
Acked-by: WANG Xuerui <git@xen0n.name> # loongarch
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20230712081811.29004-3-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
UAPI Changes:
* fbdev:
* Make fbdev userspace interfaces optional; only leaves the
framebuffer console active
* prime:
* Support dma-buf self-import for all drivers automatically: improves
support for many userspace compositors
Cross-subsystem Changes:
* backlight:
* Fix interaction with fbdev in several drivers
* base: Convert struct platform.remove to return void; part of a larger,
tree-wide effort
* dma-buf: Acquire reservation lock for mmap() in exporters; part
of an on-going effort to simplify locking around dma-bufs
* fbdev:
* Use Linux device instead of fbdev device in many places
* Use deferred-I/O helper macros in various drivers
* i2c: Convert struct i2c from .probe_new to .probe; part of a larger,
tree-wide effort
* video:
* Avoid including <linux/screen_info.h>
Core Changes:
* atomic:
* Improve logging
* prime:
* Remove struct drm_driver.gem_prime_mmap plus driver updates: all
drivers now implement this callback with drm_gem_prime_mmap()
* gem:
* Support execution contexts: provides locking over multiple GEM
objects
* ttm:
* Support init_on_free
* Swapout fixes
Driver Changes:
* accel:
* ivpu: MMU updates; Support debugfs
* ast:
* Improve device-model detection
* Cleanups
* bridge:
* dw-hdmi: Improve support for YUV420 bus format
* dw-mipi-dsi: Fix enable/disable of DSI controller
* lt9611uxc: Use MODULE_FIRMWARE()
* ps8640: Remove broken EDID code
* samsung-dsim: Fix command transfer
* tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups
* Cleanups
* ingenic:
* Kconfig REGMAP fixes
* loongson:
* Support display controller
* mgag200:
* Minor fixes
* mxsfb:
* Support disabling overlay planes
* nouveau:
* Improve VRAM detection
* Various fixes and cleanups
* panel:
* panel-edp: Support AUO B116XAB01.4
* Support Visionox R66451 plus DT bindings
* Cleanups
* ssd130x:
* Support per-controller default resolution plus DT bindings
* Reduce memory-allocation overhead
* Cleanups
* tidss:
* Support TI AM625 plus DT bindings
* Implement new connector model plus driver updates
* vkms
* Improve write-back support
* Documentation fixes
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmSvvRAACgkQaA3BHVML
eiNpGQgAs8jq1XjN9t8jZsdgXnoCbkZyVUI2NO0HwoVwpRCLgbXp5AX5qq2oRciE
TBhe4Fceh/ZsYqHTZQahnguxgRKM5JgXwbI4Z0iiOVcqasNbycaKAqipxJJ7kdo1
qPhGCbgQFVX7oIq2xjfXehh6O0SYX+R9r88X8dMJxMYv/pcLwOHG74kS040WOcQq
uATgcnobOf/D8ZmlqvfKGAeTUoFo/RSR2Uhlauka58qgeUbicrTELZT2barY9d+k
as6U5vv4wx2zMklTkjrlkMpAT1ZpbB9d3jGHwL27VEnjlfd3wV2bdH7Dzn9qZRf/
gn0ALg/b3u5yBWk/k7YBvijXyNcH6Q==
=bBuG
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2023-07-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.6:
UAPI Changes:
* fbdev:
* Make fbdev userspace interfaces optional; only leaves the
framebuffer console active
* prime:
* Support dma-buf self-import for all drivers automatically: improves
support for many userspace compositors
Cross-subsystem Changes:
* backlight:
* Fix interaction with fbdev in several drivers
* base: Convert struct platform.remove to return void; part of a larger,
tree-wide effort
* dma-buf: Acquire reservation lock for mmap() in exporters; part
of an on-going effort to simplify locking around dma-bufs
* fbdev:
* Use Linux device instead of fbdev device in many places
* Use deferred-I/O helper macros in various drivers
* i2c: Convert struct i2c from .probe_new to .probe; part of a larger,
tree-wide effort
* video:
* Avoid including <linux/screen_info.h>
Core Changes:
* atomic:
* Improve logging
* prime:
* Remove struct drm_driver.gem_prime_mmap plus driver updates: all
drivers now implement this callback with drm_gem_prime_mmap()
* gem:
* Support execution contexts: provides locking over multiple GEM
objects
* ttm:
* Support init_on_free
* Swapout fixes
Driver Changes:
* accel:
* ivpu: MMU updates; Support debugfs
* ast:
* Improve device-model detection
* Cleanups
* bridge:
* dw-hdmi: Improve support for YUV420 bus format
* dw-mipi-dsi: Fix enable/disable of DSI controller
* lt9611uxc: Use MODULE_FIRMWARE()
* ps8640: Remove broken EDID code
* samsung-dsim: Fix command transfer
* tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups
* Cleanups
* ingenic:
* Kconfig REGMAP fixes
* loongson:
* Support display controller
* mgag200:
* Minor fixes
* mxsfb:
* Support disabling overlay planes
* nouveau:
* Improve VRAM detection
* Various fixes and cleanups
* panel:
* panel-edp: Support AUO B116XAB01.4
* Support Visionox R66451 plus DT bindings
* Cleanups
* ssd130x:
* Support per-controller default resolution plus DT bindings
* Reduce memory-allocation overhead
* Cleanups
* tidss:
* Support TI AM625 plus DT bindings
* Implement new connector model plus driver updates
* vkms
* Improve write-back support
* Documentation fixes
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230713090830.GA23281@linux-uq9g
The x86 Shadow stack feature includes a new type of memory called shadow
stack. This shadow stack memory has some unusual properties, which requires
some core mm changes to function properly.
One of these unusual properties is that shadow stack memory is writable,
but only in limited ways. These limits are applied via a specific PTE
bit combination. Nevertheless, the memory is writable, and core mm code
will need to apply the writable permissions in the typical paths that
call pte_mkwrite(). The goal is to make pte_mkwrite() take a VMA, so
that the x86 implementation of it can know whether to create regular
writable or shadow stack mappings.
But there are a couple of challenges to this. Modifying the signatures of
each arch pte_mkwrite() implementation would be error prone because some
are generated with macros and would need to be re-implemented. Also, some
pte_mkwrite() callers operate on kernel memory without a VMA.
So this can be done in a three step process. First pte_mkwrite() can be
renamed to pte_mkwrite_novma() in each arch, with a generic pte_mkwrite()
added that just calls pte_mkwrite_novma(). Next callers without a VMA can
be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all callers
can be changed to take/pass a VMA.
Start the process by renaming pte_mkwrite() to pte_mkwrite_novma() and
adding the pte_mkwrite() wrapper in linux/pgtable.h. Apply the same
pattern for pmd_mkwrite(). Since not all archs have a pmd_mkwrite_novma(),
create a new arch config HAS_HUGE_PAGE that can be used to tell if
pmd_mkwrite() should be defined. Otherwise in the !HAS_HUGE_PAGE cases the
compiler would not be able to find pmd_mkwrite_novma().
No functional change.
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/lkml/CAHk-=wiZjSu7c9sFYZb3q04108stgHff2wfbokGCCgW7riz+8Q@mail.gmail.com/
Link: https://lore.kernel.org/all/20230613001108.3040476-2-rick.p.edgecombe%40intel.com
The header file <linux/efi.h> does not need anything from
<linux/screen_info.h>. Declare struct screen_info and remove
the include statements. Update a number of source files that
require struct screen_info's definition.
v2:
* update loongarch (Jingfeng)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Sui Jingfeng <suijingfeng@loongson.cn>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230706104852.27451-2-tzimmermann@suse.de
These are cleanups for architecture specific header files:
- the comments in include/linux/syscalls.h have gone out of sync
and are really pointless, so these get removed
- The asm/bitsperlong.h header no longer needs to be architecture
specific on modern compilers, so use a generic version for newer
architectures that use new enough userspace compilers
- A cleanup for virt_to_pfn/virt_to_bus to have proper type
checking, forcing the use of pointers
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmSl138ACgkQYKtH/8kJ
UieqWxAA2WjNVfyuieYckglOVE0PZPs2fzCwyzTY5iUTH3gE5cBFWJDWcg2EnouG
v3X3htEQcowYWaCF9+rypQXaGiSx4WXi2Bjxnz3D/BcreqWPI4eSQ0fpGG5SURTY
2zYF72GTt4JGR++l+7/R9MZwPbwYDT9BsD5tkel8PxnyVLM6/c5xFvbjzRSKFE8x
SMN1jGZ62ITLNf/8coAOEPNxBYtDT6yQyu7P2sx5cd65LAQq9yLKjFklnBBovgWT
OoCIZAdGkhcNwOh1LjyHcdNdpfNJGceKyqKPqty07IhCQuF2jxiyFYFzuBbeyQfE
S0itN8o/MIfUmxaQl3e8dPAVb1RlNVr1zfQ6y4tUtWNdkNL2WwSnSQSRHrBfHxCQ
QCF++PMeFcLhGwMYtqdNJ7XGLQ0PsjD74pRf0vo+vjmqDk2BJsJBP57VU+8MJn5r
SoxqnJ0WxLvm1TfrNKusV7zMNWquc2duJDW40zsOssP4itjYELSI6qa56qmzlqmX
zKmRx6mxAlx9RRK8FHXFYHbz3p93vv8z9vTOZV3AjIjjED960CLknUAwCC8FoJyz
9b5wyMXsLQHQjGt8luAvPc6OiU0EiU9a4SPK+feWcv27serFvnjJlRTS/yG2Z3zd
BYsUgsXHypsdoud+aE7MeCy7fE8n3mhoyMQQRBkOMFJ7RsG6wAE=
=S/he
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"These are cleanups for architecture specific header files:
- the comments in include/linux/syscalls.h have gone out of sync and
are really pointless, so these get removed
- The asm/bitsperlong.h header no longer needs to be architecture
specific on modern compilers, so use a generic version for newer
architectures that use new enough userspace compilers
- A cleanup for virt_to_pfn/virt_to_bus to have proper type checking,
forcing the use of pointers"
* tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
syscalls: Remove file path comments from headers
tools arch: Remove uapi bitsperlong.h of hexagon and microblaze
asm-generic: Unify uapi bitsperlong.h for arm64, riscv and loongarch
m68k/mm: Make pfn accessors static inlines
arm64: memory: Make virt_to_pfn() a static inline
ARM: mm: Make virt_to_pfn() a static inline
asm-generic/page.h: Make pfn accessors static inlines
xen/netback: Pass (void *) to virt_to_page()
netfs: Pass a pointer to virt_to_page()
cifs: Pass a pointer to virt_to_page() in cifsglob
cifs: Pass a pointer to virt_to_page()
riscv: mm: init: Pass a pointer to virt_to_page()
ARC: init: Pass a pointer to virt_to_pfn() in init
m68k: Pass a pointer to virt_to_pfn() virt_to_page()
fs/proc/kcore.c: Pass a pointer to virt_addr_valid()
- Add new feature to have function graph tracer record the return value.
Adds a new option: funcgraph-retval ; when set, will show the return
value of a function in the function graph tracer.
- Also add the option: funcgraph-retval-hex where if it is not set, and
the return value is an error code, then it will return the decimal of
the error code, otherwise it still reports the hex value.
- Add the file /sys/kernel/tracing/osnoise/per_cpu/cpu<cpu>/timerlat_fd
That when a application opens it, it becomes the task that the timer lat
tracer traces. The application can also read this file to find out how
it's being interrupted.
- Add the file /sys/kernel/tracing/available_filter_functions_addrs
that works just the same as available_filter_functions but also shows
the addresses of the functions like kallsyms, except that it gives the
address of where the fentry/mcount jump/nop is. This is used by BPF to
make it easier to attach BPF programs to ftrace hooks.
- Replace strlcpy with strscpy in the tracing boot code.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZJy6ixQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qnzRAPsEI2YgjaJSHnuPoGRHbrNil6pq66wY
LYaLizGI4Jv9BwEAqdSdcYcMiWo1SFBAO8QxEDM++BX3zrRyVgW8ahaTNgs=
=TF0C
-----END PGP SIGNATURE-----
Merge tag 'trace-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
- Add new feature to have function graph tracer record the return
value. Adds a new option: funcgraph-retval ; when set, will show the
return value of a function in the function graph tracer.
- Also add the option: funcgraph-retval-hex where if it is not set, and
the return value is an error code, then it will return the decimal of
the error code, otherwise it still reports the hex value.
- Add the file /sys/kernel/tracing/osnoise/per_cpu/cpu<cpu>/timerlat_fd
That when a application opens it, it becomes the task that the timer
lat tracer traces. The application can also read this file to find
out how it's being interrupted.
- Add the file /sys/kernel/tracing/available_filter_functions_addrs
that works just the same as available_filter_functions but also shows
the addresses of the functions like kallsyms, except that it gives
the address of where the fentry/mcount jump/nop is. This is used by
BPF to make it easier to attach BPF programs to ftrace hooks.
- Replace strlcpy with strscpy in the tracing boot code.
* tag 'trace-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix warnings when building htmldocs for function graph retval
riscv: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
tracing/boot: Replace strlcpy with strscpy
tracing/timerlat: Add user-space interface
tracing/osnoise: Skip running osnoise if all instances are off
tracing/osnoise: Switch from PF_NO_SETAFFINITY to migrate_disable
ftrace: Show all functions with addresses in available_filter_functions_addrs
selftests/ftrace: Add funcgraph-retval test case
LoongArch: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
x86/ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
arm64: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
tracing: Add documentation for funcgraph-retval and funcgraph-retval-hex
function_graph: Support recording and printing the return value of function
fgraph: Add declaration of "struct fgraph_ret_regs"
1, Preliminary ClangBuiltLinux enablement;
2, Add support to clone a time namespace;
3, Add vector extensions support;
4, Add SMT (Simultaneous Multi-Threading) support;
5, Support dbar with different hints;
6, Introduce hardware page table walker;
7, Add jump-label implementation;
8, Add rethook and uprobes support;
9, Some bug fixes and other small changes.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmSdmI4WHGNoZW5odWFj
YWlAa2VybmVsLm9yZwAKCRAChivD8uImeoeDEACd+KFZnQrX6fwpohuxWgQ46YSk
bmKRnVCVg6jg/yL99WTHloaubMbyncgNL7YNvCRmuXcTQdjFP1zFb3q3ZqxTkaOT
Kg9EO4R5H8U2Wolz3uqcvTBbPxv6bwDor1gBWzVo8RTO4S7gYt5tLS7pvLiYPWzp
Jhgko2AHE/Y02Qqg00ARLIzDDLMm9vR5Gdmpj2jhl8wMHMNaqMW5E0r7XaiFoav0
G1PjIAk+0LIj9QHYUm5e0kcXHh0KzgUMG0LUDawbAanZn2r1KhAk0ouwXX/eu9J7
NQQRDi1Z02pTI5X3V1VAf+0O7RGnGGnWb/r2K76nr7lZNp88RJfbLtgBM01pzw2U
NTnIAP7cAomNDaBglzAuS17yeFTS953nxaQlR8/t3UefP+fHmiIJyOOlxrXFMwVM
jOfW4JAIkcl5DD/8l9lU1t91+zyKMrjsv8IrlGPW1sLsr/UOQjBajdHwnH8veC41
mL+xjiMb51g33JDRDAl6mloxXms9LvcNzbnKSwqVO1i23GkN1iHJmbq66Teut07C
7AH2qM3tSAuiXmNF1ntvodK1ukLS8moOiP8ZuuHfS13rr7gxPyRAvYdBiDaNEhF2
gCYYpIcaly+5rSf6wvDXgl/s3tS4o07AzDfpttyH5jYnY80nVj7CgS8vUU/mg1lW
QpapBnBHU8wrz+eBqQ==
=3gt3
-----END PGP SIGNATURE-----
Merge tag 'loongarch-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- preliminary ClangBuiltLinux enablement
- add support to clone a time namespace
- add vector extensions support
- add SMT (Simultaneous Multi-Threading) support
- support dbar with different hints
- introduce hardware page table walker
- add jump-label implementation
- add rethook and uprobes support
- some bug fixes and other small changes
* tag 'loongarch-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (28 commits)
LoongArch: Remove five DIE_* definitions in kdebug.h
LoongArch: Add uprobes support
LoongArch: Use larch_insn_gen_break() for kprobes
LoongArch: Add larch_insn_gen_break() to generate break insns
LoongArch: Check for AMO instructions in insns_not_supported()
LoongArch: Move three functions from kprobes.c to inst.c
LoongArch: Replace kretprobe with rethook
LoongArch: Add jump-label implementation
LoongArch: Select HAVE_DEBUG_KMEMLEAK to support kmemleak
LoongArch: Export some arch-specific pm interfaces
LoongArch: Introduce hardware page table walker
LoongArch: Support dbar with different hints
LoongArch: Add SMT (Simultaneous Multi-Threading) support
LoongArch: Add vector extensions support
LoongArch: Add support to clone a time namespace
Makefile: Add loongarch target flag for Clang compilation
LoongArch: Mark Clang LTO as working
LoongArch: Include KBUILD_CPPFLAGS in CHECKFLAGS invocation
LoongArch: vDSO: Use CLANG_FLAGS instead of filtering out '--target='
LoongArch: Tweak CFLAGS for Clang compatibility
...
core:
- replace strlcpy with strscpy
- EDID changes to support further conversion to struct drm_edid
- Move i915 DSC parameter code to common DRM helpers
- Add Colorspace functionality
aperture:
- ignore framebuffers with non-primary devices
fbdev:
- use fbdev i/o helpers
- add Kconfig options for fb_ops helpers
- use new fb io helpers directly in drivers
sysfs:
- export DRM connector ID
scheduler:
- Avoid an infinite loop
ttm:
- store function table in .rodata
- Add query for TTM mem limit
- Add NUMA awareness to pools
- Export ttm_pool_fini()
bridge:
- fsl-ldb: support i.MX6SX
- lt9211, lt9611: remove blanking packets
- tc358768: implement input bus formats, devm cleanups
- ti-snd65dsi86: implement wait_hpd_asserted
- analogix: fix endless probe loop
- samsung-dsim: support swapped clock, fix enabling, support var clock
- display-connector: Add support for external power supply
- imx: Fix module linking
- tc358762: Support reset GPIO
panel:
- nt36523: Support Lenovo J606F
- st7703: Support Anbernic RG353V-V2
- InnoLux G070ACE-L01 support
- boe-tv101wum-nl6: Improve initialization
- sharp-ls043t1le001: Mode fixes
- simple: BOE EV121WXM-N10-1850, S6D7AA0
- Ampire AM-800480L1TMQW-T00H
- Rocktech RK043FN48H
- Starry himax83102-j02
- Starry ili9882t
amdgpu:
- add new ctx query flag to handle reset better
- add new query/set shadow buffer for rdna3
- DCN 3.2/3.1.x/3.0.x updates
- Enable DC_FP on loongarch
- PCIe fix for RDNA2
- improve DC FAMS/SubVP support for better power management
- partition support for lots of engines
- Take NUMA into account when allocating memory
- Add new DRM_AMDGPU_WERROR config parameter to help with CI
- Initial SMU13 overdrive support
- Add support for new colorspace KMS API
- W=1 fixes
amdkfd:
- Query TTM mem limit rather than hardcoding it
- GC 9.4.3 partition support
- Handle NUMA for partitions
- Add debugger interface for enabling gdb
- Add KFD event age tracking
radeon:
- Fix possible UAF
i915:
- new getparam for PXP support
- GSC/MEI proxy driver
- Meteorlake display enablement
- avoid clearing preallocated framebuffers with TTM
- implement framebuffer mmap support
- Disable sampler indirect state in bindless heap
- Enable fdinfo for GuC backends
- GuC loading and firmware table handling fixes
- Various refactors for multi-tile enablement
- Define MOCS and PAT tables for MTL
- GSC/MEI support for Meteorlake
- PMU multi-tile support
- Large driver kernel doc cleanup
- Allow VRR toggling and arbitrary refresh rates
- Support async flips on linear buffers on display ver 12+
- Expose CRTC CTM property on ILK/SNB/VLV
- New debugfs for display clock frequencies
- Hotplug refactoring
- Display refactoring
- I915_GEM_CREATE_EXT_SET_PAT for Mesa on Meteorlake
- Use large rings for compute contexts
- HuC loading for MTL
- Allow user to set cache at BO creation
- MTL powermanagement enhancements
- Switch to dedicated workqueues to stop using flush_scheduled_work()
- Move display runtime init under display/
- Remove 10bit gamma on desktop gen3 parts, they don't support it
habanalabs:
- uapi: return 0 for user queries if there was a h/w or f/w error
- Add pci health check when we lose connection with the firmware. This can be used to
distinguish between pci link down and firmware getting stuck.
- Add more info to the error print when TPC interrupt occur.
- Firmware fixes
msm:
- Adreno A660 bindings
- SM8350 MDSS bindings fix
- Added support for DPU on sm6350 and sm6375 platforms
- Implemented tearcheck support to support vsync on SM150 and newer platforms
- Enabled missing features (DSPP, DSC, split display) on sc8180x, sc8280xp, sm8450
- Added support for DSI and 28nm DSI PHY on MSM8226 platform
- Added support for DSI on sm6350 and sm6375 platforms
- Added support for display controller on MSM8226 platform
- A690 GPU support
- Move cmdstream dumping out of fence signaling path
- a610 support
- Support for a6xx devices without GMU
nouveau:
- NULL ptr before deref fixes
armada:
- implement fbdev emulation as client
sun4i:
- fix mipi-dsi dotclock
- release clocks
vc4:
- rgb range toggle property
- BT601 / BT2020 HDMI support
vkms:
- convert to drmm helpers
- add reflection and rotation support
- fix rgb565 conversion
gma500:
- fix iomem access
shmobile:
- support renesas soc platform
- enable fbdev
mxsfb:
- Add support for i.MX93 LCDIF
stm:
- dsi: Use devm_ helper
- ltdc: Fix potential invalid pointer deref
renesas:
- Group drivers in renesas subdirectory to prepare for new platform
- Drop deprecated R-Car H3 ES1.x support
meson:
- Add support for MIPI DSI displays
virtio:
- add sync object support
mediatek:
- Add display binding document for MT6795
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmSc3UwACgkQDHTzWXnE
hr69fQ/+PF9L7FSB/qfjaoqJnk6wJyCehv7pDX2/UK7FUrW0e4EwNVx4KKIRqO/P
pKSU9wRlC72ViGgqOYnw0pwzuh45630vWo1stbgxipU2cvM6Ywlq8FiQFdymFe+P
tLYWe5MR55Y+E9Y+bCrKn2yvQ7v+f6EZ6ITIX7mrXL77Bpxhv58VzmZawkxmw5MV
vwhSqJaaeeWNoyfSIDdN8Oj9fE6ScTyiA0YisOP6jnK/TiQofXQxFrMIdKctCcoA
HjolfEEPVCDOSBipkV3hLiyN8lXmt47BmuHp9opSL/g1aASteVeD1/GrccTaA4xV
ah+Jx1hBLcH5sm8CZzbCcHhNu3ILnPCFZFCx8gwflQqmDIOZvoMdL75j7lgqJZG8
TePEiifG3kYO/ZiDc5TUBdeMfbgeehPOsxbvOlA3LxJrgyxe/5o9oejX2Uvvzhoq
9fno1PLqeCILqYaMiCocJwyTw/2VKYCCH7Wiypd4o3h0nmAbbqPT3KeZgNOjoa2X
GXpiIU9rTQ8LZgSmOXdCt2rc9Jb6q+eCiDgrZzAukbP8veQyOvO16Nx1+XzLhOYc
BfjEOoA7nBJD+UPLWkwj42gKtoEWN7IOMTHgcK11d8jdpGISGupl/1nntGhYk0jO
+3RRZXMB/Gjwe9ge4K9bFC81pbfuAE7ELQtPsgV9LapMmWHKccY=
=FmUA
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2023-06-29' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"There is one set of patches to misc for a i915 gsc/mei proxy driver.
Otherwise it's mostly amdgpu/i915/msm, lots of hw enablement and lots
of refactoring.
core:
- replace strlcpy with strscpy
- EDID changes to support further conversion to struct drm_edid
- Move i915 DSC parameter code to common DRM helpers
- Add Colorspace functionality
aperture:
- ignore framebuffers with non-primary devices
fbdev:
- use fbdev i/o helpers
- add Kconfig options for fb_ops helpers
- use new fb io helpers directly in drivers
sysfs:
- export DRM connector ID
scheduler:
- Avoid an infinite loop
ttm:
- store function table in .rodata
- Add query for TTM mem limit
- Add NUMA awareness to pools
- Export ttm_pool_fini()
bridge:
- fsl-ldb: support i.MX6SX
- lt9211, lt9611: remove blanking packets
- tc358768: implement input bus formats, devm cleanups
- ti-snd65dsi86: implement wait_hpd_asserted
- analogix: fix endless probe loop
- samsung-dsim: support swapped clock, fix enabling, support var
clock
- display-connector: Add support for external power supply
- imx: Fix module linking
- tc358762: Support reset GPIO
panel:
- nt36523: Support Lenovo J606F
- st7703: Support Anbernic RG353V-V2
- InnoLux G070ACE-L01 support
- boe-tv101wum-nl6: Improve initialization
- sharp-ls043t1le001: Mode fixes
- simple: BOE EV121WXM-N10-1850, S6D7AA0
- Ampire AM-800480L1TMQW-T00H
- Rocktech RK043FN48H
- Starry himax83102-j02
- Starry ili9882t
amdgpu:
- add new ctx query flag to handle reset better
- add new query/set shadow buffer for rdna3
- DCN 3.2/3.1.x/3.0.x updates
- Enable DC_FP on loongarch
- PCIe fix for RDNA2
- improve DC FAMS/SubVP support for better power management
- partition support for lots of engines
- Take NUMA into account when allocating memory
- Add new DRM_AMDGPU_WERROR config parameter to help with CI
- Initial SMU13 overdrive support
- Add support for new colorspace KMS API
- W=1 fixes
amdkfd:
- Query TTM mem limit rather than hardcoding it
- GC 9.4.3 partition support
- Handle NUMA for partitions
- Add debugger interface for enabling gdb
- Add KFD event age tracking
radeon:
- Fix possible UAF
i915:
- new getparam for PXP support
- GSC/MEI proxy driver
- Meteorlake display enablement
- avoid clearing preallocated framebuffers with TTM
- implement framebuffer mmap support
- Disable sampler indirect state in bindless heap
- Enable fdinfo for GuC backends
- GuC loading and firmware table handling fixes
- Various refactors for multi-tile enablement
- Define MOCS and PAT tables for MTL
- GSC/MEI support for Meteorlake
- PMU multi-tile support
- Large driver kernel doc cleanup
- Allow VRR toggling and arbitrary refresh rates
- Support async flips on linear buffers on display ver 12+
- Expose CRTC CTM property on ILK/SNB/VLV
- New debugfs for display clock frequencies
- Hotplug refactoring
- Display refactoring
- I915_GEM_CREATE_EXT_SET_PAT for Mesa on Meteorlake
- Use large rings for compute contexts
- HuC loading for MTL
- Allow user to set cache at BO creation
- MTL powermanagement enhancements
- Switch to dedicated workqueues to stop using flush_scheduled_work()
- Move display runtime init under display/
- Remove 10bit gamma on desktop gen3 parts, they don't support it
habanalabs:
- uapi: return 0 for user queries if there was a h/w or f/w error
- Add pci health check when we lose connection with the firmware.
This can be used to distinguish between pci link down and firmware
getting stuck.
- Add more info to the error print when TPC interrupt occur.
- Firmware fixes
msm:
- Adreno A660 bindings
- SM8350 MDSS bindings fix
- Added support for DPU on sm6350 and sm6375 platforms
- Implemented tearcheck support to support vsync on SM150 and newer
platforms
- Enabled missing features (DSPP, DSC, split display) on sc8180x,
sc8280xp, sm8450
- Added support for DSI and 28nm DSI PHY on MSM8226 platform
- Added support for DSI on sm6350 and sm6375 platforms
- Added support for display controller on MSM8226 platform
- A690 GPU support
- Move cmdstream dumping out of fence signaling path
- a610 support
- Support for a6xx devices without GMU
nouveau:
- NULL ptr before deref fixes
armada:
- implement fbdev emulation as client
sun4i:
- fix mipi-dsi dotclock
- release clocks
vc4:
- rgb range toggle property
- BT601 / BT2020 HDMI support
vkms:
- convert to drmm helpers
- add reflection and rotation support
- fix rgb565 conversion
gma500:
- fix iomem access
shmobile:
- support renesas soc platform
- enable fbdev
mxsfb:
- Add support for i.MX93 LCDIF
stm:
- dsi: Use devm_ helper
- ltdc: Fix potential invalid pointer deref
renesas:
- Group drivers in renesas subdirectory to prepare for new platform
- Drop deprecated R-Car H3 ES1.x support
meson:
- Add support for MIPI DSI displays
virtio:
- add sync object support
mediatek:
- Add display binding document for MT6795"
* tag 'drm-next-2023-06-29' of git://anongit.freedesktop.org/drm/drm: (1791 commits)
drm/i915: Fix a NULL vs IS_ERR() bug
drm/i915: make i915_drm_client_fdinfo() reference conditional again
drm/i915/huc: Fix missing error code in intel_huc_init()
drm/i915/gsc: take a wakeref for the proxy-init-completion check
drm/msm/a6xx: Add A610 speedbin support
drm/msm/a6xx: Add A619_holi speedbin support
drm/msm/a6xx: Use adreno_is_aXYZ macros in speedbin matching
drm/msm/a6xx: Use "else if" in GPU speedbin rev matching
drm/msm/a6xx: Fix some A619 tunables
drm/msm/a6xx: Add A610 support
drm/msm/a6xx: Add support for A619_holi
drm/msm/adreno: Disable has_cached_coherent in GMU wrapper configurations
drm/msm/a6xx: Introduce GMU wrapper support
drm/msm/a6xx: Move CX GMU power counter enablement to hw_init
drm/msm/a6xx: Extend and explain UBWC config
drm/msm/a6xx: Remove both GBIF and RBBM GBIF halt on hw init
drm/msm/a6xx: Add a helper for software-resetting the GPU
drm/msm/a6xx: Improve a6xx_bus_clear_pending_transactions()
drm/msm/a6xx: Move a6xx_bus_clear_pending_transactions to a6xx_gpu
drm/msm/a6xx: Move force keepalive vote removal to a6xx_gmu_force_off()
...
For now, DIE_PAGE_FAULT, DIE_BREAK, DIE_SSTEPBP, DIE_UPROBE and
DIE_UPROBE_XOL are not used by any code, remove them.
Tested-by: Jeff Xie <xiehuan09@gmail.com>
Suggested-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
For now, we can use larch_insn_gen_break() to define KPROBE_BP_INSN and
KPROBE_SSTEPBP_INSN. Because larch_insn_gen_break() returns instruction
word, define kprobe_opcode_t as u32, then do some small changes related
with type conversion, no functional change intended.
Tested-by: Jeff Xie <xiehuan09@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
There exist various break insns such as BRK_KPROBE_BP, BRK_KPROBE_SSTEPBP,
BRK_UPROBE_BP and BRK_UPROBE_XOLBP, add larch_insn_gen_break() to generate
break insns simpler, this is preparation for later patch.
Tested-by: Jeff Xie <xiehuan09@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Like llsc instructions, the atomic memory access instructions shouldn't
be supported for probing, so check for them in insns_not_supported().
Closes: https://lore.kernel.org/all/SY4P282MB351877A70A0333C790FE85A5C09C9@SY4P282MB3518.AUSP282.PROD.OUTLOOK.COM/
Tested-by: Jeff Xie <xiehuan09@gmail.com>
Reported-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The three functions insns_not_supported(), insns_need_simulation() and
arch_simulate_insn() will be used for uprobes, move them from kprobes.c
to inst.c, this is preparation for later patch, no functionality change.
Tested-by: Jeff Xie <xiehuan09@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This is an adaptation of commit f3a112c0c4 ("x86,rethook,kprobes:
Replace kretprobe with rethook on x86") and commit b57c2f1240 ("riscv:
add riscv rethook implementation") to LoongArch. Mainly refer to commit
b57c2f1240 ("riscv: add riscv rethook implementation").
Replaces the kretprobe code with rethook on LoongArch. With this patch,
kretprobe on LoongArch uses the rethook instead of kretprobe specific
trampoline code.
Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add support for jump labels based on the ARM64 version.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
We can see that DEBUG_KMEMLEAK depends on HAVE_DEBUG_KMEMLEAK after
commit b69ec42b1b ("Kconfig: clean up the long arch list for the
DEBUG_KMEMLEAK config option"), just select HAVE_DEBUG_KMEMLEAK to
support kmemleak on LoongArch.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Some PMC (Power Management Controllers) need to support DTS and will use
the suspend interfaces thus this patch was to export such interfaces for
their use.
Signed-off-by: Yinbo Zhu <zhuyinbo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Loongson-3A6000 and newer processors have hardware page table walker
(PTW) support. PTW can handle all fastpaths of TLBI/TLBL/TLBS/TLBM
exceptions by hardware, software only need to handle slowpaths (page
faults).
BTW, PTW doesn't append _PAGE_MODIFIED for page table entries, so we
change pmd_dirty() and pte_dirty() to also check _PAGE_DIRTY for the
"dirty" attribute.
Signed-off-by: Liang Gao <gaoliang@loongson.cn>
Signed-off-by: Jun Yi <yijun@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Traditionally, LoongArch uses "dbar 0" (full completion barrier) for
everything. But the full completion barrier is a performance killer, so
Loongson-3A6000 and newer processors have made finer granularity hints
available:
Bit4: ordering or completion (0: completion, 1: ordering)
Bit3: barrier for previous read (0: true, 1: false)
Bit2: barrier for previous write (0: true, 1: false)
Bit1: barrier for succeeding read (0: true, 1: false)
Bit0: barrier for succeeding write (0: true, 1: false)
Hint 0x700: barrier for "read after read" from the same address, which
is needed by LL-SC loops on old models (dbar 0x700 behaves the same as
nop if such reordering is disabled on new models).
This patch makes use of the various new hints for different kinds of
memory barriers. It brings performance improvements on Loongson-3A6000
series, while not affecting the existing models because all variants are
treated as 'dbar 0' there.
Why override queued_spin_unlock()?
After commit 01e3b958ef ("drivers: Remove explicit invocations
of mmiowb()") we need a completion barrier in queued_spin_unlock(), but
the generic implementation use smp_store_release() which only provide an
ordering barrier.
Signed-off-by: Jun Yi <yijun@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Loongson-3A6000 has SMT (Simultaneous Multi-Threading) support, each
physical core has two logical cores (threads). This patch add SMT probe
and scheduler support via ACPI PPTT.
If SCHED_SMT enabled, Loongson-3A6000 is treated as 4 cores, 8 threads;
If SCHED_SMT disabled, Loongson-3A6000 is treated as 8 cores, 8 threads.
Remove smp_num_siblings to support HMP (Heterogeneous Multi-Processing).
Signed-off-by: Liupu Wang <wangliupu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
We can see that "Time namespaces are not supported" on LoongArch:
(1) clone3 test
# cd tools/testing/selftests/clone3 && make && ./clone3
...
# Time namespaces are not supported
ok 18 # SKIP Skipping clone3() with CLONE_NEWTIME
# Totals: pass:17 fail:0 xfail:0 xpass:0 skip:1 error:0
(2) timens test
# cd tools/testing/selftests/timens && make && ./timens
...
1..0 # SKIP Time namespaces are not supported
On LoongArch the current kernel does not support CONFIG_TIME_NS which
depends on GENERIC_VDSO_TIME_NS, select GENERIC_VDSO_TIME_NS to enable
CONFIG_TIME_NS to build kernel/time/namespace.c.
Additionally, it needs to define some arch-dependent functions for the
timens, such as __arch_get_timens_vdso_data(), arch_get_vdso_data() and
vdso_join_timens().
At the same time, modify the layout of vvar to use one page size for
generic vdso data, expand another page size for timens vdso data and
assign LOONGARCH_VDSO_DATA_SIZE (maybe exceeds a page size if expand in
the future) for loongarch vdso data, at last add the callback function
vvar_fault() and modify stack_top().
With this patch under CONFIG_TIME_NS:
(1) clone3 test
# cd tools/testing/selftests/clone3 && make && ./clone3
...
ok 18 [739] Result (0) matches expectation (0)
# Totals: pass:18 fail:0 xfail:0 xpass:0 skip:0 error:0
(2) timens test
# cd tools/testing/selftests/timens && make && ./timens
...
# Totals: pass:10 fail:0 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Confirmed working with QEMU system emulation.
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This is a port of commit 08f6554ff9 ("mips: Include KBUILD_CPPFLAGS in
CHECKFLAGS invocation") to arch/loongarch, for fixing cross-compilation
of Linux/LoongArch with Clang, where previously the `--target` flag
would no longer be present for the CHECKFLAGS cc invocation leading to
build failure.
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://github.com/ClangBuiltLinux/linux/issues/1787#issuecomment-1608306002
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This is a port of commit 76d7fff22b ("MIPS: VDSO: Use CLANG_FLAGS
instead of filtering out '--target='") to arch/loongarch, for fixing
cross-compilation with Clang.
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://github.com/ClangBuiltLinux/linux/issues/1787#issuecomment-1608306002
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Now the arch code is mostly ready for LLVM/Clang consumption, it is time
to re-organize the CFLAGS a little to actually enable the LLVM build.
Namely, all -G0 switches from CFLAGS are removed, and -mexplicit-relocs
and -mdirect-extern-access are now wrapped with cc-option (with the
related asm/percpu.h definition guarded against toolchain combos that
are known to not work).
A build with !RELOCATABLE && !MODULE is confirmed working within a QEMU
environment; support for the two features are currently blocked on
LLVM/Clang, and will come later.
Why -G0 can be removed:
In GCC, -G stands for "small data threshold", that instructs the
compiler to put data smaller than the specified threshold in a dedicated
"small data" section (called .sdata on LoongArch and several other
arches).
However, benefiting from this would require ABI cooperation, which is
not the case for LoongArch; and current GCC behave the same whether -G0
(equal to disabling this optimization) is given or not. So, remove -G0
from CFLAGS altogether for one less thing to care about. This also
benefits LLVM/Clang compatibility where the -G switch is not supported.
Why -mexplicit-relocs can now be conditionally applied without
regressions:
Originally -mexplicit-relocs is unconditionally added to CFLAGS in case
of CONFIG_AS_HAS_EXPLICIT_RELOCS, because not having it (i.e. old GCC +
new binutils) would not work: modules will have R_LARCH_ABS_* relocs
inside, but given the rarity of such toolchain combo in the wild, it may
not be worthwhile to support it, so support for such relocs in modules
were not added back when explicit relocs support was upstreamed, and
-mexplicit-relocs is unconditionally added to fail the build early.
Now that Clang compatibility is desired, given Clang is behaving like
-mexplicit-relocs from day one but without support for the CLI flag, we
must ensure the flag is not passed in case of Clang. However, explicit
compiler flavor checks can be more brittle than feature detection: in
this case what actually matters is support for __attribute__((model))
when building modules. Given neither older GCC nor current Clang support
this attribute, probing for the attribute support and #error'ing out
would allow proper UX without checking for Clang, and also automatically
work when Clang support for the attribute is to be added in the future.
Why -mdirect-extern-access is now conditionally applied:
This is actually a nice-to-have optimization that can reduce GOT
accesses, but not having it is harmless either. Because Clang does not
support the option currently, but might do so in the future, conditional
application via cc-option ensures compatibility with both current and
future Clang versions.
Suggested-by: Xi Ruoyao <xry111@xry111.site> # cc-option changes
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The invtlb instruction has been supported by upstream LoongArch
toolchains from day one, so ditch the raw opcode trickery and just use
plain inline asm for it.
While at it, also make the invtlb asm statements barriers, for proper
modeling of the side effects. The functions are also marked as
__always_inline instead of just "inline", because they cannot work at
all if not inlined: the op argument will not be compile-time const in
that case, thus failing to satisfy the "i" constraint.
The signature of the other more specific invtlb wrappers contain unused
arguments right now, but these are not removed right away in order for
the patch to be focused. In the meantime, assertions are added to ensure
no accidental misuse happens before the refactor. (The more specific
wrappers cannot re-use the generic invtlb wrapper, because the ISA
manual says $zero shall be used in case a particular op does not take
the respective argument: re-using the generic wrapper would mean losing
control over the register usage.)
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
In addition to less visual clutter, this also makes Clang happy
regarding the const-ness of arguments. In the original approach, all
Clang gets to see is the incoming arguments whose const-ness cannot be
proven without first being inlined; so Clang errors out here while GCC
is fine.
While at it, tweak several printk format strings because the return type
of csr_read64 becomes effectively unsigned long, instead of unsigned
long long.
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The GNU assembler (as of 2.40) mis-treats FCSR operands as GPRs, but
the LLVM IAS does not. Probe for this and refer to FCSRs as "$fcsrNN"
if support is present.
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
When the kernel is compiled with LLVM, the register names being handled
during exception fixup building are ABI names instead of bare $rNN
style. Add mapping for the ABI names for LLVM compatibility.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Taking the address delta between symbols in different sections is not
supported by the LLVM IAS. Instead, do this in the linker script, so
the same data can be properly referenced in assembly.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: WANG Xuerui <git@xen0n.name>
[chenhuacai: Fix build with !CONFIG_EFI_STUB]
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add guard for the larch_insn_gen_xxx functions to verify whether the
immediate operand is within the acceptable range.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Debugfs functions are not supposed to be checked for errors. This
is sort of unusual but it is described in the comments for the
debugfs_create_dir() function. Also debugfs_create_dir() can never
return NULL.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
ACPI systems set io masters by parsing ACPI MADT, FDT systems have no
MADT so we explicitly set CPU#0 as the io master. Otherwise CPU#0 will
be considered as hotpluggable.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This modifies our user mode stack expansion code to always take the
mmap_lock for writing before modifying the VM layout.
It's actually something we always technically should have done, but
because we didn't strictly need it, we were being lazy ("opportunistic"
sounds so much better, doesn't it?) about things, and had this hack in
place where we would extend the stack vma in-place without doing the
proper locking.
And it worked fine. We just needed to change vm_start (or, in the case
of grow-up stacks, vm_end) and together with some special ad-hoc locking
using the anon_vma lock and the mm->page_table_lock, it all was fairly
straightforward.
That is, it was all fine until Ruihan Li pointed out that now that the
vma layout uses the maple tree code, we *really* don't just change
vm_start and vm_end any more, and the locking really is broken. Oops.
It's not actually all _that_ horrible to fix this once and for all, and
do proper locking, but it's a bit painful. We have basically three
different cases of stack expansion, and they all work just a bit
differently:
- the common and obvious case is the page fault handling. It's actually
fairly simple and straightforward, except for the fact that we have
something like 24 different versions of it, and you end up in a maze
of twisty little passages, all alike.
- the simplest case is the execve() code that creates a new stack.
There are no real locking concerns because it's all in a private new
VM that hasn't been exposed to anybody, but lockdep still can end up
unhappy if you get it wrong.
- and finally, we have GUP and page pinning, which shouldn't really be
expanding the stack in the first place, but in addition to execve()
we also use it for ptrace(). And debuggers do want to possibly access
memory under the stack pointer and thus need to be able to expand the
stack as a special case.
None of these cases are exactly complicated, but the page fault case in
particular is just repeated slightly differently many many times. And
ia64 in particular has a fairly complicated situation where you can have
both a regular grow-down stack _and_ a special grow-up stack for the
register backing store.
So to make this slightly more manageable, the bulk of this series is to
first create a helper function for the most common page fault case, and
convert all the straightforward architectures to it.
Thus the new 'lock_mm_and_find_vma()' helper function, which ends up
being used by x86, arm, powerpc, mips, riscv, alpha, arc, csky, hexagon,
loongarch, nios2, sh, sparc32, and xtensa. So we not only convert more
than half the architectures, we now have more shared code and avoid some
of those twisty little passages.
And largely due to this common helper function, the full diffstat of
this series ends up deleting more lines than it adds.
That still leaves eight architectures (ia64, m68k, microblaze, openrisc,
parisc, s390, sparc64 and um) that end up doing 'expand_stack()'
manually because they are doing something slightly different from the
normal pattern. Along with the couple of special cases in execve() and
GUP.
So there's a couple of patches that first create 'locked' helper
versions of the stack expansion functions, so that there's a obvious
path forward in the conversion. The execve() case is then actually
pretty simple, and is a nice cleanup from our old "grow-up stackls are
special, because at execve time even they grow down".
The #ifdef CONFIG_STACK_GROWSUP in that code just goes away, because
it's just more straightforward to write out the stack expansion there
manually, instead od having get_user_pages_remote() do it for us in some
situations but not others and have to worry about locking rules for GUP.
And the final step is then to just convert the remaining odd cases to a
new world order where 'expand_stack()' is called with the mmap_lock held
for reading, but where it might drop it and upgrade it to a write, only
to return with it held for reading (in the success case) or with it
completely dropped (in the failure case).
In the process, we remove all the stack expansion from GUP (where
dropping the lock wouldn't be ok without special rules anyway), and add
it in manually to __access_remote_vm() for ptrace().
Thanks to Adrian Glaubitz and Frank Scheiner who tested the ia64 cases.
Everything else here felt pretty straightforward, but the ia64 rules for
stack expansion are really quite odd and very different from everything
else. Also thanks to Vegard Nossum who caught me getting one of those
odd conditions entirely the wrong way around.
Anyway, I think I want to actually move all the stack expansion code to
a whole new file of its own, rather than have it split up between
mm/mmap.c and mm/memory.c, but since this will have to be backported to
the initial maple tree vma introduction anyway, I tried to keep the
patches _fairly_ minimal.
Also, while I don't think it's valid to expand the stack from GUP, the
final patch in here is a "warn if some crazy GUP user wants to try to
expand the stack" patch. That one will be reverted before the final
release, but it's left to catch any odd cases during the merge window
and release candidates.
Reported-by: Ruihan Li <lrh2000@pku.edu.cn>
* branch 'expand-stack':
gup: add warning if some caller would seem to want stack expansion
mm: always expand the stack with the mmap write lock held
execve: expand new process stack manually ahead of time
mm: make find_extend_vma() fail if write lock not held
powerpc/mm: convert coprocessor fault to lock_mm_and_find_vma()
mm/fault: convert remaining simple cases to lock_mm_and_find_vma()
arm/mm: Convert to using lock_mm_and_find_vma()
riscv/mm: Convert to using lock_mm_and_find_vma()
mips/mm: Convert to using lock_mm_and_find_vma()
powerpc/mm: Convert to using lock_mm_and_find_vma()
arm64/mm: Convert to using lock_mm_and_find_vma()
mm: make the page fault mmap locking killable
mm: introduce new 'lock_mm_and_find_vma()' page fault helper
- Introduce cmpxchg128() -- aka. the demise of cmpxchg_double().
The cmpxchg128() family of functions is basically & functionally
the same as cmpxchg_double(), but with a saner interface: instead
of a 6-parameter horror that forced u128 - u64/u64-halves layout
details on the interface and exposed users to complexity,
fragility & bugs, use a natural 3-parameter interface with u128 types.
- Restructure the generated atomic headers, and add
kerneldoc comments for all of the generic atomic{,64,_long}_t
operations. Generated definitions are much cleaner now,
and come with documentation.
- Implement lock_set_cmp_fn() on lockdep, for defining an ordering
when taking multiple locks of the same type. This gets rid of
one use of lockdep_set_novalidate_class() in the bcache code.
- Fix raw_cpu_generic_try_cmpxchg() bug due to an unintended
variable shadowing generating garbage code on Clang on certain
ARM builds.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmSav3wRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1gDyxAAjCHQjpolrre7fRpyiTDwqzIKT27H04vQ
zrQVlVc42WBnn9pe8LthGy43/RvYvqlZvLoLONA4fMkuYriM6nSMsoZjeUmE+6Rs
QAElQC74P5YvEBOa67VNY3/M7sj22ftDe7ODtVV8OrnPjMk1sQNRvaK025Cs3yig
8MAI//hHGNmyVAp1dPYZMJNqxGCvluReLZ4SaUJFCMrg7YgUXgCBj/5Gi07TlKxn
sT8BFCssoEW/B9FXkh59B1t6FBCZoSy4XSZfsZe0uVAUJ4XDEOO+zBgaWFCedNQT
wP323ryBgMrkzUKA8j2/o5d3QnMA1GcBfHNNlvAl/fOfrxWXzDZnOEY26YcaLMa0
YIuRF/JNbPZlt6DCUVBUEvMPpfNYi18dFN0rat1a6xL2L4w+tm55y3mFtSsg76Ka
r7L2nWlRrAGXnuA+VEPqkqbSWRUSWOv5hT2Mcyb5BqqZRsxBETn6G8GVAzIO6j6v
giyfUdA8Z9wmMZ7NtB6usxe3p1lXtnZ/shCE7ZHXm6xstyZrSXaHgOSgAnB9DcuJ
7KpGIhhSODQSwC/h/J0KEpb9Pr/5jCWmXAQ2DWnZK6ndt1jUfFi8pfK58wm0AuAM
o9t8Mx3o8wZjbMdt6up9OIM1HyFiMx2BSaZK+8f/bWemHQ0xwez5g4k5O5AwVOaC
x9Nt+Tp0Ze4=
=DsYj
-----END PGP SIGNATURE-----
Merge tag 'locking-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
- Introduce cmpxchg128() -- aka. the demise of cmpxchg_double()
The cmpxchg128() family of functions is basically & functionally the
same as cmpxchg_double(), but with a saner interface.
Instead of a 6-parameter horror that forced u128 - u64/u64-halves
layout details on the interface and exposed users to complexity,
fragility & bugs, use a natural 3-parameter interface with u128
types.
- Restructure the generated atomic headers, and add kerneldoc comments
for all of the generic atomic{,64,_long}_t operations.
The generated definitions are much cleaner now, and come with
documentation.
- Implement lock_set_cmp_fn() on lockdep, for defining an ordering when
taking multiple locks of the same type.
This gets rid of one use of lockdep_set_novalidate_class() in the
bcache code.
- Fix raw_cpu_generic_try_cmpxchg() bug due to an unintended variable
shadowing generating garbage code on Clang on certain ARM builds.
* tag 'locking-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
locking/atomic: scripts: fix ${atomic}_dec_if_positive() kerneldoc
percpu: Fix self-assignment of __old in raw_cpu_generic_try_cmpxchg()
locking/atomic: treewide: delete arch_atomic_*() kerneldoc
locking/atomic: docs: Add atomic operations to the driver basic API documentation
locking/atomic: scripts: generate kerneldoc comments
docs: scripts: kernel-doc: accept bitwise negation like ~@var
locking/atomic: scripts: simplify raw_atomic*() definitions
locking/atomic: scripts: simplify raw_atomic_long*() definitions
locking/atomic: scripts: split pfx/name/sfx/order
locking/atomic: scripts: restructure fallback ifdeffery
locking/atomic: scripts: build raw_atomic_long*() directly
locking/atomic: treewide: use raw_atomic*_<op>()
locking/atomic: scripts: add trivial raw_atomic*_<op>()
locking/atomic: scripts: factor out order template generation
locking/atomic: scripts: remove leftover "${mult}"
locking/atomic: scripts: remove bogus order parameter
locking/atomic: xtensa: add preprocessor symbols
locking/atomic: x86: add preprocessor symbols
locking/atomic: sparc: add preprocessor symbols
locking/atomic: sh: add preprocessor symbols
...
- Scheduler SMP load-balancer improvements:
- Avoid unnecessary migrations within SMT domains on hybrid systems.
Problem:
On hybrid CPU systems, (processors with a mixture of higher-frequency
SMT cores and lower-frequency non-SMT cores), under the old code
lower-priority CPUs pulled tasks from the higher-priority cores if
more than one SMT sibling was busy - resulting in many unnecessary
task migrations.
Solution:
The new code improves the load balancer to recognize SMT cores with more
than one busy sibling and allows lower-priority CPUs to pull tasks, which
avoids superfluous migrations and lets lower-priority cores inspect all SMT
siblings for the busiest queue.
- Implement the 'runnable boosting' feature in the EAS balancer: consider CPU
contention in frequency, EAS max util & load-balance busiest CPU selection.
This improves CPU utilization for certain workloads, while leaves other key
workloads unchanged.
- Scheduler infrastructure improvements:
- Rewrite the scheduler topology setup code by consolidating it
into the build_sched_topology() helper function and building
it dynamically on the fly.
- Resolve the local_clock() vs. noinstr complications by rewriting
the code: provide separate sched_clock_noinstr() and
local_clock_noinstr() functions to be used in instrumentation code,
and make sure it is all instrumentation-safe.
- Fixes:
- Fix a kthread_park() race with wait_woken()
- Fix misc wait_task_inactive() bugs unearthed by the -rt merge:
- Fix UP PREEMPT bug by unifying the SMP and UP implementations.
- Fix task_struct::saved_state handling.
- Fix various rq clock update bugs, unearthed by turning on the rq clock
debugging code.
- Fix the PSI WINDOW_MIN_US trigger limit, which was easy to trigger by
creating enough cgroups, by removing the warnign and restricting
window size triggers to PSI file write-permission or CAP_SYS_RESOURCE.
- Propagate SMT flags in the topology when removing degenerate domain
- Fix grub_reclaim() calculation bug in the deadline scheduler code
- Avoid resetting the min update period when it is unnecessary, in
psi_trigger_destroy().
- Don't balance a task to its current running CPU in load_balance(),
which was possible on certain NUMA topologies with overlapping
groups.
- Fix the sched-debug printing of rq->nr_uninterruptible
- Cleanups:
- Address various -Wmissing-prototype warnings, as a preparation
to (maybe) enable this warning in the future.
- Remove unused code
- Mark more functions __init
- Fix shadow-variable warnings
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmSatWQRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1j62xAAuGOx1LcDfRGC6WGQzp1zOdlsVQtnDvlS
qL58zYSHgizprpVQ3j87SBaG4CHCdvd2Bo36yW0lNZS4nd203qdq7fkrMb3hPP/w
egUQUzMegf5fF6BWldKeMjuHSt+twFQz/ZAKK8iSbAir6CHNAqbNst1oL0i/+Tyk
o33hBs1hT5tnbFb1NSVZkX4k+qT3LzTW4K2QgjjGtkScr6yHh2BdEVefyigWOjdo
9s02d00ll9a2r+F5txlN7Dnw6TN7rmTXGMOJU5bZvBE90/anNiAorMXHJdEKCyUR
u9+JtBdJWiCplGa/tSRcxT16ZW1VdtTnd9q66TDhXREd2UNDFqBEyg5Wl77K4Tlf
vKFajmj/to+cTbuv6m6TVR+zyXpdEpdL6F04P44U3qiJvDobBqeDNKHHIqpmbHXl
AXUXcPWTVAzXX1Ce5M+BeAgTBQ1T7C5tELILrTNQHJvO1s9VVBRFZ/l65Ps4vu7T
wIZ781IFuopk0zWqHovNvgKrJ7oFmOQQZFttQEe8n6nafkjI7u+IZ8FayiGaUMRr
4GawFGUCEdYh8z9qyslGKe8Q/Rphfk6hxMFRYUJpDmubQ0PkMeDjDGq77jDGl1PF
VqwSDEyOaBJs7Gqf/mem00JtzBmXhkhm1SEjggHMI2IQbr/eeBXoLQOn3CDapO/N
PiDbtX760ic=
=EWQA
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Scheduler SMP load-balancer improvements:
- Avoid unnecessary migrations within SMT domains on hybrid systems.
Problem:
On hybrid CPU systems, (processors with a mixture of
higher-frequency SMT cores and lower-frequency non-SMT cores),
under the old code lower-priority CPUs pulled tasks from the
higher-priority cores if more than one SMT sibling was busy -
resulting in many unnecessary task migrations.
Solution:
The new code improves the load balancer to recognize SMT cores
with more than one busy sibling and allows lower-priority CPUs
to pull tasks, which avoids superfluous migrations and lets
lower-priority cores inspect all SMT siblings for the busiest
queue.
- Implement the 'runnable boosting' feature in the EAS balancer:
consider CPU contention in frequency, EAS max util & load-balance
busiest CPU selection.
This improves CPU utilization for certain workloads, while leaves
other key workloads unchanged.
Scheduler infrastructure improvements:
- Rewrite the scheduler topology setup code by consolidating it into
the build_sched_topology() helper function and building it
dynamically on the fly.
- Resolve the local_clock() vs. noinstr complications by rewriting
the code: provide separate sched_clock_noinstr() and
local_clock_noinstr() functions to be used in instrumentation code,
and make sure it is all instrumentation-safe.
Fixes:
- Fix a kthread_park() race with wait_woken()
- Fix misc wait_task_inactive() bugs unearthed by the -rt merge:
- Fix UP PREEMPT bug by unifying the SMP and UP implementations
- Fix task_struct::saved_state handling
- Fix various rq clock update bugs, unearthed by turning on the rq
clock debugging code.
- Fix the PSI WINDOW_MIN_US trigger limit, which was easy to trigger
by creating enough cgroups, by removing the warnign and restricting
window size triggers to PSI file write-permission or
CAP_SYS_RESOURCE.
- Propagate SMT flags in the topology when removing degenerate domain
- Fix grub_reclaim() calculation bug in the deadline scheduler code
- Avoid resetting the min update period when it is unnecessary, in
psi_trigger_destroy().
- Don't balance a task to its current running CPU in load_balance(),
which was possible on certain NUMA topologies with overlapping
groups.
- Fix the sched-debug printing of rq->nr_uninterruptible
Cleanups:
- Address various -Wmissing-prototype warnings, as a preparation to
(maybe) enable this warning in the future.
- Remove unused code
- Mark more functions __init
- Fix shadow-variable warnings"
* tag 'sched-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle()
sched/core: Avoid double calling update_rq_clock() in __balance_push_cpu_stop()
sched/core: Fixed missing rq clock update before calling set_rq_offline()
sched/deadline: Update GRUB description in the documentation
sched/deadline: Fix bandwidth reclaim equation in GRUB
sched/wait: Fix a kthread_park race with wait_woken()
sched/topology: Mark set_sched_topology() __init
sched/fair: Rename variable cpu_util eff_util
arm64/arch_timer: Fix MMIO byteswap
sched/fair, cpufreq: Introduce 'runnable boosting'
sched/fair: Refactor CPU utilization functions
cpuidle: Use local_clock_noinstr()
sched/clock: Provide local_clock_noinstr()
x86/tsc: Provide sched_clock_noinstr()
clocksource: hyper-v: Provide noinstr sched_clock()
clocksource: hyper-v: Adjust hv_read_tsc_page_tsc() to avoid special casing U64_MAX
x86/vdso: Fix gettimeofday masking
math64: Always inline u128 version of mul_u64_u64_shr()
s390/time: Provide sched_clock_noinstr()
loongarch: Provide noinstr sched_clock_read()
...
- Initialize FPU late.
Right now FPU is initialized very early during boot. There is no real
requirement to do so. The only requirement is to have it done before
alternatives are patched.
That's done in check_bugs() which does way more than what the function
name suggests.
So first rename check_bugs() to arch_cpu_finalize_init() which makes it
clear what this is about.
Move the invocation of arch_cpu_finalize_init() earlier in
start_kernel() as it has to be done before fork_init() which needs to
know the FPU register buffer size.
With those prerequisites the FPU initialization can be moved into
arch_cpu_finalize_init(), which removes it from the early and fragile
part of the x86 bringup.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmSZdNYTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoaNBEACWtVd1uhqQldIFgSvZYujsrWXlmkU+
pok6gDzKQNwZADiXW/tn5fP8SBLWT0pgLM9d+oZ5mEaLaOW7HcZLEHcVrn74e3TT
53xN8e1zCzyjCJ/x22vrKH4sn/bU+bQyzSNVu9Disqn9Fl+ts37FqAHDv/ExbneD
DaYXXCLgQsyGbPLD8B7yGOpJTGBUTJxNQS1ZFElBaRsAaw0mYZOEoPvuTFK4o7Uz
GUB2vGefmeNfX+EgLYKG9QoS0F3SMS9X2IYswy1H76ZnV/eXmTsA1S3u3X9yX7kC
XBnPtCC+iX+7o3xFkTpa0oQUdzEyGOItExZZgce6jEQu4Fl7NoIJxhlMg9/Y+vcF
ntipEKSWFLAi1GkZzeKRwSSsoWqRaFxOKLy8qhn9kud09k+UtMBkNrF1CSp9laAz
QParu3B1oHPEzx/jS0bSOCMN+AQZH8rX7LxRp4kpBOeBSZNCnfaBUzfIvmccPls+
EJTO/0JUpRm5LsPSDiJhypPRoOOIP26IloR6OoZTcI3p76NrnYblRvisvuFAgDU6
bk7Belf+GDx0kBZugqQgok7nDaHIBR7vEmca1NV8507UrffVyxLAiI4CiWPcFdOq
ovhO8K+gP4xvzZx4cXZBwYwusjvl/oxKy8yQiGgoftDiWU4sdUCSrwX3x27+hUYL
2P1OLDOXSGwESQ==
=yxMj
-----END PGP SIGNATURE-----
Merge tag 'x86-boot-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Thomas Gleixner:
"Initialize FPU late.
Right now FPU is initialized very early during boot. There is no real
requirement to do so. The only requirement is to have it done before
alternatives are patched.
That's done in check_bugs() which does way more than what the function
name suggests.
So first rename check_bugs() to arch_cpu_finalize_init() which makes
it clear what this is about.
Move the invocation of arch_cpu_finalize_init() earlier in
start_kernel() as it has to be done before fork_init() which needs to
know the FPU register buffer size.
With those prerequisites the FPU initialization can be moved into
arch_cpu_finalize_init(), which removes it from the early and fragile
part of the x86 bringup"
* tag 'x86-boot-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mem_encrypt: Unbreak the AMD_MEM_ENCRYPT=n build
x86/fpu: Move FPU initialization into arch_cpu_finalize_init()
x86/fpu: Mark init functions __init
x86/fpu: Remove cpuinfo argument from init functions
x86/init: Initialize signal frame size late
init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init()
init: Invoke arch_cpu_finalize_init() earlier
init: Remove check_bugs() leftovers
um/cpu: Switch to arch_cpu_finalize_init()
sparc/cpu: Switch to arch_cpu_finalize_init()
sh/cpu: Switch to arch_cpu_finalize_init()
mips/cpu: Switch to arch_cpu_finalize_init()
m68k/cpu: Switch to arch_cpu_finalize_init()
loongarch/cpu: Switch to arch_cpu_finalize_init()
ia64/cpu: Switch to arch_cpu_finalize_init()
ARM: cpu: Switch to arch_cpu_finalize_init()
x86/cpu: Switch to arch_cpu_finalize_init()
init: Provide arch_cpu_finalize_init()
This does the simple pattern conversion of alpha, arc, csky, hexagon,
loongarch, nios2, sh, sparc32, and xtensa to the lock_mm_and_find_vma()
helper. They all have the regular fault handling pattern without odd
special cases.
The remaining architectures all have something that keeps us from a
straightforward conversion: ia64 and parisc have stacks that can grow
both up as well as down (and ia64 has special address region checks).
And m68k, microblaze, openrisc, sparc64, and um end up having extra
rules about only expanding the stack down a limited amount below the
user space stack pointer. That is something that x86 used to do too
(long long ago), and it probably could just be skipped, but it still
makes the conversion less than trivial.
Note that this conversion was done manually and with the exception of
alpha without any build testing, because I have a fairly limited cross-
building environment. The cases are all simple, and I went through the
changes several times, but...
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now we specify the minimal version of GCC as 5.1 and Clang/LLVM as 11.0.0
in Documentation/process/changes.rst, __CHAR_BIT__ and __SIZEOF_LONG__ are
usable, it is probably fine to unify the definition of __BITS_PER_LONG as
(__CHAR_BIT__ * __SIZEOF_LONG__) in asm-generic uapi bitsperlong.h.
In order to keep safe and avoid regression, only unify uapi bitsperlong.h
for some archs such as arm64, riscv and loongarch which are using newer
toolchains that have the definitions of __CHAR_BIT__ and __SIZEOF_LONG__.
Suggested-by: Xi Ruoyao <xry111@xry111.site>
Link: https://lore.kernel.org/all/d3e255e4746de44c9903c4433616d44ffcf18d1b.camel@xry111.site/
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/linux-arch/a3a4f48a-07d4-4ed9-bc53-5d383428bdd2@app.fastmail.com/
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The previous patch ("function_graph: Support recording and printing
the return value of function") has laid the groundwork for the for
the funcgraph-retval, and this modification makes it available on
the LoongArch platform.
We introduce a new structure called fgraph_ret_regs for the LoongArch
platform to hold return registers and the frame pointer. We then fill
its content in the return_to_handler and pass its address to the
function ftrace_return_to_handler to record the return value.
Link: https://lkml.kernel.org/r/c5462255e435fab363895c2d7433bc0f5a140411.1680954589.git.pengdonglin@sangfor.com.cn
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Donglin Peng <pengdonglin@sangfor.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmSPcdMeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGWrQH/3KmuvZsWMC4PpJY
VcF9VfF9i+Zv7DoG8sjD5VpNh47e87RsR6WNOFnKol5SUrM6vsBAb5i2rfQahNIv
NSj0fPCE4/Nj9LMecKVC9WD8CitxYdbR+CF9Is21AQj1VihUl9eHXGcAWxuaMyhk
TjPUwmbOOsRVMXXdGJzjX78cvLsxqpSv8A/5OTh16IBimbh7p+YjKJFkbfj/PMWf
aF1quFkIEXgzJcHCpP6KDZHr2KbpY+jIN9hUENnGKJxHYNso5u+KrIW1kAm8meP1
x26ETSquM0T70OAzovOWg+BeVkLDac/3Rh30ztLAI4AtajrlSzycvFsU9UNEJCc2
BnM2IZI=
=ANT5
-----END PGP SIGNATURE-----
Backmerge tag 'v6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next
Linux 6.4-rc7
Need this to pull in the msm work.
Signed-off-by: Dave Airlie <airlied@redhat.com>
The debugfs_create_dir() returns ERR_PTR in case of an error and the
correct way of checking it is using the IS_ERR_OR_NULL inline function
rather than the simple null comparision. This patch fixes the issue.
Cc: stable@vger.kernel.org
Suggested-By: Ivan Orlov <ivan.orlov0322@gmail.com>
Signed-off-by: Immad Mir <mirimmad17@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The hardware monitoring points for instruction fetching and load/store
operations need to align 4 bytes and 1/2/4/8 bytes respectively.
Reported-by: Colin King <colin.i.king@gmail.com>
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
LoongArch PMCFG has 10bit event id rather than 8 bit, so fix it.
Cc: stable@vger.kernel.org
Signed-off-by: Jun Yi <yijun@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The "write_fcsr()" macro uses wrong the positions for val and dest in
asm. Fix it!
Reported-by: Miao HAO <haomiao19@mails.ucas.ac.cn>
Signed-off-by: Qi Hu <huqi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
When we split a pmd into ptes, pmd_present() and pmd_trans_huge() should
return true, otherwise it would be treated as a swap pmd.
This is the same as arm64 does in commit b65399f611 ("arm64/mm: Change
THP helpers to comply with generic MM semantics"), we also add a new bit
named _PAGE_PRESENT_INVALID for LoongArch.
Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
With the intent to provide local_clock_noinstr(), a variant of
local_clock() that's safe to be called from noinstr code (with the
assumption that any such code will already be non-preemptible),
prepare for things by providing a noinstr sched_clock_read() function.
Specifically, preempt_enable_*() calls out to schedule(), which upsets
noinstr validation efforts.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com> # Hyper-V
Link: https://lore.kernel.org/r/20230519102715.502547082@infradead.org
Currently several architectures have kerneldoc comments for
arch_atomic_*(), which is unhelpful as these live in a shared namespace
where they clash, and the arch_atomic_*() ops are now an implementation
detail of the raw_atomic_*() ops, which no-one should use those
directly.
Delete the kerneldoc comments for arch_atomic_*(), along with
pseudo-kerneldoc comments which are in the correct style but are missing
the leading '/**' necessary to be true kerneldoc comments.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230605070124.3741859-28-mark.rutland@arm.com
Most architectures define the atomic/atomic64 xchg and cmpxchg
operations in terms of arch_xchg and arch_cmpxchg respectfully.
Add fallbacks for these cases and remove the trivial cases from arch
code. On some architectures the existing definitions are kept as these
are used to build other arch_atomic*() operations.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230605070124.3741859-5-mark.rutland@arm.com
Update the names of the fb_mem*() helpers to be consistent with their
regular counterparts. Hence, fb_memset() now becomes fb_memset_io(),
fb_memcpy_fromfb() now becomes fb_memcpy_fromio() and fb_memcpy_tofb()
becomes fb_memcpy_toio(). No functional changes.
v6:
* update new file fb_io_fops.c
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Sui Jingfeng <suijingfeng@loongson.cn>
Link: https://patchwork.freedesktop.org/patch/msgid/20230512102444.5438-8-tzimmermann@suse.de
Implement framebuffer I/O helpers, such as fb_read*() and fb_write*(),
in the architecture's <asm/fb.h> header file or the generic one.
The common case has been the use of regular I/O functions, such as
__raw_readb() or memset_io(). A few architectures used plain system-
memory reads and writes. Sparc used helpers for its SBus.
The architectures that used special cases provide the same code in
their __raw_*() I/O helpers. So the patch replaces this code with the
__raw_*() functions and moves it to <asm-generic/fb.h> for all
architectures.
v8:
* remove garbage after commit-message tags
v6:
* fix fb_readq()/fb_writeq() on 64-bit mips (kernel test robot)
v5:
* include <linux/io.h> in <asm-generic/fb>; fix s390 build
v4:
* ia64, loongarch, sparc64: add fb_mem*() to arch headers
to keep current semantics (Arnd)
v3:
* implement all architectures with generic helpers
* support reordering and native byte order (Geert, Arnd)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Tested-by: Sui Jingfeng <suijingfeng@loongson.cn>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230512102444.5438-7-tzimmermann@suse.de
- Introduce local{,64}_try_cmpxchg() - a slightly more optimal
primitive, which will be used in perf events ring-buffer code.
- Simplify/modify rwsems on PREEMPT_RT, to address writer starvation.
- Misc cleanups/fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmRUvUoRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1hlIhAArP33rTKi+HAndQ3UHW3XtmHRxEEQTfiE
wvIoN89h58QW4DGMeAV4ltafbIPQAkI233Aogwz903L0qbDV0Ro4OU3XJembRuWl
LeOADKwYyypXdOa8XICuY9aIP7e1/h0DF3ySs7inLcwK9JCyAIxnsVHYej+hsRXA
kZoXN98T3TR1C0V9UQy4SU3HI1lC3tsG3R9Ti9TnYUg3ygVXhRE9lOQ4kv9lFPVz
BNuj2Blj7KNiVaY9kehrhO54THI7NmsCVZO44Rcl48I0KAcFulAmFcNlE7GnR8Nj
thj38pU6XAFVHXG8MYjgE+Al+PnK48NtJxexCtHyGvGG4D2aLzRMnkolxAUCcVuK
G+UBsQm3ybjYgHgt1zuN6ehcpT+5tULkDH8JA7vrgZYaVgxHzsUaHgYfCCWKnmUY
mPR6aImEmYZwZVNLskhe0HT4mq244bp+VnWlnJ6LZK7t/itenvDhqnj7KTi4Bfej
lTHplOTitV/8uCEW8V4pX+YTEenVsIQmTc/G3iIabXP/6HzLffA3q4vyW6vKIErE
pqrpuFA0Z4GB+pU0mJXt7+I7zscDVthwI055jDyQBjA7IcdVGm2MjQ6xcNRW5FYN
UynvaEMocue4ZO4WdFsd1ZBUd9VfoNzGQspBw46DhCL1MEQBYv36SKQNjej/9aRr
ilVwqnOWI2s=
=mM0A
-----END PGP SIGNATURE-----
Merge tag 'locking-core-2023-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
- Introduce local{,64}_try_cmpxchg() - a slightly more optimal
primitive, which will be used in perf events ring-buffer code
- Simplify/modify rwsems on PREEMPT_RT, to address writer starvation
- Misc cleanups/fixes
* tag 'locking-core-2023-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/atomic: Correct (cmp)xchg() instrumentation
locking/x86: Define arch_try_cmpxchg_local()
locking/arch: Wire up local_try_cmpxchg()
locking/generic: Wire up local{,64}_try_cmpxchg()
locking/atomic: Add generic try_cmpxchg{,64}_local() support
locking/rwbase: Mitigate indefinite writer starvation
locking/arch: Rename all internal __xchg() names to __arch_xchg()
1, Better backtraces for humanization;
2, Relay BCE exceptions to userland as SIGSEGV;
3, Provide kernel fpu functions;
4, Optimize memory ops (memset/memcpy/memmove);
5, Optimize checksum and crc32(c) calculation;
6, Add ARCH_HAS_FORTIFY_SOURCE selection;
7, Add function error injection support;
8, Add ftrace with direct call support;
9, Add basic perf tools support.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmRQlUsWHGNoZW5odWFj
YWlAa2VybmVsLm9yZwAKCRAChivD8uImekCTD/9fc2U+FIXhJOWV5yK9TCjJTRnK
ASvk0JMYIDA60+fnof3C85tDu9Py9M5Mvt/Ec5pBaHErn16irq85AdD74/OmyCc2
V4pRFHbYLu0WBFQN77gfNXH0XErgYXdceZvaMXajVz2H6NlSKSWZOVN/9ut5SLi3
mt0rCwCsyahj92n8+hOjjZeFbDaPfPMCQ/8n9dnadhbBm9iz35fOKY+qIBHJMJ9a
wPfZ2k3wu5DHs/2+ZjFNhlwrlURTp3RlcVQ7QWDcR1LM3Z4/lEkD8tAI/r8sR9gw
rxzoBSaQzo/zscUmYo0jh1BoW2w0n+x/GfH70Pyz3iwZky3jwpdP0nRwnB4h+tnE
wKlpa5K7RfaqUxZExFfGALmlkALtjQgiXPYbORHMsD6l6XwrOMCeyQismm1oo66m
JBlsdXCms5aracYmWhXnVmTlBqGjAgYAxm62ap62uwlmULy4qUv6kFeW0fERn9NJ
5bKgbrkcal/WkMBawQqtG03niRkykqpqFooZ95ubj4Lib4VM0BmEvFrREjgXO7AE
jpLimYsT9ROE3YQJqyWyLYkmc2ShwWj70INTpz2viMtQ2blIRKvRVsxs976bHuwS
mGsZtiiANjhT2bAUhN7bct2Cf13MtPXiuf0etcJbrNSAtoBIFk+3uRRKHH2rM+CK
oKYjO+exPyuQ9nSOBg==
=3aTV
-----END PGP SIGNATURE-----
Merge tag 'loongarch-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Better backtraces for humanization
- Relay BCE exceptions to userland as SIGSEGV
- Provide kernel fpu functions
- Optimize memory ops (memset/memcpy/memmove)
- Optimize checksum and crc32(c) calculation
- Add ARCH_HAS_FORTIFY_SOURCE selection
- Add function error injection support
- Add ftrace with direct call support
- Add basic perf tools support
* tag 'loongarch-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (24 commits)
tools/perf: Add basic support for LoongArch
LoongArch: ftrace: Add direct call trampoline samples support
LoongArch: ftrace: Add direct call support
LoongArch: ftrace: Implement ftrace_find_callable_addr() to simplify code
LoongArch: ftrace: Fix build error if DYNAMIC_FTRACE_WITH_REGS is not set
LoongArch: ftrace: Abstract DYNAMIC_FTRACE_WITH_ARGS accesses
LoongArch: Add support for function error injection
LoongArch: Add ARCH_HAS_FORTIFY_SOURCE selection
LoongArch: crypto: Add crc32 and crc32c hw acceleration
LoongArch: Add checksum optimization for 64-bit system
LoongArch: Optimize memory ops (memset/memcpy/memmove)
LoongArch: Provide kernel fpu functions
LoongArch: Relay BCE exceptions to userland as SIGSEGV with si_code=SEGV_BNDERR
LoongArch: Tweak the BADV and CPUCFG.PRID lines in show_regs()
LoongArch: Humanize the ESTAT line when showing registers
LoongArch: Humanize the ECFG line when showing registers
LoongArch: Humanize the EUEN line when showing registers
LoongArch: Humanize the PRMD line when showing registers
LoongArch: Humanize the CRMD line when showing registers
LoongArch: Fix format of CSR lines during show_regs()
...
The ftrace samples need per-architecture trampoline implementations to
save and restore argument registers around the calls to my_direct_func*
and to restore polluted registers (e.g: ra).
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Select the HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS to provide the
register_ftrace_direct[_multi] interfaces allowing users to register
the customed trampoline (direct_caller) as the mcount for one or more
target functions. And modify_ftrace_direct[_multi] are also provided
for modifying direct_caller.
There are a few cases to distinguish:
- If a direct call ops is the only one tracing a function AND the direct
called trampoline is within the reach of a 'bl' instruction
-> the ftrace patchsite jumps to the trampoline
- Else
-> the ftrace patchsite jumps to the ftrace_regs_caller trampoline points
to ftrace_list_ops so it iterates over all registered ftrace ops,
including the direct call ops and calls its call_direct_funcs handler
which stores the direct called trampoline's address in the ftrace_regs
and the ftrace_regs_caller trampoline will return to that address
instead of returning to the traced function
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
In the module processing functions, the same logic can be reused by
implementing ftrace_find_callable_addr().
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
We can see the following build error if CONFIG_DYNAMIC_FTRACE_WITH_REGS
is not set on LoongArch:
arch/loongarch/kernel/ftrace_dyn.c: In function ‘ftrace_make_call’:
arch/loongarch/kernel/ftrace_dyn.c:167:23: error: implicit declaration of function ‘__get_mod’
167 | ret = __get_mod(&mod, pc);
| ^~~~~~~~~
arch/loongarch/kernel/ftrace_dyn.c:171:24: error: implicit declaration of function ‘get_plt_addr’
171 | addr = get_plt_addr(mod, addr);
| ^~~~~~~~~~~~
The reason is that the __get_mod() and get_plt_addr() may be called in
ftrace_make_{call,nop}.
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add new ftrace_regs_{get,set}_*() helpers which can be used to manipulate
ftrace_regs. When CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y, these can always
be used on any ftrace_regs, and when CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
=n these can be used when regs are available.
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Inspired by the commit 42d038c4fb ("arm64: Add support for function
error injection") and the commit ee55ff803b ("riscv: Add support for
function error injection"), this patch supports function error injection
for LoongArch.
Mainly implement two functions:
(1) regs_set_return_value() which is used to overwrite the return value,
(2) override_function_with_return() which is used to override the probed
function returning and jump to its caller.
Here is a simple test under CONFIG_FUNCTION_ERROR_INJECTION and
CONFIG_FAIL_FUNCTION:
# echo sys_clone > /sys/kernel/debug/fail_function/inject
# echo 100 > /sys/kernel/debug/fail_function/probability
# dmesg
bash: fork: Invalid argument
# dmesg
...
FAULT_INJECTION: forcing a failure.
name fail_function, interval 1, probability 100, space 0, times 1
...
Call Trace:
[<90000000002238f4>] show_stack+0x5c/0x180
[<90000000012e384c>] dump_stack_lvl+0x60/0x88
[<9000000000b1879c>] should_fail_ex+0x1b0/0x1f4
[<900000000032ead4>] fei_kprobe_handler+0x28/0x6c
[<9000000000230970>] kprobe_breakpoint_handler+0xf0/0x118
[<90000000012e3e60>] do_bp+0x2c4/0x358
[<9000000002241924>] exception_handlers+0x1924/0x10000
[<900000000023b7d0>] sys_clone+0x0/0x4
[<90000000012e4744>] do_syscall+0x7c/0x94
[<9000000000221e44>] handle_syscall+0xc4/0x160
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
FORTIFY_SOURCE could detect various overflows at compile and run time.
ARCH_HAS_FORTIFY_SOURCE means that the architecture can be built and run
with CONFIG_FORTIFY_SOURCE. So select it in LoongArch.
See more about this feature from commit 6974f0c455 ("include/linux/
string.h: add the option of fortified string.h functions").
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
With a blatant copy of some MIPS bits we introduce the crc32 and crc32c
hw accelerated module to LoongArch.
LoongArch has provided these instructions to calculate crc32 and crc32c:
* crc.w.b.w crcc.w.b.w
* crc.w.h.w crcc.w.h.w
* crc.w.w.w crcc.w.w.w
* crc.w.d.w crcc.w.d.w
So we can make use of these instructions to improve the performance of
calculation for crc32(c) checksums.
As can be seen from the following test results, crc32(c) instructions
can improve the performance by 58%.
Software implemention Hardware acceleration
Buffer size time cost (seconds) time cost (seconds) Accel.
100 KB 0.000845 0.000534 59.1%
1 MB 0.007758 0.004836 59.4%
10 MB 0.076593 0.047682 59.4%
100 MB 0.756734 0.479126 58.5%
1000 MB 7.563841 4.778266 58.5%
Signed-off-by: Min Zhou <zhoumin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
LoongArch platform is 64-bit system, which supports 8-bytes memory
accessing, but generic checksum functions use 4-byte memory access.
So add 8-bytes memory access optimization for checksum functions on
LoongArch. And the code comes from arm64 system.
When network hw checksum is disabled, iperf performance improves about
10% with this patch.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Provide kernel_fpu_begin()/kernel_fpu_end() to allow the kernel itself
to use fpu. They can be used by some other kernel components, e.g., the
AMDGPU graphic driver for DCN.
Reported-by: WANG Xuerui <kernel@xen0n.name>
Tested-by: WANG Xuerui <kernel@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
SEGV_BNDERR was introduced initially for supporting the Intel MPX, but
fell into disuse after the MPX support was removed. The LoongArch
bounds-checking instructions behave very differently than MPX, but
overall the interface is still kind of suitable for conveying the
information to userland when bounds-checking assertions trigger, so we
wouldn't have to invent more UAPI. Specifically, when the BCE triggers,
a SEGV_BNDERR is sent to userland, with si_addr set to the out-of-bounds
address or value (in asrt{gt,le}'s case), and one of si_lower or
si_upper set to the configured bound depending on the faulting
instruction. The other bound is set to either 0 or ULONG_MAX to resemble
a range with both lower and upper bounds.
Note that it is possible to have si_addr == si_lower in case of a
failing asrtgt or {ld,st}gt, because those instructions test for strict
greater-than relationship. This should not pose a problem for userland,
though, because the faulting PC is available for the application to
associate back to the exact instruction for figuring out the
expectation.
Example exception context generated by a faulting `asrtgt.d t0, t1`
(assert t0 > t1 or BCE) with t0=100 and t1=200:
> pc 00005555558206a4 ra 00007ffff2d854fc tp 00007ffff2f2f180 sp 00007ffffbf9fb80
> a0 0000000000000002 a1 00007ffffbf9fce8 a2 00007ffffbf9fd00 a3 00007ffff2ed4558
> a4 0000000000000000 a5 00007ffff2f044c8 a6 00007ffffbf9fce0 a7 fffffffffffff000
> t0 0000000000000064 t1 00000000000000c8 t2 00007ffffbfa2d5e t3 00007ffff2f12aa0
> t4 00007ffff2ed6158 t5 00007ffff2ed6158 t6 000000000000002e t7 0000000003d8f538
> t8 0000000000000005 u0 0000000000000000 s9 0000000000000000 s0 00007ffffbf9fce8
> s1 0000000000000002 s2 0000000000000000 s3 00007ffff2f2c038 s4 0000555555820610
> s5 00007ffff2ed5000 s6 0000555555827e38 s7 00007ffffbf9fd00 s8 0000555555827e38
> ra: 00007ffff2d854fc
> ERA: 00005555558206a4
> CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> PRMD: 00000007 (PPLV3 +PIE -PWE)
> EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
> ECFG: 0007181c (LIE=2-4,11-12 VS=7)
> ESTAT: 000a0000 [BCE] (IS= ECode=10 EsubCode=0)
> PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Use ISA manual names for BADV and CPUCFG.PRID lines in show_regs(), for
stylistic consistency with the other lines already touched.
While at it, also include current CPU's full name in show_regs() output.
It may be more helpful for developers looking at the resulting dumps,
because multiple distinct CPU models may share the same PRID. Not having
this info available may hide problems only found on some but not all of
the models sharing one specific PRID.
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Example output looks like:
[ xx.xxxxxx] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
Some initial machinery for this pretty-printing format has been included
in this patch as well.
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Use uppercase CSR names throughout for consistency with the manual
wording, and right-align the keys. The "CSR" part is inferrable from
context, hence dropped for more horizontal space.
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Otherwise the addresses wouldn't make sense at all.
While at it, align the "map keys" to maintain right-alignment with the
"estat:" line too; also swap the ERA and ra lines so all CSRs are shown
together.
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Show PC (CSR.ERA) in place of $zero, and also show the syscall restart
flag (conveniently stuffed in regs[0]) if non-zero.
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Define them according to the ISA manual, in order to enable matching the
sub-exceptions for humanization purposes later.
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
While interrupts are assigned ECodes `64 + interrupt number`, all
existing use sites of interrupt numbers want the 64 subtracted.
Re-arrange the definitions so that the actual interrupt number is used
everywhere, and make EXCCODE_INT_END inclusive as it is more intuitive
that way.
While at it, according to the asm/loongarch.h definitions, the total
number of architectural interrupts should be 14, but various other
places indicate otherwise (13 or 15). Those places have been adjusted
to 14 as well for consistency.
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement target specific support for local_try_cmpxchg()
and local_cmpxchg() using typed C wrappers that call their
_local counterpart and provide additional checking of
their input arguments.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230405141710.3551-4-ubizjak@gmail.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Decrease the probability of this internal facility to be used by
driver code.
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Palmer Dabbelt <palmer@rivosinc.com> [riscv]
Link: https://lore.kernel.org/r/20230118154450.73842-1-andrzej.hajda@intel.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
- Remove diagnostics and adjust config for CSD lock diagnostics
- Add a generic IPI-sending tracepoint, as currently there's no easy
way to instrument IPI origins: it's arch dependent and for some
major architectures it's not even consistently available.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmRK438RHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1jJ5Q/5AZ0HGpyqwdFK8GmGznyu5qjP5HwV9pPq
gZQScqSy4tZEeza4TFMi83CoXSg9uJ7GlYJqqQMKm78LGEPomnZtXXC7oWvTA9M5
M/jAvzytmvZloSCXV6kK7jzSejMHhag97J/BjTYhZYQpJ9T+hNC87XO6J6COsKr9
lPIYqkFrIkQNr6B0U11AQfFejRYP1ics2fnbnZL86G/zZAc6x8EveM3KgSer2iHl
KbrO+xcYyGY8Ef9P2F72HhEGFfM3WslpT1yzqR3sm4Y+fuMG0oW3qOQuMJx0ZhxT
AloterY0uo6gJwI0P9k/K4klWgz81Tf/zLb0eBAtY2uJV9Fo3YhPHuZC7jGPGAy3
JusW2yNYqc8erHVEMAKDUsl/1KN4TE2uKlkZy98wno+KOoMufK5MA2e2kPPqXvUi
Jk9RvFolnWUsexaPmCftti0OCv3YFiviVAJ/t0pchfmvvJA2da0VC9hzmEXpLJVF
25nBTV/1uAOrWvOpCyo3ElrC2CkQVkFmK5rXMDdvf6ib0Nid4vFcCkCSLVfu+ePB
11mi7QYro+CcnOug1K+yKogUDmsZgV/u1kUwgQzTIpZ05Kkb49gUiXw9L2RGcBJh
yoDoiI66KPR7PWQ2qBdQoXug4zfEEtWG0O9HNLB0FFRC3hu7I+HHyiUkBWs9jasK
PA5+V7HcQRk=
=Wp7f
-----END PGP SIGNATURE-----
Merge tag 'smp-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP cross-CPU function-call updates from Ingo Molnar:
- Remove diagnostics and adjust config for CSD lock diagnostics
- Add a generic IPI-sending tracepoint, as currently there's no easy
way to instrument IPI origins: it's arch dependent and for some major
architectures it's not even consistently available.
* tag 'smp-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
trace,smp: Trace all smp_function_call*() invocations
trace: Add trace_ipi_send_cpu()
sched, smp: Trace smp callback causing an IPI
smp: reword smp call IPI comment
treewide: Trace IPIs sent via smp_send_reschedule()
irq_work: Trace self-IPIs sent via arch_irq_work_raise()
smp: Trace IPIs sent via arch_send_call_function_ipi_mask()
sched, smp: Trace IPIs sent via send_call_function_single_ipi()
trace: Add trace_ipi_send_cpumask()
kernel/smp: Make csdlock_debug= resettable
locking/csd_lock: Remove per-CPU data indirection from CSD lock debugging
locking/csd_lock: Remove added data from CSD lock debugging
locking/csd_lock: Add Kconfig option for csd_debug default
- Mark arch_cpu_idle_dead() __noreturn, make all architectures & drivers that did
this inconsistently follow this new, common convention, and fix all the fallout
that objtool can now detect statically.
- Fix/improve the ORC unwinder becoming unreliable due to UNWIND_HINT_EMPTY ambiguity,
split it into UNWIND_HINT_END_OF_STACK and UNWIND_HINT_UNDEFINED to resolve it.
- Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code.
- Generate ORC data for __pfx code
- Add more __noreturn annotations to various kernel startup/shutdown/panic functions.
- Misc improvements & fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmRK1x0RHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1ghxQ/+IkCynMYtdF5OG9YwbcGJqsPSfOPMEcEM
pUSFYg+gGPBDT/fJfcVSqvUtdnWbLC2kXt9yiswXz3X3J2nmNkBk5YKQftsNDcul
TmKeqIIAK51XTncpegKH0EGnOX63oZ9Vxa8CTPdDlb+YF23Km2FoudGRI9F5qbUd
LoraXqGYeiaeySkGyWmZVl6Uc8dIxnMkTN3H/oI9aB6TOrsi059hAtFcSaFfyemP
c4LqXXCH7k2baiQt+qaLZ8cuZVG/+K5r2N2cmjO5kmJc6ynIaFnfMe4XxZLjp5LT
/PulYI15bXkvSARKx5CRh/CDHMOx5Blw+ASO0RhWbdy0WH4ZhhcaVF5AeIpPW86a
1LBcz97rMp72WmvKgrJeVO1r9+ll4SI6/YKGJRsxsCMdP3hgFpqntXyVjTFNdTM1
0gH6H5v55x06vJHvhtTk8SR3PfMTEM2fRU5jXEOrGowoGifx+wNUwORiwj6LE3KQ
SKUdT19RNzoW3VkFxhgk65ThK1S7YsJUKRoac3YdhttpqqqtFV//erenrZoR4k/p
vzvKy68EQ7RCNyD5wNWNFe0YjeJl5G8gQ8bUm4Xmab7djjgz+pn4WpQB8yYKJLAo
x9dqQ+6eUbw3Hcgk6qQ9E+r/svbulnAL0AeALAWK/91DwnZ2mCzKroFkLN7napKi
fRho4CqzrtM=
=NwEV
-----END PGP SIGNATURE-----
Merge tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
- Mark arch_cpu_idle_dead() __noreturn, make all architectures &
drivers that did this inconsistently follow this new, common
convention, and fix all the fallout that objtool can now detect
statically
- Fix/improve the ORC unwinder becoming unreliable due to
UNWIND_HINT_EMPTY ambiguity, split it into UNWIND_HINT_END_OF_STACK
and UNWIND_HINT_UNDEFINED to resolve it
- Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code
- Generate ORC data for __pfx code
- Add more __noreturn annotations to various kernel startup/shutdown
and panic functions
- Misc improvements & fixes
* tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
x86/hyperv: Mark hv_ghcb_terminate() as noreturn
scsi: message: fusion: Mark mpt_halt_firmware() __noreturn
x86/cpu: Mark {hlt,resume}_play_dead() __noreturn
btrfs: Mark btrfs_assertfail() __noreturn
objtool: Include weak functions in global_noreturns check
cpu: Mark nmi_panic_self_stop() __noreturn
cpu: Mark panic_smp_self_stop() __noreturn
arm64/cpu: Mark cpu_park_loop() and friends __noreturn
x86/head: Mark *_start_kernel() __noreturn
init: Mark start_kernel() __noreturn
init: Mark [arch_call_]rest_init() __noreturn
objtool: Generate ORC data for __pfx code
x86/linkage: Fix padding for typed functions
objtool: Separate prefix code from stack validation code
objtool: Remove superfluous dead_end_function() check
objtool: Add symbol iteration helpers
objtool: Add WARN_INSN()
scripts/objdump-func: Support multiple functions
context_tracking: Fix KCSAN noinstr violation
objtool: Add stackleak instrumentation to uaccess safe list
...
switching from a user process to a kernel thread.
- More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav.
- zsmalloc performance improvements from Sergey Senozhatsky.
- Yue Zhao has found and fixed some data race issues around the
alteration of memcg userspace tunables.
- VFS rationalizations from Christoph Hellwig:
- removal of most of the callers of write_one_page().
- make __filemap_get_folio()'s return value more useful
- Luis Chamberlain has changed tmpfs so it no longer requires swap
backing. Use `mount -o noswap'.
- Qi Zheng has made the slab shrinkers operate locklessly, providing
some scalability benefits.
- Keith Busch has improved dmapool's performance, making part of its
operations O(1) rather than O(n).
- Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
permitting userspace to wr-protect anon memory unpopulated ptes.
- Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather
than exclusive, and has fixed a bunch of errors which were caused by its
unintuitive meaning.
- Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
which causes minor faults to install a write-protected pte.
- Vlastimil Babka has done some maintenance work on vma_merge():
cleanups to the kernel code and improvements to our userspace test
harness.
- Cleanups to do_fault_around() by Lorenzo Stoakes.
- Mike Rapoport has moved a lot of initialization code out of various
mm/ files and into mm/mm_init.c.
- Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
DRM, but DRM doesn't use it any more.
- Lorenzo has also coverted read_kcore() and vread() to use iterators
and has thereby removed the use of bounce buffers in some cases.
- Lorenzo has also contributed further cleanups of vma_merge().
- Chaitanya Prakash provides some fixes to the mmap selftesting code.
- Matthew Wilcox changes xfs and afs so they no longer take sleeping
locks in ->map_page(), a step towards RCUification of pagefaults.
- Suren Baghdasaryan has improved mmap_lock scalability by switching to
per-VMA locking.
- Frederic Weisbecker has reworked the percpu cache draining so that it
no longer causes latency glitches on cpu isolated workloads.
- Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
logic.
- Liu Shixin has changed zswap's initialization so we no longer waste a
chunk of memory if zswap is not being used.
- Yosry Ahmed has improved the performance of memcg statistics flushing.
- David Stevens has fixed several issues involving khugepaged,
userfaultfd and shmem.
- Christoph Hellwig has provided some cleanup work to zram's IO-related
code paths.
- David Hildenbrand has fixed up some issues in the selftest code's
testing of our pte state changing.
- Pankaj Raghav has made page_endio() unneeded and has removed it.
- Peter Xu contributed some rationalizations of the userfaultfd
selftests.
- Yosry Ahmed has fixed an issue around memcg's page recalim accounting.
- Chaitanya Prakash has fixed some arm-related issues in the
selftests/mm code.
- Longlong Xia has improved the way in which KSM handles hwpoisoned
pages.
- Peter Xu fixes a few issues with uffd-wp at fork() time.
- Stefan Roesch has changed KSM so that it may now be used on a
per-process and per-cgroup basis.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZEr3zQAKCRDdBJ7gKXxA
jlLoAP0fpQBipwFxED0Us4SKQfupV6z4caXNJGPeay7Aj11/kQD/aMRC2uPfgr96
eMG3kwn2pqkB9ST2QpkaRbxA//eMbQY=
=J+Dj
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
switching from a user process to a kernel thread.
- More folio conversions from Kefeng Wang, Zhang Peng and Pankaj
Raghav.
- zsmalloc performance improvements from Sergey Senozhatsky.
- Yue Zhao has found and fixed some data race issues around the
alteration of memcg userspace tunables.
- VFS rationalizations from Christoph Hellwig:
- removal of most of the callers of write_one_page()
- make __filemap_get_folio()'s return value more useful
- Luis Chamberlain has changed tmpfs so it no longer requires swap
backing. Use `mount -o noswap'.
- Qi Zheng has made the slab shrinkers operate locklessly, providing
some scalability benefits.
- Keith Busch has improved dmapool's performance, making part of its
operations O(1) rather than O(n).
- Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
permitting userspace to wr-protect anon memory unpopulated ptes.
- Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive
rather than exclusive, and has fixed a bunch of errors which were
caused by its unintuitive meaning.
- Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
which causes minor faults to install a write-protected pte.
- Vlastimil Babka has done some maintenance work on vma_merge():
cleanups to the kernel code and improvements to our userspace test
harness.
- Cleanups to do_fault_around() by Lorenzo Stoakes.
- Mike Rapoport has moved a lot of initialization code out of various
mm/ files and into mm/mm_init.c.
- Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
DRM, but DRM doesn't use it any more.
- Lorenzo has also coverted read_kcore() and vread() to use iterators
and has thereby removed the use of bounce buffers in some cases.
- Lorenzo has also contributed further cleanups of vma_merge().
- Chaitanya Prakash provides some fixes to the mmap selftesting code.
- Matthew Wilcox changes xfs and afs so they no longer take sleeping
locks in ->map_page(), a step towards RCUification of pagefaults.
- Suren Baghdasaryan has improved mmap_lock scalability by switching to
per-VMA locking.
- Frederic Weisbecker has reworked the percpu cache draining so that it
no longer causes latency glitches on cpu isolated workloads.
- Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
logic.
- Liu Shixin has changed zswap's initialization so we no longer waste a
chunk of memory if zswap is not being used.
- Yosry Ahmed has improved the performance of memcg statistics
flushing.
- David Stevens has fixed several issues involving khugepaged,
userfaultfd and shmem.
- Christoph Hellwig has provided some cleanup work to zram's IO-related
code paths.
- David Hildenbrand has fixed up some issues in the selftest code's
testing of our pte state changing.
- Pankaj Raghav has made page_endio() unneeded and has removed it.
- Peter Xu contributed some rationalizations of the userfaultfd
selftests.
- Yosry Ahmed has fixed an issue around memcg's page recalim
accounting.
- Chaitanya Prakash has fixed some arm-related issues in the
selftests/mm code.
- Longlong Xia has improved the way in which KSM handles hwpoisoned
pages.
- Peter Xu fixes a few issues with uffd-wp at fork() time.
- Stefan Roesch has changed KSM so that it may now be used on a
per-process and per-cgroup basis.
* tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
shmem: restrict noswap option to initial user namespace
mm/khugepaged: fix conflicting mods to collapse_file()
sparse: remove unnecessary 0 values from rc
mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area()
hugetlb: pte_alloc_huge() to replace huge pte_alloc_map()
maple_tree: fix allocation in mas_sparse_area()
mm: do not increment pgfault stats when page fault handler retries
zsmalloc: allow only one active pool compaction context
selftests/mm: add new selftests for KSM
mm: add new KSM process and sysfs knobs
mm: add new api to enable ksm per process
mm: shrinkers: fix debugfs file permissions
mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
migrate_pages_batch: fix statistics for longterm pin retry
userfaultfd: use helper function range_in_vma()
lib/show_mem.c: use for_each_populated_zone() simplify code
mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()
fs/buffer: convert create_page_buffers to folio_create_buffers
fs/buffer: add folio_create_empty_buffers helper
...
Core
----
- Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the
default value allows for better BIG TCP performances.
- Reduce compound page head access for zero-copy data transfers.
- RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when possible.
- Threaded NAPI improvements, adding defer skb free support and unneeded
softirq avoidance.
- Address dst_entry reference count scalability issues, via false
sharing avoidance and optimize refcount tracking.
- Add lockless accesses annotation to sk_err[_soft].
- Optimize again the skb struct layout.
- Extends the skb drop reasons to make it usable by multiple
subsystems.
- Better const qualifier awareness for socket casts.
BPF
---
- Add skb and XDP typed dynptrs which allow BPF programs for more
ergonomic and less brittle iteration through data and variable-sized
accesses.
- Add a new BPF netfilter program type and minimal support to hook
BPF programs to netfilter hooks such as prerouting or forward.
- Add more precise memory usage reporting for all BPF map types.
- Adds support for using {FOU,GUE} encap with an ipip device operating
in collect_md mode and add a set of BPF kfuncs for controlling encap
params.
- Allow BPF programs to detect at load time whether a particular kfunc
exists or not, and also add support for this in light skeleton.
- Bigger batch of BPF verifier improvements to prepare for upcoming BPF
open-coded iterators allowing for less restrictive looping capabilities.
- Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF
programs to NULL-check before passing such pointers into kfunc.
- Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in
local storage maps.
- Enable RCU semantics for task BPF kptrs and allow referenced kptr
tasks to be stored in BPF maps.
- Add support for refcounted local kptrs to the verifier for allowing
shared ownership, useful for adding a node to both the BPF list and
rbtree.
- Add BPF verifier support for ST instructions in convert_ctx_access()
which will help new -mcpu=v4 clang flag to start emitting them.
- Add ARM32 USDT support to libbpf.
- Improve bpftool's visual program dump which produces the control
flow graph in a DOT format by adding C source inline annotations.
Protocols
---------
- IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value
indicates the provenance of the IP address.
- IPv6: optimize route lookup, dropping unneeded R/W lock acquisition.
- Add the handshake upcall mechanism, allowing the user-space
to implement generic TLS handshake on kernel's behalf.
- Bridge: support per-{Port, VLAN} neighbor suppression, increasing
resilience to nodes failures.
- SCTP: add support for Fair Capacity and Weighted Fair Queueing
schedulers.
- MPTCP: delay first subflow allocation up to its first usage. This
will allow for later better LSM interaction.
- xfrm: Remove inner/outer modes from input/output path. These are
not needed anymore.
- WiFi:
- reduced neighbor report (RNR) handling for AP mode
- HW timestamping support
- support for randomized auth/deauth TA for PASN privacy
- per-link debugfs for multi-link
- TC offload support for mac80211 drivers
- mac80211 mesh fast-xmit and fast-rx support
- enable Wi-Fi 7 (EHT) mesh support
Netfilter
---------
- Add nf_tables 'brouting' support, to force a packet to be routed
instead of being bridged.
- Update bridge netfilter and ovs conntrack helpers to handle
IPv6 Jumbo packets properly, i.e. fetch the packet length
from hop-by-hop extension header. This is needed for BIT TCP
support.
- The iptables 32bit compat interface isn't compiled in by default
anymore.
- Move ip(6)tables builtin icmp matches to the udptcp one.
This has the advantage that icmp/icmpv6 match doesn't load the
iptables/ip6tables modules anymore when iptables-nft is used.
- Extended netlink error report for netdevice in flowtables and
netdev/chains. Allow for incrementally add/delete devices to netdev
basechain. Allow to create netdev chain without device.
Driver API
----------
- Remove redundant Device Control Error Reporting Enable, as PCI core
has already error reporting enabled at enumeration time.
- Move Multicast DB netlink handlers to core, allowing devices other
then bridge to use them.
- Allow the page_pool to directly recycle the pages from safely
localized NAPI.
- Implement lockless TX queue stop/wake combo macros, allowing for
further code de-duplication and sanitization.
- Add YNL support for user headers and struct attrs.
- Add partial YNL specification for devlink.
- Add partial YNL specification for ethtool.
- Add tc-mqprio and tc-taprio support for preemptible traffic classes.
- Add tx push buf len param to ethtool, specifies the maximum number
of bytes of a transmitted packet a driver can push directly to the
underlying device.
- Add basic LED support for switch/phy.
- Add NAPI documentation, stop relaying on external links.
- Convert dsa_master_ioctl() to netdev notifier. This is a preparatory
work to make the hardware timestamping layer selectable by user
space.
- Add transceiver support and improve the error messages for CAN-FD
controllers.
New hardware / drivers
----------------------
- Ethernet:
- AMD/Pensando core device support
- MediaTek MT7981 SoC
- MediaTek MT7988 SoC
- Broadcom BCM53134 embedded switch
- Texas Instruments CPSW9G ethernet switch
- Qualcomm EMAC3 DWMAC ethernet
- StarFive JH7110 SoC
- NXP CBTX ethernet PHY
- WiFi:
- Apple M1 Pro/Max devices
- RealTek rtl8710bu/rtl8188gu
- RealTek rtl8822bs, rtl8822cs and rtl8821cs SDIO chipset
- Bluetooth:
- Realtek RTL8821CS, RTL8851B, RTL8852BS
- Mediatek MT7663, MT7922
- NXP w8997
- Actions Semi ATS2851
- QTI WCN6855
- Marvell 88W8997
- Can:
- STMicroelectronics bxcan stm32f429
Drivers
-------
- Ethernet NICs:
- Intel (1G, icg):
- add tracking and reporting of QBV config errors.
- add support for configuring max SDU for each Tx queue.
- Intel (100G, ice):
- refactor mailbox overflow detection to support Scalable IOV
- GNSS interface optimization
- Intel (i40e):
- support XDP multi-buffer
- nVidia/Mellanox:
- add the support for linux bridge multicast offload
- enable TC offload for egress and engress MACVLAN over bond
- add support for VxLAN GBP encap/decap flows offload
- extend packet offload to fully support libreswan
- support tunnel mode in mlx5 IPsec packet offload
- extend XDP multi-buffer support
- support MACsec VLAN offload
- add support for dynamic msix vectors allocation
- drop RX page_cache and fully use page_pool
- implement thermal zone to report NIC temperature
- Netronome/Corigine:
- add support for multi-zone conntrack offload
- Solarflare/Xilinx:
- support offloading TC VLAN push/pop actions to the MAE
- support TC decap rules
- support unicast PTP
- Other NICs:
- Broadcom (bnxt): enforce software based freq adjustments only
on shared PHC NIC
- RealTek (r8169): refactor to addess ASPM issues during NAPI poll.
- Micrel (lan8841): add support for PTP_PF_PEROUT
- Cadence (macb): enable PTP unicast
- Engleder (tsnep): add XDP socket zero-copy support
- virtio-net: implement exact header length guest feature
- veth: add page_pool support for page recycling
- vxlan: add MDB data path support
- gve: add XDP support for GQI-QPL format
- geneve: accept every ethertype
- macvlan: allow some packets to bypass broadcast queue
- mana: add support for jumbo frame
- Ethernet high-speed switches:
- Microchip (sparx5): Add support for TC flower templates.
- Ethernet embedded switches:
- Broadcom (b54):
- configure 6318 and 63268 RGMII ports
- Marvell (mv88e6xxx):
- faster C45 bus scan
- Microchip:
- lan966x:
- add support for IS1 VCAP
- better TX/RX from/to CPU performances
- ksz9477: add ETS Qdisc support
- ksz8: enhance static MAC table operations and error handling
- sama7g5: add PTP capability
- NXP (ocelot):
- add support for external ports
- add support for preemptible traffic classes
- Texas Instruments:
- add CPSWxG SGMII support for J7200 and J721E
- Intel WiFi (iwlwifi):
- preparation for Wi-Fi 7 EHT and multi-link support
- EHT (Wi-Fi 7) sniffer support
- hardware timestamping support for some devices/firwmares
- TX beacon protection on newer hardware
- Qualcomm 802.11ax WiFi (ath11k):
- MU-MIMO parameters support
- ack signal support for management packets
- RealTek WiFi (rtw88):
- SDIO bus support
- better support for some SDIO devices
(e.g. MAC address from efuse)
- RealTek WiFi (rtw89):
- HW scan support for 8852b
- better support for 6 GHz scanning
- support for various newer firmware APIs
- framework firmware backwards compatibility
- MediaTek WiFi (mt76):
- P2P support
- mesh A-MSDU support
- EHT (Wi-Fi 7) support
- coredump support
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmRI/mUSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkgO0QAJGxpuN67YgYV0BIM+/atWKEEexJYG7B
9MMpU4jMO3EW/pUS5t7VRsBLUybLYVPmqCZoHodObDfnu59jiPOegb6SikJv/ZwJ
Zw62PVk5MvDnQjlu4e6kDcGwkplteN08TlgI+a49BUTedpdFitrxHAYGW8f2fRO6
cK2XSld+ZucMoym5vRwf8yWS1BwdxnslPMxDJ+/8ZbWBZv44qAnG2vMB/kIx7ObC
Vel/4m6MzTwVsLYBsRvcwMVbNNlZ9GuhztlTzEbfGA4ZhTadIAMgb5VTWXB84Ws7
Aic5wTdli+q+x6/2cxhbyeoVuB9HHObYmLBAciGg4GNljP5rnQBY3X3+KVZ/x9TI
HQB7CmhxmAZVrO9pLARFV+ECrMTH2/dy3NyrZ7uYQ3WPOXJi8hJZjOTO/eeEGL7C
eTjdz0dZBWIBK2gON/6s4nExXVQUTEF2ZsPi52jTTClKjfe5pz/ddeFQIWaY1DTm
pInEiWPAvd28JyiFmhFNHsuIBCjX/Zqe2JuMfMBeBibDAC09o/OGdKJYUI15AiRf
F46Pdb7use/puqfrYW44kSAfaPYoBiE+hj1RdeQfen35xD9HVE4vdnLNeuhRlFF9
aQfyIRHYQofkumRDr5f8JEY66cl9NiKQ4IVW1xxQfYDNdC6wQqREPG1md7rJVMrJ
vP7ugFnttneg
=ITVa
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the
default value allows for better BIG TCP performances
- Reduce compound page head access for zero-copy data transfers
- RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when
possible
- Threaded NAPI improvements, adding defer skb free support and
unneeded softirq avoidance
- Address dst_entry reference count scalability issues, via false
sharing avoidance and optimize refcount tracking
- Add lockless accesses annotation to sk_err[_soft]
- Optimize again the skb struct layout
- Extends the skb drop reasons to make it usable by multiple
subsystems
- Better const qualifier awareness for socket casts
BPF:
- Add skb and XDP typed dynptrs which allow BPF programs for more
ergonomic and less brittle iteration through data and
variable-sized accesses
- Add a new BPF netfilter program type and minimal support to hook
BPF programs to netfilter hooks such as prerouting or forward
- Add more precise memory usage reporting for all BPF map types
- Adds support for using {FOU,GUE} encap with an ipip device
operating in collect_md mode and add a set of BPF kfuncs for
controlling encap params
- Allow BPF programs to detect at load time whether a particular
kfunc exists or not, and also add support for this in light
skeleton
- Bigger batch of BPF verifier improvements to prepare for upcoming
BPF open-coded iterators allowing for less restrictive looping
capabilities
- Rework RCU enforcement in the verifier, add kptr_rcu and enforce
BPF programs to NULL-check before passing such pointers into kfunc
- Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and
in local storage maps
- Enable RCU semantics for task BPF kptrs and allow referenced kptr
tasks to be stored in BPF maps
- Add support for refcounted local kptrs to the verifier for allowing
shared ownership, useful for adding a node to both the BPF list and
rbtree
- Add BPF verifier support for ST instructions in
convert_ctx_access() which will help new -mcpu=v4 clang flag to
start emitting them
- Add ARM32 USDT support to libbpf
- Improve bpftool's visual program dump which produces the control
flow graph in a DOT format by adding C source inline annotations
Protocols:
- IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value
indicates the provenance of the IP address
- IPv6: optimize route lookup, dropping unneeded R/W lock acquisition
- Add the handshake upcall mechanism, allowing the user-space to
implement generic TLS handshake on kernel's behalf
- Bridge: support per-{Port, VLAN} neighbor suppression, increasing
resilience to nodes failures
- SCTP: add support for Fair Capacity and Weighted Fair Queueing
schedulers
- MPTCP: delay first subflow allocation up to its first usage. This
will allow for later better LSM interaction
- xfrm: Remove inner/outer modes from input/output path. These are
not needed anymore
- WiFi:
- reduced neighbor report (RNR) handling for AP mode
- HW timestamping support
- support for randomized auth/deauth TA for PASN privacy
- per-link debugfs for multi-link
- TC offload support for mac80211 drivers
- mac80211 mesh fast-xmit and fast-rx support
- enable Wi-Fi 7 (EHT) mesh support
Netfilter:
- Add nf_tables 'brouting' support, to force a packet to be routed
instead of being bridged
- Update bridge netfilter and ovs conntrack helpers to handle IPv6
Jumbo packets properly, i.e. fetch the packet length from
hop-by-hop extension header. This is needed for BIT TCP support
- The iptables 32bit compat interface isn't compiled in by default
anymore
- Move ip(6)tables builtin icmp matches to the udptcp one. This has
the advantage that icmp/icmpv6 match doesn't load the
iptables/ip6tables modules anymore when iptables-nft is used
- Extended netlink error report for netdevice in flowtables and
netdev/chains. Allow for incrementally add/delete devices to netdev
basechain. Allow to create netdev chain without device
Driver API:
- Remove redundant Device Control Error Reporting Enable, as PCI core
has already error reporting enabled at enumeration time
- Move Multicast DB netlink handlers to core, allowing devices other
then bridge to use them
- Allow the page_pool to directly recycle the pages from safely
localized NAPI
- Implement lockless TX queue stop/wake combo macros, allowing for
further code de-duplication and sanitization
- Add YNL support for user headers and struct attrs
- Add partial YNL specification for devlink
- Add partial YNL specification for ethtool
- Add tc-mqprio and tc-taprio support for preemptible traffic classes
- Add tx push buf len param to ethtool, specifies the maximum number
of bytes of a transmitted packet a driver can push directly to the
underlying device
- Add basic LED support for switch/phy
- Add NAPI documentation, stop relaying on external links
- Convert dsa_master_ioctl() to netdev notifier. This is a
preparatory work to make the hardware timestamping layer selectable
by user space
- Add transceiver support and improve the error messages for CAN-FD
controllers
New hardware / drivers:
- Ethernet:
- AMD/Pensando core device support
- MediaTek MT7981 SoC
- MediaTek MT7988 SoC
- Broadcom BCM53134 embedded switch
- Texas Instruments CPSW9G ethernet switch
- Qualcomm EMAC3 DWMAC ethernet
- StarFive JH7110 SoC
- NXP CBTX ethernet PHY
- WiFi:
- Apple M1 Pro/Max devices
- RealTek rtl8710bu/rtl8188gu
- RealTek rtl8822bs, rtl8822cs and rtl8821cs SDIO chipset
- Bluetooth:
- Realtek RTL8821CS, RTL8851B, RTL8852BS
- Mediatek MT7663, MT7922
- NXP w8997
- Actions Semi ATS2851
- QTI WCN6855
- Marvell 88W8997
- Can:
- STMicroelectronics bxcan stm32f429
Drivers:
- Ethernet NICs:
- Intel (1G, icg):
- add tracking and reporting of QBV config errors
- add support for configuring max SDU for each Tx queue
- Intel (100G, ice):
- refactor mailbox overflow detection to support Scalable IOV
- GNSS interface optimization
- Intel (i40e):
- support XDP multi-buffer
- nVidia/Mellanox:
- add the support for linux bridge multicast offload
- enable TC offload for egress and engress MACVLAN over bond
- add support for VxLAN GBP encap/decap flows offload
- extend packet offload to fully support libreswan
- support tunnel mode in mlx5 IPsec packet offload
- extend XDP multi-buffer support
- support MACsec VLAN offload
- add support for dynamic msix vectors allocation
- drop RX page_cache and fully use page_pool
- implement thermal zone to report NIC temperature
- Netronome/Corigine:
- add support for multi-zone conntrack offload
- Solarflare/Xilinx:
- support offloading TC VLAN push/pop actions to the MAE
- support TC decap rules
- support unicast PTP
- Other NICs:
- Broadcom (bnxt): enforce software based freq adjustments only on
shared PHC NIC
- RealTek (r8169): refactor to addess ASPM issues during NAPI poll
- Micrel (lan8841): add support for PTP_PF_PEROUT
- Cadence (macb): enable PTP unicast
- Engleder (tsnep): add XDP socket zero-copy support
- virtio-net: implement exact header length guest feature
- veth: add page_pool support for page recycling
- vxlan: add MDB data path support
- gve: add XDP support for GQI-QPL format
- geneve: accept every ethertype
- macvlan: allow some packets to bypass broadcast queue
- mana: add support for jumbo frame
- Ethernet high-speed switches:
- Microchip (sparx5): Add support for TC flower templates
- Ethernet embedded switches:
- Broadcom (b54):
- configure 6318 and 63268 RGMII ports
- Marvell (mv88e6xxx):
- faster C45 bus scan
- Microchip:
- lan966x:
- add support for IS1 VCAP
- better TX/RX from/to CPU performances
- ksz9477: add ETS Qdisc support
- ksz8: enhance static MAC table operations and error handling
- sama7g5: add PTP capability
- NXP (ocelot):
- add support for external ports
- add support for preemptible traffic classes
- Texas Instruments:
- add CPSWxG SGMII support for J7200 and J721E
- Intel WiFi (iwlwifi):
- preparation for Wi-Fi 7 EHT and multi-link support
- EHT (Wi-Fi 7) sniffer support
- hardware timestamping support for some devices/firwmares
- TX beacon protection on newer hardware
- Qualcomm 802.11ax WiFi (ath11k):
- MU-MIMO parameters support
- ack signal support for management packets
- RealTek WiFi (rtw88):
- SDIO bus support
- better support for some SDIO devices (e.g. MAC address from
efuse)
- RealTek WiFi (rtw89):
- HW scan support for 8852b
- better support for 6 GHz scanning
- support for various newer firmware APIs
- framework firmware backwards compatibility
- MediaTek WiFi (mt76):
- P2P support
- mesh A-MSDU support
- EHT (Wi-Fi 7) support
- coredump support"
* tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2078 commits)
net: phy: hide the PHYLIB_LEDS knob
net: phy: marvell-88x2222: remove unnecessary (void*) conversions
tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp.
net: amd: Fix link leak when verifying config failed
net: phy: marvell: Fix inconsistent indenting in led_blink_set
lan966x: Don't use xdp_frame when action is XDP_TX
tsnep: Add XDP socket zero-copy TX support
tsnep: Add XDP socket zero-copy RX support
tsnep: Move skb receive action to separate function
tsnep: Add functions for queue enable/disable
tsnep: Rework TX/RX queue initialization
tsnep: Replace modulo operation with mask
net: phy: dp83867: Add led_brightness_set support
net: phy: Fix reading LED reg property
drivers: nfc: nfcsim: remove return value check of `dev_dir`
net: phy: dp83867: Remove unnecessary (void*) conversions
net: ethtool: coalesce: try to make user settings stick twice
net: mana: Check if netdev/napi_alloc_frag returns single page
net: mana: Rename mana_refill_rxoob and remove some empty lines
net: veth: add page_pool stats
...
These are various cleanups, fixing a number of uapi header files to no
longer reference CONFIG_* symbols, and one patch that introduces the
new CONFIG_HAS_IOPORT symbol for architectures that provide working
inb()/outb() macros, as a preparation for adding driver dependencies
on those in the following release.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmRG8IkACgkQYKtH/8kJ
Uid15Q/9E/neIIEqEk6IvtyhUicrJiIZUM0rGoYtWXiz75ggk6Kx9+3I+j8zIQ/E
kf2TzAG7q9Md7nfTDFLr4FSr0IcNDj+VG4nYxUyDHdKGcARO+g9Kpdvscxip3lgU
Rw5w74Gyd30u4iUKGS39OYuxcCgl9LaFjMA9Gh402Oiaoh+OYLmgQS9h/goUD5KN
Nd+AoFvkdbnHl0/SpxthLRyL5rFEATBmAY7apYViPyMvfjS3gfDJwXJR9jkKgi6X
Qs4t8Op8BA3h84dCuo6VcFqgAJs2Wiq3nyTSUnkF8NxJ2RFTpeiVgfsLOzXHeDgz
SKDB4Lp14o3mlyZyj00MWq1uMJRRetUgNiVb6iHOoKQ/E4demBdh+mhIFRybjM5B
XNTWFcg9PWFCMa4W9jnLfZBc881X4+7T+qUF8I0W/1AbRJUmyGj8HO6jLceC4yGD
UYLn5oFPM6OWXHp6DqJrCr9Yw8h6fuviQZFEbl/ARlgVGt+J4KbYweJYk8DzfX6t
PZIj8LskOqyIpRuC2oDA1PHxkaJ1/z+N5oRBHq1uicSh4fxY5HW7HnyzgF08+R3k
cf+fjAhC3TfGusHkBwQKQJvpxrxZjPuvYXDZ0GxTvNKJRB8eMeiTm1n41E5oTVwQ
swSblSCjZj/fMVVPXLcjxEW4SBNWRxa9Lz3tIPXb3RheU10Lfy8=
=H3k4
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"These are various cleanups, fixing a number of uapi header files to no
longer reference CONFIG_* symbols, and one patch that introduces the
new CONFIG_HAS_IOPORT symbol for architectures that provide working
inb()/outb() macros, as a preparation for adding driver dependencies
on those in the following release"
* tag 'asm-generic-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
Kconfig: introduce HAS_IOPORT option and select it as necessary
scripts: Update the CONFIG_* ignore list in headers_install.sh
pktcdvd: Remove CONFIG_CDROM_PKTCDVD_WCACHE from uapi header
Move bp_type_idx to include/linux/hw_breakpoint.h
Move ep_take_care_of_epollwakeup() to fs/eventpoll.c
Move COMPAT_ATM_ADDPARTY to net/atm/svc.c
- Improve the VDSO build time checks to cover all dynamic relocations
VDSO does not allow dynamic relcations, but the build time check is
incomplete and fragile.
It's based on architectures specifying the relocation types to search
for and does not handle R_*_NONE relocation entries correctly.
R_*_NONE relocations are injected by some GNU ld variants if they fail
to determine the exact .rel[a]/dyn_size to cover trailing zeros.
R_*_NONE relocations must be ignored by dynamic loaders, so they
should be ignored in the build time check too.
Remove the architecture specific relocation types to check for and
validate strictly that no other relocations than R_*_NONE end up
in the VSDO .so file.
- Prefer signal delivery to the current thread for
CLOCK_PROCESS_CPUTIME_ID based posix-timers
Such timers prefer to deliver the signal to the main thread of a
process even if the context in which the timer expires is the current
task. This has the downside that it might wake up an idle thread.
As there is no requirement or guarantee that the signal has to be
delivered to the main thread, avoid this by preferring the current
task if it is part of the thread group which shares sighand.
This not only avoids waking idle threads, it also distributes the
signal delivery in case of multiple timers firing in the context
of different threads close to each other better.
- Align the tick period properly (again)
For a long time the tick was starting at CLOCK_MONOTONIC zero, which
allowed users space applications to either align with the tick or to
place a periodic computation so that it does not interfere with the
tick. The alignement of the tick period was more by chance than by
intention as the tick is set up before a high resolution clocksource is
installed, i.e. timekeeping is still tick based and the tick period
advances from there.
The early enablement of sched_clock() broke this alignement as the time
accumulated by sched_clock() is taken into account when timekeeping is
initialized. So the base value now(CLOCK_MONOTONIC) is not longer a
multiple of tick periods, which breaks applications which relied on
that behaviour.
Cure this by aligning the tick starting point to the next multiple of
tick periods, i.e 1000ms/CONFIG_HZ.
- A set of NOHZ fixes and enhancements
- Cure the concurrent writer race for idle and IO sleeptime statistics
The statitic values which are exposed via /proc/stat are updated from
the CPU local idle exit and remotely by cpufreq, but that happens
without any form of serialization. As a consequence sleeptimes can be
accounted twice or worse.
Prevent this by restricting the accumulation writeback to the CPU
local idle exit and let the remote access compute the accumulated
value.
- Protect idle/iowait sleep time with a sequence count
Reading idle/iowait sleep time, e.g. from /proc/stat, can race with
idle exit updates. As a consequence the readout may result in random
and potentially going backwards values.
Protect this by a sequence count, which fixes the idle time
statistics issue, but cannot fix the iowait time problem because
iowait time accounting races with remote wake ups decrementing the
remote runqueues nr_iowait counter. The latter is impossible to fix,
so the only way to deal with that is to document it properly and to
remove the assertion in the selftest which triggers occasionally due
to that.
- Restructure struct tick_sched for better cache layout
- Some small cleanups and a better cache layout for struct tick_sched
- Implement the missing timer_wait_running() callback for POSIX CPU timers
For unknown reason the introduction of the timer_wait_running() callback
missed to fixup posix CPU timers, which went unnoticed for almost four
years.
While initially only targeted to prevent livelocks between a timer
deletion and the timer expiry function on PREEMPT_RT enabled kernels, it
turned out that fixing this for mainline is not as trivial as just
implementing a stub similar to the hrtimer/timer callbacks.
The reason is that for CONFIG_POSIX_CPU_TIMERS_TASK_WORK enabled systems
there is a livelock issue independent of RT.
CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y moves the expiry of POSIX CPU timers
out from hard interrupt context to task work, which is handled before
returning to user space or to a VM. The expiry mechanism moves the
expired timers to a stack local list head with sighand lock held. Once
sighand is dropped the task can be preempted and a task which wants to
delete a timer will spin-wait until the expiry task is scheduled back
in. In the worst case this will end up in a livelock when the preempting
task and the expiry task are pinned on the same CPU.
The timer wheel has a timer_wait_running() mechanism for RT, which uses
a per CPU timer-base expiry lock which is held by the expiry code and the
task waiting for the timer function to complete blocks on that lock.
This does not work in the same way for posix CPU timers as there is no
timer base and expiry for process wide timers can run on any task
belonging to that process, but the concept of waiting on an expiry lock
can be used too in a slightly different way.
Add a per task mutex to struct posix_cputimers_work, let the expiry task
hold it accross the expiry function and let the deleting task which
waits for the expiry to complete block on the mutex.
In the non-contended case this results in an extra mutex_lock()/unlock()
pair on both sides.
This avoids spin-waiting on a task which is scheduled out, prevents the
livelock and cures the problem for RT and !RT systems.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRGrj4THHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoZhdEAC/lwfDWCnTXHC8ExQQRDIVNyXmDlLb
EHB8ZY7Wc4gNZ8UEXEOLOXJHMG9bsbtPGctVewJwRGnXZWKVhpPwQba6kCRycyX0
0J6l5DlvUaGGrpoOzOZwgETRmtIZE9tEArZR8xlfRScYd93a7yLhwIjO8JaV9vKs
IQpAQMeJ/ysp6gHrS59qakYfoHU/ERUAu3Tk4GqHUtPtcyz3nX3eTlLWV8LySqs+
00qr2yc0bQFUFoKzTCxtM8lcEi9ja9SOj1rw28348O+BXE4d0HC12Ie7eU/CDN2Y
OAlWYxVjy4LMh24LDrRQKTzoVqx9MXDx2g+09B3t8NK5LgeS+EJIjujDhZF147/H
5y906nplZUKa8BiZW5Rpm/HKH8tFI80T9XWSQCRBeMgTEJyRyRU1yASAwO4xw+dY
Dn3tGmFGymcV/72o4ic9JFKQd8cTSxPjEJS3qqzMkEAtyI/zPBmKxj/Tce50OH40
6FSZq1uU21ZQzszwSHISwgFtNr75laUSK4Z1te5OhPOOz+C7O9YqHvqS/1jwhPj2
tMd8X17fRW3UTUBlBj+zqxqiEGBl/Yk2AvKrJIXGUtfWYCtjMJ7ieCf0kZ7NSVJx
9ewubA0gqseMD783YomZsy8LLtMKnhclJeslUOVb1oKs1q/WF1R/k6qjy9vUwYaB
nIJuHl8mxSetag==
=SVnj
-----END PGP SIGNATURE-----
Merge tag 'timers-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timers and timekeeping updates from Thomas Gleixner:
- Improve the VDSO build time checks to cover all dynamic relocations
VDSO does not allow dynamic relocations, but the build time check is
incomplete and fragile.
It's based on architectures specifying the relocation types to search
for and does not handle R_*_NONE relocation entries correctly.
R_*_NONE relocations are injected by some GNU ld variants if they
fail to determine the exact .rel[a]/dyn_size to cover trailing zeros.
R_*_NONE relocations must be ignored by dynamic loaders, so they
should be ignored in the build time check too.
Remove the architecture specific relocation types to check for and
validate strictly that no other relocations than R_*_NONE end up in
the VSDO .so file.
- Prefer signal delivery to the current thread for
CLOCK_PROCESS_CPUTIME_ID based posix-timers
Such timers prefer to deliver the signal to the main thread of a
process even if the context in which the timer expires is the current
task. This has the downside that it might wake up an idle thread.
As there is no requirement or guarantee that the signal has to be
delivered to the main thread, avoid this by preferring the current
task if it is part of the thread group which shares sighand.
This not only avoids waking idle threads, it also distributes the
signal delivery in case of multiple timers firing in the context of
different threads close to each other better.
- Align the tick period properly (again)
For a long time the tick was starting at CLOCK_MONOTONIC zero, which
allowed users space applications to either align with the tick or to
place a periodic computation so that it does not interfere with the
tick. The alignement of the tick period was more by chance than by
intention as the tick is set up before a high resolution clocksource
is installed, i.e. timekeeping is still tick based and the tick
period advances from there.
The early enablement of sched_clock() broke this alignement as the
time accumulated by sched_clock() is taken into account when
timekeeping is initialized. So the base value now(CLOCK_MONOTONIC) is
not longer a multiple of tick periods, which breaks applications
which relied on that behaviour.
Cure this by aligning the tick starting point to the next multiple of
tick periods, i.e 1000ms/CONFIG_HZ.
- A set of NOHZ fixes and enhancements:
* Cure the concurrent writer race for idle and IO sleeptime
statistics
The statitic values which are exposed via /proc/stat are updated
from the CPU local idle exit and remotely by cpufreq, but that
happens without any form of serialization. As a consequence
sleeptimes can be accounted twice or worse.
Prevent this by restricting the accumulation writeback to the CPU
local idle exit and let the remote access compute the accumulated
value.
* Protect idle/iowait sleep time with a sequence count
Reading idle/iowait sleep time, e.g. from /proc/stat, can race
with idle exit updates. As a consequence the readout may result
in random and potentially going backwards values.
Protect this by a sequence count, which fixes the idle time
statistics issue, but cannot fix the iowait time problem because
iowait time accounting races with remote wake ups decrementing
the remote runqueues nr_iowait counter. The latter is impossible
to fix, so the only way to deal with that is to document it
properly and to remove the assertion in the selftest which
triggers occasionally due to that.
* Restructure struct tick_sched for better cache layout
* Some small cleanups and a better cache layout for struct
tick_sched
- Implement the missing timer_wait_running() callback for POSIX CPU
timers
For unknown reason the introduction of the timer_wait_running()
callback missed to fixup posix CPU timers, which went unnoticed for
almost four years.
While initially only targeted to prevent livelocks between a timer
deletion and the timer expiry function on PREEMPT_RT enabled kernels,
it turned out that fixing this for mainline is not as trivial as just
implementing a stub similar to the hrtimer/timer callbacks.
The reason is that for CONFIG_POSIX_CPU_TIMERS_TASK_WORK enabled
systems there is a livelock issue independent of RT.
CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y moves the expiry of POSIX CPU
timers out from hard interrupt context to task work, which is handled
before returning to user space or to a VM. The expiry mechanism moves
the expired timers to a stack local list head with sighand lock held.
Once sighand is dropped the task can be preempted and a task which
wants to delete a timer will spin-wait until the expiry task is
scheduled back in. In the worst case this will end up in a livelock
when the preempting task and the expiry task are pinned on the same
CPU.
The timer wheel has a timer_wait_running() mechanism for RT, which
uses a per CPU timer-base expiry lock which is held by the expiry
code and the task waiting for the timer function to complete blocks
on that lock.
This does not work in the same way for posix CPU timers as there is
no timer base and expiry for process wide timers can run on any task
belonging to that process, but the concept of waiting on an expiry
lock can be used too in a slightly different way.
Add a per task mutex to struct posix_cputimers_work, let the expiry
task hold it accross the expiry function and let the deleting task
which waits for the expiry to complete block on the mutex.
In the non-contended case this results in an extra
mutex_lock()/unlock() pair on both sides.
This avoids spin-waiting on a task which is scheduled out, prevents
the livelock and cures the problem for RT and !RT systems
* tag 'timers-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
posix-cpu-timers: Implement the missing timer_wait_running callback
selftests/proc: Assert clock_gettime(CLOCK_BOOTTIME) VS /proc/uptime monotonicity
selftests/proc: Remove idle time monotonicity assertions
MAINTAINERS: Remove stale email address
timers/nohz: Remove middle-function __tick_nohz_idle_stop_tick()
timers/nohz: Add a comment about broken iowait counter update race
timers/nohz: Protect idle/iowait sleep time under seqcount
timers/nohz: Only ever update sleeptime from idle exit
timers/nohz: Restructure and reshuffle struct tick_sched
tick/common: Align tick period with the HZ tick.
selftests/timers/posix_timers: Test delivery of signals across threads
posix-timers: Prefer delivery of signals to the current thread
vdso: Improve cmd_vdso_check to check all dynamic relocations
Replace the architecture's fbdev helpers with the generic
ones from <asm-generic/fb.h>. No functional changes.
v2:
* use default implementation for fb_pgprotect() (Arnd)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Helge Deller <deller@gmx.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230417125651.25126-7-tzimmermann@suse.de
After commit c78c43fe7d ("LoongArch: Use acpi_arch_dma_setup() and
remove ARCH_HAS_PHYS_TO_DMA"), plat_swiotlb_setup() has been deleted,
so clean up the related code.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
We can see the following messages with CONFIG_PROVE_LOCKING=y on
LoongArch:
BUG: MAX_STACK_TRACE_ENTRIES too low!
turning off the locking correctness validator.
This is because stack_trace_save() returns a big value after call
arch_stack_walk(), here is the call trace:
save_trace()
stack_trace_save()
arch_stack_walk()
stack_trace_consume_entry()
arch_stack_walk() should return immediately if unwind_next_frame()
failed, no need to do the useless loops to increase the value of c->len
in stack_trace_consume_entry(), then we can fix the above problem.
Cc: stable@vger.kernel.org
Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/all/8a44ad71-68d2-4926-892f-72bfc7a67e2a@roeck-us.net/
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Ensure that user_watch_state can be set correctly by the user.
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This is done in order to easily calculate the number of breakpoints in
hw_break_get()/hw_break_set().
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Now we use ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP config option to
indicate devdax and hugetlb vmemmap optimization support. Hence rename
that to a generic ARCH_WANT_OPTIMIZE_VMEMMAP
Link: https://lkml.kernel.org/r/20230412050025.84346-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Tarun Sahu <tsahu@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
These got*, plt* and .text.ftrace_trampoline sections specified for
LoongArch have non-zero addressses. Non-zero section addresses in a
relocatable ELF would confuse GDB when it tries to compute the section
offsets and it ends up printing wrong symbol addresses. Therefore, set
them to zero, which mirrors the change in commit 5d8591bc0f
("arm64 module: set plt* section addresses to 0x0").
Cc: stable@vger.kernel.org
Reviewed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Chong Qiao <qiaochong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
vm_map_base, empty_zero_page and invalid_pmd_table could be accessed
widely by some out-of-tree non-GPL but important file systems or drivers
(e.g. OpenZFS). Let's use EXPORT_SYMBOL() instead of EXPORT_SYMBOL_GPL()
to export them, so as to avoid build errors.
1, Details about vm_map_base:
This is a LoongArch-specific symbol and may be referenced through macros
PCI_IOBASE, VMALLOC_START and VMALLOC_END.
2, Details about empty_zero_page:
As it stands today, only 3 architectures export empty_zero_page as a GPL
symbol: IA64, LoongArch and MIPS. LoongArch gets the GPL export by
inheriting from MIPS, and the MIPS export was first introduced in commit
497d2adcbf ("[MIPS] Export empty_zero_page for sake of the ext4
module."). The IA64 export was similar: commit a7d57ecf42 ("[IA64]
Export three symbols for module use") did so for kvm.
In both IA64 and MIPS, the export of empty_zero_page was done for
satisfying some in-kernel component built as module (kvm and ext4
respectively), and given its reasonably low-level nature, GPL is a
reasonable choice. But looking at the bigger picture it is evident most
other architectures do not regard it as GPL, so in effect the symbol
probably should not be treated as such, in favor of consistency.
3, Details about invalid_pmd_table:
Keep consistency with invalid_pte_table and make it be possible by some
modules.
Cc: stable@vger.kernel.org
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Some firmwares don't enable PG when wakeup from suspend, so do it in
kernel. This can improve code compatibility for boot kernel.
Signed-off-by: Baoqi Zhang <zhangbaoqi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Addresses should all be of unsigned type to avoid unnecessary conversions.
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
We can see the following build error on LoongArch if CONFIG_SUSPEND is
not set:
ld: drivers/acpi/sleep.o: in function 'acpi_pm_prepare':
sleep.c:(.text+0x2b8): undefined reference to 'loongarch_wakeup_start'
Here is the call trace:
acpi_pm_prepare()
__acpi_pm_prepare()
acpi_sleep_prepare()
acpi_get_wakeup_address()
loongarch_wakeup_start()
Root cause: loongarch_wakeup_start() is defined in arch/loongarch/power/
suspend_asm.S which is only built under CONFIG_SUSPEND. In order to fix
the build error, just let acpi_get_wakeup_address() return 0 if CONFIG_
SUSPEND is not set.
Fixes: 366bb35a8e ("LoongArch: Add suspend (ACPI S3) support")
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/all/11215033-fa3c-ecb1-2fc0-e9aeba47be9b@infradead.org/
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Not all LoongArch processors support CRC32 instructions. This feature
is indicated by CPUCFG1.CRC32 (Bit25) but it is wrongly defined in the
previous versions of the ISA manual (and so does in loongarch.h). The
CRC32 feature is set unconditionally now, so fix it.
BTW, expose the CRC32 feature in /proc/cpuinfo.
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
LoongArch maintains cache coherency in hardware, but when paired with
LS7A chipsets the WUC attribute (Weak-ordered UnCached, which is similar
to WriteCombine) is out of the scope of cache coherency machanism for
PCIe devices (this is a PCIe protocol violation, which may be fixed in
newer chipsets).
This means WUC can only used for write-only memory regions now, so this
option is disabled by default, making WUC silently fallback to SUC for
ioremap(). You can enable this option if the kernel is ensured to run on
hardware without this bug.
Kernel parameter writecombine=on/off can be used to override the Kconfig
option.
Cc: stable@vger.kernel.org
Suggested-by: WANG Xuerui <kernel@xen0n.name>
Reviewed-by: WANG Xuerui <kernel@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
LoongArch defines insane ranges for ARCH_FORCE_MAX_ORDER allowing
MAX_ORDER up to 63, which implies maximal contiguous allocation size of
2^63 pages.
Drop bogus definitions of ranges for ARCH_FORCE_MAX_ORDER and leave it a
simple integer with sensible defaults.
Users that *really* need to change the value of ARCH_FORCE_MAX_ORDER will
be able to do so but they won't be mislead by the bogus ranges.
Link: https://lkml.kernel.org/r/20230322081727.2516291-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We introduce a new HAS_IOPORT Kconfig option to indicate support for I/O
Port access. In a future patch HAS_IOPORT=n will disable compilation of
the I/O accessor functions inb()/outb() and friends on architectures
which can not meaningfully support legacy I/O spaces such as s390.
The following architectures do not select HAS_IOPORT:
* ARC
* C-SKY
* Hexagon
* Nios II
* OpenRISC
* s390
* User-Mode Linux
* Xtensa
All other architectures select HAS_IOPORT at least conditionally.
The "depends on" relations on HAS_IOPORT in drivers as well as ifdefs
for HAS_IOPORT specific sections will be added in subsequent patches on
a per subsystem basis.
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net> # for ARCH=um
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Just skip the opcode(BPF_ST | BPF_NOSPEC) in the BPF JIT instead of
failing to JIT the entire program, given LoongArch currently has no
couterpart of a speculation barrier instruction. To verify the issue,
use the ltp testcase as shown below.
Also, Wang says:
I can confirm there's currently no speculation barrier equivalent
on LonogArch. (Loongson says there are builtin mitigations for
Spectre-V1 and V2 on their chips, and AFAIK efforts to port the
exploits to mips/LoongArch have all failed a few years ago.)
Without this patch:
$ ./bpf_prog02
[...]
bpf_common.c:123: TBROK: Failed verification: ??? (524)
[...]
Summary:
passed 0
failed 0
broken 1
skipped 0
warnings 0
With this patch:
$ ./bpf_prog02
[...]
Summary:
passed 0
failed 0
broken 0
skipped 0
warnings 0
Fixes: 5dc615520c ("LoongArch: Add BPF JIT support")
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: WANG Xuerui <git@xen0n.name>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/bpf/20230328071335.2664966-1-guodongtai@kylinos.cn
To be able to trace invocations of smp_send_reschedule(), rename the
arch-specific definitions of it to arch_smp_send_reschedule() and wrap it
into an smp_send_reschedule() that contains a tracepoint.
Changes to include the declaration of the tracepoint were driven by the
following coccinelle script:
@func_use@
@@
smp_send_reschedule(...);
@include@
@@
#include <trace/events/ipi.h>
@no_include depends on func_use && !include@
@@
#include <...>
+
+ #include <trace/events/ipi.h>
[csky bits]
[riscv bits]
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Guo Ren <guoren@kernel.org>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230307143558.294354-6-vschneid@redhat.com
The actual intention is that no dynamic relocation exists in the VDSO. For
this the VDSO build validates that the resulting .so file does not have any
relocations which are specified via $(ARCH_REL_TYPE_ABS) per architecture,
which is fragile as e.g. ARM64 lacks an entry for R_AARCH64_RELATIVE. Aside
of that ARCH_REL_TYPE_ABS is a misnomer as it checks for relative
relocations too.
However, some GNU ld ports produce unneeded R_*_NONE relocation entries. If
a port fails to determine the exact .rel[a].dyn size, the trailing zeros
become R_*_NONE relocations. E.g. ld's powerpc port recently fixed
https://sourceware.org/bugzilla/show_bug.cgi?id=29540). R_*_NONE are
generally a no-op in the dynamic loaders. So just ignore them.
Remove the ARCH_REL_TYPE_ABS defines and just validate that the resulting
.so file does not contain any R_* relocation entries except R_*_NONE.
Signed-off-by: Fangrui Song <maskray@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> # for aarch64
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> # for vDSO, aarch64
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Link: https://lore.kernel.org/r/20230310190750.3323802-1-maskray@google.com
There are likely no users of this driver as the hardware has been
discontinued since 2010. Remove the driver and all references to it
in documentation.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before commit 076cbf5d2163 ("x86/xen: don't let xen_pv_play_dead()
return"), in Xen, when a previously offlined CPU was brought back
online, it unexpectedly resumed execution where it left off in the
middle of the idle loop.
There were some hacks to make that work, but the behavior was surprising
as do_idle() doesn't expect an offlined CPU to return from the dead (in
arch_cpu_idle_dead()).
Now that Xen has been fixed, and the arch-specific implementations of
arch_cpu_idle_dead() also don't return, give it a __noreturn attribute.
This will cause the compiler to complain if an arch-specific
implementation might return. It also improves code generation for both
caller and callee.
Also fixes the following warning:
vmlinux.o: warning: objtool: do_idle+0x25f: unreachable instruction
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/60d527353da8c99d4cf13b6473131d46719ed16d.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZAZsBwAKCRDbK58LschI
g3W1AQCQnO6pqqX5Q2aYDAZPlZRtV2TRLjuqrQE0dHW/XLAbBgD/bgsAmiKhPSCG
2mTt6izpTQVlZB0e8KcDIvbYd9CE3Qc=
=EjJQ
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-03-06
We've added 85 non-merge commits during the last 13 day(s) which contain
a total of 131 files changed, 7102 insertions(+), 1792 deletions(-).
The main changes are:
1) Add skb and XDP typed dynptrs which allow BPF programs for more
ergonomic and less brittle iteration through data and variable-sized
accesses, from Joanne Koong.
2) Bigger batch of BPF verifier improvements to prepare for upcoming BPF
open-coded iterators allowing for less restrictive looping capabilities,
from Andrii Nakryiko.
3) Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF
programs to NULL-check before passing such pointers into kfunc,
from Alexei Starovoitov.
4) Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in
local storage maps, from Kumar Kartikeya Dwivedi.
5) Add BPF verifier support for ST instructions in convert_ctx_access()
which will help new -mcpu=v4 clang flag to start emitting them,
from Eduard Zingerman.
6) Make uprobe attachment Android APK aware by supporting attachment
to functions inside ELF objects contained in APKs via function names,
from Daniel Müller.
7) Add a new flag BPF_F_TIMER_ABS flag for bpf_timer_start() helper
to start the timer with absolute expiration value instead of relative
one, from Tero Kristo.
8) Add a new kfunc bpf_cgroup_from_id() to look up cgroups via id,
from Tejun Heo.
9) Extend libbpf to support users manually attaching kprobes/uprobes
in the legacy/perf/link mode, from Menglong Dong.
10) Implement workarounds in the mips BPF JIT for DADDI/R4000,
from Jiaxun Yang.
11) Enable mixing bpf2bpf and tailcalls for the loongarch BPF JIT,
from Hengqi Chen.
12) Extend BPF instruction set doc with describing the encoding of BPF
instructions in terms of how bytes are stored under big/little endian,
from Jose E. Marchesi.
13) Follow-up to enable kfunc support for riscv BPF JIT, from Pu Lehui.
14) Fix bpf_xdp_query() backwards compatibility on old kernels,
from Yonghong Song.
15) Fix BPF selftest cross compilation with CLANG_CROSS_FLAGS,
from Florent Revest.
16) Improve bpf_cpumask_ma to only allocate one bpf_mem_cache,
from Hou Tao.
17) Fix BPF verifier's check_subprogs to not unnecessarily mark
a subprogram with has_tail_call, from Ilya Leoshkevich.
18) Fix arm syscall regs spec in libbpf's bpf_tracing.h, from Puranjay Mohan.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
selftests/bpf: Add test for legacy/perf kprobe/uprobe attach mode
selftests/bpf: Split test_attach_probe into multi subtests
libbpf: Add support to set kprobe/uprobe attach mode
tools/resolve_btfids: Add /libsubcmd to .gitignore
bpf: add support for fixed-size memory pointer returns for kfuncs
bpf: generalize dynptr_get_spi to be usable for iters
bpf: mark PTR_TO_MEM as non-null register type
bpf: move kfunc_call_arg_meta higher in the file
bpf: ensure that r0 is marked scratched after any function call
bpf: fix visit_insn()'s detection of BPF_FUNC_timer_set_callback helper
bpf: clean up visit_insn()'s instruction processing
selftests/bpf: adjust log_fixup's buffer size for proper truncation
bpf: honor env->test_state_freq flag in is_state_visited()
selftests/bpf: enhance align selftest's expected log matching
bpf: improve regsafe() checks for PTR_TO_{MEM,BUF,TP_BUFFER}
bpf: improve stack slot state printing
selftests/bpf: Disassembler tests for verifier.c:convert_ctx_access()
selftests/bpf: test if pointer type is tracked for BPF_ST_MEM
bpf: allow ctx writes using BPF_ST_MEM instruction
bpf: Use separate RCU callbacks for freeing selem
...
====================
Link: https://lore.kernel.org/r/20230307004346.27578-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
play_dead() doesn't return. Make that more explicit with a BUG().
BUG() is preferable to unreachable() because BUG() is a more explicit
failure mode and avoids undefined behavior like falling off the edge of
the function into whatever code happens to be next.
Link: https://lore.kernel.org/r/21245d687ffeda34dbcf04961a2df3724f04f7c8.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
1, Make -mstrict-align configurable;
2, Add kernel relocation and KASLR support;
3, Add single kernel image implementation for kdump;
4, Add hardware breakpoints/watchpoints support;
5, Add kprobes/kretprobes/kprobes_on_ftrace support;
6, Add LoongArch support for some selftests.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmP+9H0WHGNoZW5odWFj
YWlAa2VybmVsLm9yZwAKCRAChivD8uImerz+D/98MjkLXM4qtgfAxuBKpVdEVA4U
bzO19UlpqWlwTJbwrhf0GYsRrAis37PTVJG4eNORJairJ/oTkMtEEBPhwq0D9Whc
URDEh+VrjzFztLsu2OlvzOA9gE7lpg+xAx2LKflP7ixlOELOWeercDLW3octp5/J
CJDE8wPaw9tJrMHFWuiVybs03yZmY3YFV55JdWL9hY8Ryy4DY5997mruOfzjvHpl
EfDgQM2zCn2JSQwaD+Kl3MHxHyRx07Tj2wnZAh9ptaGeptK/yplc7nqRwhe7BevS
QwClhJNPICcOi+evZ7cDUY0PTL4evpw2KRnF1N4zw+58RhZECjVrCEJNdf6L1scj
muptQngWKrE/TJvn4way3cJr44stSCtT71elPhn629S23my/CauMmFqCqKpYOPOf
pxwzzCaqDcaZKwMu96qBkZS76tIrhoNeNFntj+C9RS+8ezY3+o144S3vF1A6A9Zb
M4gwa2NiQuLqnCUwKK6dZkLQVX2NMIMViUkYNKdUStxNWx/K7fFmXcl0ycAFpGYp
8Q95LLH34jUrpSgqMSCmcylsPvNiN1QnuXFnw8Tu+zDthp5dOzio60tORLPM1ZUq
gobPeGjeTQInq4eMCf2B5HH8fOMVtJyj6H4K9G1M6HUMg64UtcBp6BvEbwPxTxNN
sIOFUjDfDnBiIXWF4w==
=SzL5
-----END PGP SIGNATURE-----
Merge tag 'loongarch-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Make -mstrict-align configurable
- Add kernel relocation and KASLR support
- Add single kernel image implementation for kdump
- Add hardware breakpoints/watchpoints support
- Add kprobes/kretprobes/kprobes_on_ftrace support
- Add LoongArch support for some selftests.
* tag 'loongarch-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (23 commits)
selftests/ftrace: Add LoongArch kprobe args string tests support
selftests/seccomp: Add LoongArch selftesting support
tools: Add LoongArch build infrastructure
samples/kprobes: Add LoongArch support
LoongArch: Mark some assembler symbols as non-kprobe-able
LoongArch: Add kprobes on ftrace support
LoongArch: Add kretprobes support
LoongArch: Add kprobes support
LoongArch: Simulate branch and PC* instructions
LoongArch: ptrace: Add hardware single step support
LoongArch: ptrace: Add function argument access API
LoongArch: ptrace: Expose hardware breakpoints to debuggers
LoongArch: Add hardware breakpoints/watchpoints support
LoongArch: kdump: Add crashkernel=YM handling
LoongArch: kdump: Add single kernel image implementation
LoongArch: Add support for kernel address space layout randomization (KASLR)
LoongArch: Add support for kernel relocation
LoongArch: Add la_abs macro implementation
LoongArch: Add JUMP_VIRT_ADDR macro implementation to avoid using la.abs
LoongArch: Use la.pcrel instead of la.abs when it's trivially possible
...
Some assembler symbols are not kprobe safe, such as handle_syscall (used
as syscall exception handler), *memset*/*memcpy*/*memmove* (may cause
recursive exceptions), they can not be instrumented, just blacklist them
for kprobing.
Here is a related problem and discussion:
Link: https://lore.kernel.org/lkml/20230114143859.7ccc45c1c5d9ce302113ab0a@kernel.org/
Tested-by: Jeff Xie <xiehuan09@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Use the generic kretprobe trampoline handler to add kretprobes support
for LoongArch.
Tested-by: Jeff Xie <xiehuan09@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Kprobes allows you to trap at almost any kernel address and execute a
callback function, this commit adds kprobes support for LoongArch.
Tested-by: Jeff Xie <xiehuan09@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT,
PTRACE_KILL and PTRACE_SINGLESTEP handling. This implies defining
arch_has_single_step() and implementing the user_enable_single_step()
and user_disable_single_step() functions.
LoongArch cannot do hardware single-stepping per se, the hardware
single-stepping it is achieved by configuring the instruction fetch
watchpoints (FWPS) and specifies that the next instruction must trigger
the watch exception by setting the mask bit. In some scenarios
CSR.FWPS.Skip is used to ignore the next hit result, avoid endless
repeated triggering of the same watchpoint without canceling it.
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add regs_get_argument() which returns N th argument of the function
call, This enables ftrace kprobe events to access kernel function
arguments via $argN syntax for later use.
E.g.:
echo 'p bio_add_page arg1=$arg1' > kprobe_events
bash: echo: write error: Invalid argument
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Implement the regset-based ptrace interface that exposes hardware
breakpoints to user-space debuggers to query and set instruction and
data breakpoints.
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Use perf framework to manage hardware instruction and data breakpoints.
LoongArch defines hardware watchpoint functions for instruction fetch
and memory load/store operations. After the software configures hardware
watchpoints, the processor hardware will monitor the access address of
the instruction fetch and load/store operation, and trigger an exception
of the watchpoint when it meets the conditions set by the watchpoint.
The hardware monitoring points for instruction fetching and load/store
operations each have a register for the overall configuration of all
monitoring points, a register for recording the status of all monitoring
points, and four registers required for configuration of each watchpoint
individually.
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>