Commit Graph

3879 Commits

Author SHA1 Message Date
Linus Torvalds
3d5ad2d4ec BPF fixes:
- Fix BPF verifier to not affect subreg_def marks in its range
   propagation, from Eduard Zingerman.
 
 - Fix a truncation bug in the BPF verifier's handling of
   coerce_reg_to_size_sx, from Dimitar Kanaliev.
 
 - Fix the BPF verifier's delta propagation between linked
   registers under 32-bit addition, from Daniel Borkmann.
 
 - Fix a NULL pointer dereference in BPF devmap due to missing
   rxq information, from Florian Kauer.
 
 - Fix a memory leak in bpf_core_apply, from Jiri Olsa.
 
 - Fix an UBSAN-reported array-index-out-of-bounds in BTF
   parsing for arrays of nested structs, from Hou Tao.
 
 - Fix build ID fetching where memory areas backing the file
   were created with memfd_secret, from Andrii Nakryiko.
 
 - Fix BPF task iterator tid filtering which was incorrectly
   using pid instead of tid, from Jordan Rome.
 
 - Several fixes for BPF sockmap and BPF sockhash redirection
   in combination with vsocks, from Michal Luczaj.
 
 - Fix riscv BPF JIT and make BPF_CMPXCHG fully ordered,
   from Andrea Parri.
 
 - Fix riscv BPF JIT under CONFIG_CFI_CLANG to prevent the
   possibility of an infinite BPF tailcall, from Pu Lehui.
 
 - Fix a build warning from resolve_btfids that bpf_lsm_key_free
   cannot be resolved, from Thomas Weißschuh.
 
 - Fix a bug in kfunc BTF caching for modules where the wrong
   BTF object was returned, from Toke Høiland-Jørgensen.
 
 - Fix a BPF selftest compilation error in cgroup-related tests
   with musl libc, from Tony Ambardar.
 
 - Several fixes to BPF link info dumps to fill missing fields,
   from Tyrone Wu.
 
 - Add BPF selftests for kfuncs from multiple modules, checking
   that the correct kfuncs are called, from Simon Sundberg.
 
 - Ensure that internal and user-facing bpf_redirect flags
   don't overlap, also from Toke Høiland-Jørgensen.
 
 - Switch to use kvzmalloc to allocate BPF verifier environment,
   from Rik van Riel.
 
 - Use raw_spinlock_t in BPF ringbuf to fix a sleep in atomic
   splat under RT, from Wander Lairson Costa.
 
 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
 -----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZxK4OhUcZGFuaWVsQGlv
 Z2VhcmJveC5uZXQACgkQ2yufC7HISIOCrwEAib2kC5EEQn5+wKVE/bnZryVX2leT
 YXdfItDCBU6zCYUA+wTU5hGGn9lcDUcZx72l/KZPDyPw7HdzNJ+6iR1zQqoM
 =f9kv
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Daniel Borkmann:

 - Fix BPF verifier to not affect subreg_def marks in its range
   propagation (Eduard Zingerman)

 - Fix a truncation bug in the BPF verifier's handling of
   coerce_reg_to_size_sx (Dimitar Kanaliev)

 - Fix the BPF verifier's delta propagation between linked registers
   under 32-bit addition (Daniel Borkmann)

 - Fix a NULL pointer dereference in BPF devmap due to missing rxq
   information (Florian Kauer)

 - Fix a memory leak in bpf_core_apply (Jiri Olsa)

 - Fix an UBSAN-reported array-index-out-of-bounds in BTF parsing for
   arrays of nested structs (Hou Tao)

 - Fix build ID fetching where memory areas backing the file were
   created with memfd_secret (Andrii Nakryiko)

 - Fix BPF task iterator tid filtering which was incorrectly using pid
   instead of tid (Jordan Rome)

 - Several fixes for BPF sockmap and BPF sockhash redirection in
   combination with vsocks (Michal Luczaj)

 - Fix riscv BPF JIT and make BPF_CMPXCHG fully ordered (Andrea Parri)

 - Fix riscv BPF JIT under CONFIG_CFI_CLANG to prevent the possibility
   of an infinite BPF tailcall (Pu Lehui)

 - Fix a build warning from resolve_btfids that bpf_lsm_key_free cannot
   be resolved (Thomas Weißschuh)

 - Fix a bug in kfunc BTF caching for modules where the wrong BTF object
   was returned (Toke Høiland-Jørgensen)

 - Fix a BPF selftest compilation error in cgroup-related tests with
   musl libc (Tony Ambardar)

 - Several fixes to BPF link info dumps to fill missing fields (Tyrone
   Wu)

 - Add BPF selftests for kfuncs from multiple modules, checking that the
   correct kfuncs are called (Simon Sundberg)

 - Ensure that internal and user-facing bpf_redirect flags don't overlap
   (Toke Høiland-Jørgensen)

 - Switch to use kvzmalloc to allocate BPF verifier environment (Rik van
   Riel)

 - Use raw_spinlock_t in BPF ringbuf to fix a sleep in atomic splat
   under RT (Wander Lairson Costa)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (38 commits)
  lib/buildid: Handle memfd_secret() files in build_id_parse()
  selftests/bpf: Add test case for delta propagation
  bpf: Fix print_reg_state's constant scalar dump
  bpf: Fix incorrect delta propagation between linked registers
  bpf: Properly test iter/task tid filtering
  bpf: Fix iter/task tid filtering
  riscv, bpf: Make BPF_CMPXCHG fully ordered
  bpf, vsock: Drop static vsock_bpf_prot initialization
  vsock: Update msg_count on read_skb()
  vsock: Update rx_bytes on read_skb()
  bpf, sockmap: SK_DROP on attempted redirects of unsupported af_vsock
  selftests/bpf: Add asserts for netfilter link info
  bpf: Fix link info netfilter flags to populate defrag flag
  selftests/bpf: Add test for sign extension in coerce_subreg_to_size_sx()
  selftests/bpf: Add test for truncation after sign extension in coerce_reg_to_size_sx()
  bpf: Fix truncation bug in coerce_reg_to_size_sx()
  selftests/bpf: Assert link info uprobe_multi count & path_size if unset
  bpf: Fix unpopulated path_size when uprobe_multi fields unset
  selftests/bpf: Fix cross-compiling urandom_read
  selftests/bpf: Add test for kfunc module order
  ...
2024-10-18 16:27:14 -07:00
Andrea Parri
e59db0623f riscv, bpf: Make BPF_CMPXCHG fully ordered
According to the prototype formal BPF memory consistency model
discussed e.g. in [1] and following the ordering properties of
the C/in-kernel macro atomic_cmpxchg(), a BPF atomic operation
with the BPF_CMPXCHG modifier is fully ordered.  However, the
current RISC-V JIT lowerings fail to meet such memory ordering
property.  This is illustrated by the following litmus test:

BPF BPF__MP+success_cmpxchg+fence
{
 0:r1=x; 0:r3=y; 0:r5=1;
 1:r2=y; 1:r4=f; 1:r7=x;
}
 P0                               | P1                                         ;
 *(u64 *)(r1 + 0) = 1             | r1 = *(u64 *)(r2 + 0)                      ;
 r2 = cmpxchg_64 (r3 + 0, r4, r5) | r3 = atomic_fetch_add((u64 *)(r4 + 0), r5) ;
                                  | r6 = *(u64 *)(r7 + 0)                      ;
exists (1:r1=1 /\ 1:r6=0)

whose "exists" clause is not satisfiable according to the BPF
memory model.  Using the current RISC-V JIT lowerings, the test
can be mapped to the following RISC-V litmus test:

RISCV RISCV__MP+success_cmpxchg+fence
{
 0:x1=x; 0:x3=y; 0:x5=1;
 1:x2=y; 1:x4=f; 1:x7=x;
}
 P0                 | P1                          ;
 sd x5, 0(x1)       | ld x1, 0(x2)                ;
 L00:               | amoadd.d.aqrl x3, x5, 0(x4) ;
 lr.d x2, 0(x3)     | ld x6, 0(x7)                ;
 bne x2, x4, L01    |                             ;
 sc.d x6, x5, 0(x3) |                             ;
 bne x6, x4, L00    |                             ;
 fence rw, rw       |                             ;
 L01:               |                             ;
exists (1:x1=1 /\ 1:x6=0)

where the two stores in P0 can be reordered.  Update the RISC-V
JIT lowerings/implementation of BPF_CMPXCHG to emit an SC with
RELEASE ("rl") annotation in order to meet the expected memory
ordering guarantees.  The resulting RISC-V JIT lowerings of
BPF_CMPXCHG match the RISC-V lowerings of the C atomic_cmpxchg().

Other lowerings were fixed via 20a759df3b ("riscv, bpf: make
some atomic operations fully ordered").

Fixes: dd642ccb45 ("riscv, bpf: Implement more atomic operations for RV64")
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lpc.events/event/18/contributions/1949/attachments/1665/3441/bpfmemmodel.2024.09.19p.pdf [1]
Link: https://lore.kernel.org/bpf/20241017143628.2673894-1-parri.andrea@gmail.com
2024-10-17 17:14:48 +02:00
Pu Lehui
30a59cc797 riscv, bpf: Fix possible infinite tailcall when CONFIG_CFI_CLANG is enabled
When CONFIG_CFI_CLANG is enabled, the number of prologue instructions
skipped by tailcall needs to include the kcfi instruction, otherwise the
TCC will be initialized every tailcall is called, which may result in
infinite tailcalls.

Fixes: e63985ecd2 ("bpf, riscv64/cfi: Support kCFI + BPF on riscv64")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20241008124544.171161-1-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-09 18:23:06 -07:00
Alexandre Ghiti
cfb10de185
riscv: Fix kernel stack size when KASAN is enabled
We use Kconfig to select the kernel stack size, doubling the default
size if KASAN is enabled.

But that actually only works if KASAN is selected from the beginning,
meaning that if KASAN config is added later (for example using
menuconfig), CONFIG_THREAD_SIZE_ORDER won't be updated, keeping the
default size, which is not enough for KASAN as reported in [1].

So fix this by moving the logic to compute the right kernel stack into a
header.

Fixes: a7555f6b62 ("riscv: stack: Add config of thread stack size")
Reported-by: syzbot+ba9eac24453387a9d502@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000eb301906222aadc2@google.com/ [1]
Cc: stable@vger.kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240917150328.59831-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-01 13:08:11 -07:00
Linus Torvalds
3efc57369a x86:
* KVM currently invalidates the entirety of the page tables, not just
   those for the memslot being touched, when a memslot is moved or deleted.
   The former does not have particularly noticeable overhead, but Intel's
   TDX will require the guest to re-accept private pages if they are
   dropped from the secure EPT, which is a non starter.  Actually,
   the only reason why this is not already being done is a bug which
   was never fully investigated and caused VM instability with assigned
   GeForce GPUs, so allow userspace to opt into the new behavior.
 
 * Advertise AVX10.1 to userspace (effectively prep work for the "real" AVX10
   functionality that is on the horizon).
 
 * Rework common MSR handling code to suppress errors on userspace accesses to
   unsupported-but-advertised MSRs.  This will allow removing (almost?) all of
   KVM's exemptions for userspace access to MSRs that shouldn't exist based on
   the vCPU model (the actual cleanup is non-trivial future work).
 
 * Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC) splits the
   64-bit value into the legacy ICR and ICR2 storage, whereas Intel (APICv)
   stores the entire 64-bit value at the ICR offset.
 
 * Fix a bug where KVM would fail to exit to userspace if one was triggered by
   a fastpath exit handler.
 
 * Add fastpath handling of HLT VM-Exit to expedite re-entering the guest when
   there's already a pending wake event at the time of the exit.
 
 * Fix a WARN caused by RSM entering a nested guest from SMM with invalid guest
   state, by forcing the vCPU out of guest mode prior to signalling SHUTDOWN
   (the SHUTDOWN hits the VM altogether, not the nested guest)
 
 * Overhaul the "unprotect and retry" logic to more precisely identify cases
   where retrying is actually helpful, and to harden all retry paths against
   putting the guest into an infinite retry loop.
 
 * Add support for yielding, e.g. to honor NEED_RESCHED, when zapping rmaps in
   the shadow MMU.
 
 * Refactor pieces of the shadow MMU related to aging SPTEs in prepartion for
   adding multi generation LRU support in KVM.
 
 * Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is enabled,
   i.e. when the CPU has already flushed the RSB.
 
 * Trace the per-CPU host save area as a VMCB pointer to improve readability
   and cleanup the retrieval of the SEV-ES host save area.
 
 * Remove unnecessary accounting of temporary nested VMCB related allocations.
 
 * Set FINAL/PAGE in the page fault error code for EPT violations if and only
   if the GVA is valid.  If the GVA is NOT valid, there is no guest-side page
   table walk and so stuffing paging related metadata is nonsensical.
 
 * Fix a bug where KVM would incorrectly synthesize a nested VM-Exit instead of
   emulating posted interrupt delivery to L2.
 
 * Add a lockdep assertion to detect unsafe accesses of vmcs12 structures.
 
 * Harden eVMCS loading against an impossible NULL pointer deref (really truly
   should be impossible).
 
 * Minor SGX fix and a cleanup.
 
 * Misc cleanups
 
 Generic:
 
 * Register KVM's cpuhp and syscore callbacks when enabling virtualization in
   hardware, as the sole purpose of said callbacks is to disable and re-enable
   virtualization as needed.
 
 * Enable virtualization when KVM is loaded, not right before the first VM
   is created.  Together with the previous change, this simplifies a
   lot the logic of the callbacks, because their very existence implies
   virtualization is enabled.
 
 * Fix a bug that results in KVM prematurely exiting to userspace for coalesced
   MMIO/PIO in many cases, clean up the related code, and add a testcase.
 
 * Fix a bug in kvm_clear_guest() where it would trigger a buffer overflow _if_
   the gpa+len crosses a page boundary, which thankfully is guaranteed to not
   happen in the current code base.  Add WARNs in more helpers that read/write
   guest memory to detect similar bugs.
 
 Selftests:
 
 * Fix a goof that caused some Hyper-V tests to be skipped when run on bare
   metal, i.e. NOT in a VM.
 
 * Add a regression test for KVM's handling of SHUTDOWN for an SEV-ES guest.
 
 * Explicitly include one-off assets in .gitignore.  Past Sean was completely
   wrong about not being able to detect missing .gitignore entries.
 
 * Verify userspace single-stepping works when KVM happens to handle a VM-Exit
   in its fastpath.
 
 * Misc cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmb201AUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOM1gf+Ij7dpCh0KwoNYlHfW2aCHAv3PqQd
 cKMDSGxoCernbJEyPO/3qXNUK+p4zKedk3d92snW3mKa+cwxMdfthJ3i9d7uoNiw
 7hAgcfKNHDZGqAQXhx8QcVF3wgp+diXSyirR+h1IKrGtCCmjMdNC8ftSYe6voEkw
 VTVbLL+tER5H0Xo5UKaXbnXKDbQvWLXkdIqM8dtLGFGLQ2PnF/DdMP0p6HYrKf1w
 B7LBu0rvqYDL8/pS82mtR3brHJXxAr9m72fOezRLEUbfUdzkTUi/b1vEe6nDCl0Q
 i/PuFlARDLWuetlR0VVWKNbop/C/l4EmwCcKzFHa+gfNH3L9361Oz+NzBw==
 =Q7kz
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull x86 kvm updates from Paolo Bonzini:
 "x86:

   - KVM currently invalidates the entirety of the page tables, not just
     those for the memslot being touched, when a memslot is moved or
     deleted.

     This does not traditionally have particularly noticeable overhead,
     but Intel's TDX will require the guest to re-accept private pages
     if they are dropped from the secure EPT, which is a non starter.

     Actually, the only reason why this is not already being done is a
     bug which was never fully investigated and caused VM instability
     with assigned GeForce GPUs, so allow userspace to opt into the new
     behavior.

   - Advertise AVX10.1 to userspace (effectively prep work for the
     "real" AVX10 functionality that is on the horizon)

   - Rework common MSR handling code to suppress errors on userspace
     accesses to unsupported-but-advertised MSRs

     This will allow removing (almost?) all of KVM's exemptions for
     userspace access to MSRs that shouldn't exist based on the vCPU
     model (the actual cleanup is non-trivial future work)

   - Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC)
     splits the 64-bit value into the legacy ICR and ICR2 storage,
     whereas Intel (APICv) stores the entire 64-bit value at the ICR
     offset

   - Fix a bug where KVM would fail to exit to userspace if one was
     triggered by a fastpath exit handler

   - Add fastpath handling of HLT VM-Exit to expedite re-entering the
     guest when there's already a pending wake event at the time of the
     exit

   - Fix a WARN caused by RSM entering a nested guest from SMM with
     invalid guest state, by forcing the vCPU out of guest mode prior to
     signalling SHUTDOWN (the SHUTDOWN hits the VM altogether, not the
     nested guest)

   - Overhaul the "unprotect and retry" logic to more precisely identify
     cases where retrying is actually helpful, and to harden all retry
     paths against putting the guest into an infinite retry loop

   - Add support for yielding, e.g. to honor NEED_RESCHED, when zapping
     rmaps in the shadow MMU

   - Refactor pieces of the shadow MMU related to aging SPTEs in
     prepartion for adding multi generation LRU support in KVM

   - Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is
     enabled, i.e. when the CPU has already flushed the RSB

   - Trace the per-CPU host save area as a VMCB pointer to improve
     readability and cleanup the retrieval of the SEV-ES host save area

   - Remove unnecessary accounting of temporary nested VMCB related
     allocations

   - Set FINAL/PAGE in the page fault error code for EPT violations if
     and only if the GVA is valid. If the GVA is NOT valid, there is no
     guest-side page table walk and so stuffing paging related metadata
     is nonsensical

   - Fix a bug where KVM would incorrectly synthesize a nested VM-Exit
     instead of emulating posted interrupt delivery to L2

   - Add a lockdep assertion to detect unsafe accesses of vmcs12
     structures

   - Harden eVMCS loading against an impossible NULL pointer deref
     (really truly should be impossible)

   - Minor SGX fix and a cleanup

   - Misc cleanups

  Generic:

   - Register KVM's cpuhp and syscore callbacks when enabling
     virtualization in hardware, as the sole purpose of said callbacks
     is to disable and re-enable virtualization as needed

   - Enable virtualization when KVM is loaded, not right before the
     first VM is created

     Together with the previous change, this simplifies a lot the logic
     of the callbacks, because their very existence implies
     virtualization is enabled

   - Fix a bug that results in KVM prematurely exiting to userspace for
     coalesced MMIO/PIO in many cases, clean up the related code, and
     add a testcase

   - Fix a bug in kvm_clear_guest() where it would trigger a buffer
     overflow _if_ the gpa+len crosses a page boundary, which thankfully
     is guaranteed to not happen in the current code base. Add WARNs in
     more helpers that read/write guest memory to detect similar bugs

  Selftests:

   - Fix a goof that caused some Hyper-V tests to be skipped when run on
     bare metal, i.e. NOT in a VM

   - Add a regression test for KVM's handling of SHUTDOWN for an SEV-ES
     guest

   - Explicitly include one-off assets in .gitignore. Past Sean was
     completely wrong about not being able to detect missing .gitignore
     entries

   - Verify userspace single-stepping works when KVM happens to handle a
     VM-Exit in its fastpath

   - Misc cleanups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
  Documentation: KVM: fix warning in "make htmldocs"
  s390: Enable KVM_S390_UCONTROL config in debug_defconfig
  selftests: kvm: s390: Add VM run test case
  KVM: SVM: let alternatives handle the cases when RSB filling is required
  KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid
  KVM: x86/mmu: Use KVM_PAGES_PER_HPAGE() instead of an open coded equivalent
  KVM: x86/mmu: Add KVM_RMAP_MANY to replace open coded '1' and '1ul' literals
  KVM: x86/mmu: Fold mmu_spte_age() into kvm_rmap_age_gfn_range()
  KVM: x86/mmu: Morph kvm_handle_gfn_range() into an aging specific helper
  KVM: x86/mmu: Honor NEED_RESCHED when zapping rmaps and blocking is allowed
  KVM: x86/mmu: Add a helper to walk and zap rmaps for a memslot
  KVM: x86/mmu: Plumb a @can_yield parameter into __walk_slot_rmaps()
  KVM: x86/mmu: Move walk_slot_rmaps() up near for_each_slot_rmap_range()
  KVM: x86/mmu: WARN on MMIO cache hit when emulating write-protected gfn
  KVM: x86/mmu: Detect if unprotect will do anything based on invalid_list
  KVM: x86/mmu: Subsume kvm_mmu_unprotect_page() into the and_retry() version
  KVM: x86: Rename reexecute_instruction()=>kvm_unprotect_and_retry_on_failure()
  KVM: x86: Update retry protection fields when forcing retry on emulation failure
  KVM: x86: Apply retry protection to "unprotect on failure" path
  KVM: x86: Check EMULTYPE_WRITE_PF_TO_SP before unprotecting gfn
  ...
2024-09-28 09:20:14 -07:00
Linus Torvalds
5701725692 Rust changes for v6.12
Toolchain and infrastructure:
 
  - Support 'MITIGATION_{RETHUNK,RETPOLINE,SLS}' (which cleans up objtool
    warnings), teach objtool about 'noreturn' Rust symbols and mimic
    '___ADDRESSABLE()' for 'module_{init,exit}'. With that, we should be
    objtool-warning-free, so enable it to run for all Rust object files.
 
  - KASAN (no 'SW_TAGS'), KCFI and shadow call sanitizer support.
 
  - Support 'RUSTC_VERSION', including re-config and re-build on change.
 
  - Split helpers file into several files in a folder, to avoid conflicts
    in it. Eventually those files will be moved to the right places with
    the new build system. In addition, remove the need to manually export
    the symbols defined there, reusing existing machinery for that.
 
  - Relax restriction on configurations with Rust + GCC plugins to just
    the RANDSTRUCT plugin.
 
 'kernel' crate:
 
  - New 'list' module: doubly-linked linked list for use with reference
    counted values, which is heavily used by the upcoming Rust Binder.
    This includes 'ListArc' (a wrapper around 'Arc' that is guaranteed
    unique for the given ID), 'AtomicTracker' (tracks whether a 'ListArc'
    exists using an atomic), 'ListLinks' (the prev/next pointers for an
    item in a linked list), 'List' (the linked list itself), 'Iter' (an
    iterator over a 'List'), 'Cursor' (a cursor into a 'List' that allows
    to remove elements), 'ListArcField' (a field exclusively owned by a
    'ListArc'), as well as support for heterogeneous lists.
 
  - New 'rbtree' module: red-black tree abstractions used by the upcoming
    Rust Binder. This includes 'RBTree' (the red-black tree itself),
    'RBTreeNode' (a node), 'RBTreeNodeReservation' (a memory reservation
    for a node), 'Iter' and 'IterMut' (immutable and mutable iterators),
    'Cursor' (bidirectional cursor that allows to remove elements), as
    well as an entry API similar to the Rust standard library one.
 
  - 'init' module: add 'write_[pin_]init' methods and the 'InPlaceWrite'
    trait. Add the 'assert_pinned!' macro.
 
  - 'sync' module: implement the 'InPlaceInit' trait for 'Arc' by
    introducing an associated type in the trait.
 
  - 'alloc' module: add 'drop_contents' method to 'BoxExt'.
 
  - 'types' module: implement the 'ForeignOwnable' trait for
    'Pin<Box<T>>' and improve the trait's documentation. In addition,
    add the 'into_raw' method to the 'ARef' type.
 
  - 'error' module: in preparation for the upcoming Rust support for
    32-bit architectures, like arm, locally allow Clippy lint for those.
 
 Documentation:
 
  - https://rust.docs.kernel.org has been announced, so link to it.
 
  - Enable rustdoc's "jump to definition" feature, making its output a
    bit closer to the experience in a cross-referencer.
 
  - Debian Testing now also provides recent Rust releases (outside of
    the freeze period), so add it to the list.
 
 MAINTAINERS:
 
  - Trevor is joining as reviewer of the "RUST" entry.
 
 And a few other small bits.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmbzNz4ACgkQGXyLc2ht
 IW3muA/9HcPL0QqVB5+SqSRqcatmrFU/wq8Oaa6Z/No0JaynqyikK+R1WNokUd/5
 WpQi4PC1OYV+ekyAuWdkooKmaSqagH5r53XlezNw+cM5zo8y7p0otVlbepQ0t3Ky
 pVEmfDRIeSFXsKrg91BJUKyJf70TQlgSggDVCExlanfOjPz88C1+s3EcJ/XWYGKQ
 cRk/XDdbF5eNaldp2MriVF0fw7XktgIrmVzxt/z0lb4PE7RaCAnO6gSQI+90Vb2d
 zvyOYKS4AkqE3suFvDIIUlPUv+8XbACj0c4wvBZHH5uZGTbgWUffqygJ45GqChEt
 c4fS/+E8VaM1z0EvxNczC0nQkfLwkTc1mgbP+sG3VZJMPVCJ2zQan1/ond7GqCpw
 pt6uQaGvDsAvllm7sbiAIVaAY81icqyYWKfNBXLLEL7DhY5je5Wq+E83XQ8d5u5F
 EuapnZhW3y12d6UCsSe9bD8W45NFoWHPXky1TzT+whTxnX1yH9YsPXbJceGSbbgd
 Lw3GmUtZx2bVAMToVjNFD2lPA3OmPY1e2lk0jwzTuQrEXfnZYuzbjqs3YUijb7xR
 AlsWfIb0IHBwHWpB7da24ezqWP2VD4eaDdD8/+LmDSj6XLngxMNWRLKmXT000eTW
 vIFP9GJrvag2R3YFPhrurgGpRsp8HUTLtvcZROxp2JVQGQ7Z4Ww=
 =52BN
 -----END PGP SIGNATURE-----

Merge tag 'rust-6.12' of https://github.com/Rust-for-Linux/linux

Pull Rust updates from Miguel Ojeda:
 "Toolchain and infrastructure:

   - Support 'MITIGATION_{RETHUNK,RETPOLINE,SLS}' (which cleans up
     objtool warnings), teach objtool about 'noreturn' Rust symbols and
     mimic '___ADDRESSABLE()' for 'module_{init,exit}'. With that, we
     should be objtool-warning-free, so enable it to run for all Rust
     object files.

   - KASAN (no 'SW_TAGS'), KCFI and shadow call sanitizer support.

   - Support 'RUSTC_VERSION', including re-config and re-build on
     change.

   - Split helpers file into several files in a folder, to avoid
     conflicts in it. Eventually those files will be moved to the right
     places with the new build system. In addition, remove the need to
     manually export the symbols defined there, reusing existing
     machinery for that.

   - Relax restriction on configurations with Rust + GCC plugins to just
     the RANDSTRUCT plugin.

  'kernel' crate:

   - New 'list' module: doubly-linked linked list for use with reference
     counted values, which is heavily used by the upcoming Rust Binder.

     This includes 'ListArc' (a wrapper around 'Arc' that is guaranteed
     unique for the given ID), 'AtomicTracker' (tracks whether a
     'ListArc' exists using an atomic), 'ListLinks' (the prev/next
     pointers for an item in a linked list), 'List' (the linked list
     itself), 'Iter' (an iterator over a 'List'), 'Cursor' (a cursor
     into a 'List' that allows to remove elements), 'ListArcField' (a
     field exclusively owned by a 'ListArc'), as well as support for
     heterogeneous lists.

   - New 'rbtree' module: red-black tree abstractions used by the
     upcoming Rust Binder.

     This includes 'RBTree' (the red-black tree itself), 'RBTreeNode' (a
     node), 'RBTreeNodeReservation' (a memory reservation for a node),
     'Iter' and 'IterMut' (immutable and mutable iterators), 'Cursor'
     (bidirectional cursor that allows to remove elements), as well as
     an entry API similar to the Rust standard library one.

   - 'init' module: add 'write_[pin_]init' methods and the
     'InPlaceWrite' trait. Add the 'assert_pinned!' macro.

   - 'sync' module: implement the 'InPlaceInit' trait for 'Arc' by
     introducing an associated type in the trait.

   - 'alloc' module: add 'drop_contents' method to 'BoxExt'.

   - 'types' module: implement the 'ForeignOwnable' trait for
     'Pin<Box<T>>' and improve the trait's documentation. In addition,
     add the 'into_raw' method to the 'ARef' type.

   - 'error' module: in preparation for the upcoming Rust support for
     32-bit architectures, like arm, locally allow Clippy lint for
     those.

  Documentation:

   - https://rust.docs.kernel.org has been announced, so link to it.

   - Enable rustdoc's "jump to definition" feature, making its output a
     bit closer to the experience in a cross-referencer.

   - Debian Testing now also provides recent Rust releases (outside of
     the freeze period), so add it to the list.

  MAINTAINERS:

   - Trevor is joining as reviewer of the "RUST" entry.

  And a few other small bits"

* tag 'rust-6.12' of https://github.com/Rust-for-Linux/linux: (54 commits)
  kasan: rust: Add KASAN smoke test via UAF
  kbuild: rust: Enable KASAN support
  rust: kasan: Rust does not support KHWASAN
  kbuild: rust: Define probing macros for rustc
  kasan: simplify and clarify Makefile
  rust: cfi: add support for CFI_CLANG with Rust
  cfi: add CONFIG_CFI_ICALL_NORMALIZE_INTEGERS
  rust: support for shadow call stack sanitizer
  docs: rust: include other expressions in conditional compilation section
  kbuild: rust: replace proc macros dependency on `core.o` with the version text
  kbuild: rust: rebuild if the version text changes
  kbuild: rust: re-run Kconfig if the version text changes
  kbuild: rust: add `CONFIG_RUSTC_VERSION`
  rust: avoid `box_uninit_write` feature
  MAINTAINERS: add Trevor Gross as Rust reviewer
  rust: rbtree: add `RBTree::entry`
  rust: rbtree: add cursor
  rust: rbtree: add mutable iterator
  rust: rbtree: add iterator
  rust: rbtree: add red-black tree implementation backed by the C version
  ...
2024-09-25 10:25:40 -07:00
Linus Torvalds
97d8894b6f RISC-V Patches for the 6.12 Merge Window, Part 1
* Support for using Zkr to seed KASLR.
 * Support for IPI-triggered CPU backtracing.
 * Support for generic CPU vulnerabilities reporting to userspace.
 * A few cleanups for missing licenses.
 * The size limit on the XIP kernel has been removed.
 * Support for tracing userspace stacks.
 * Support for the Svvptc extension.
 * Various cleanups and fixes throughout the tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmbykZITHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiTDLEACABTPlMX+Ke1lMYWj6MLBTXMSlQJ6w
 TKLVk3slp4POPsd+ViWy4J6EJDpCkNp6iK30Bv4AGainA3RJnbS7neKYy+MTw0ZV
 pJWTn73sxHeF+APnZ+geFYcmyteL/SZplgHgwLipFaojs7i/+tOvLFRRD3LueCz6
 Q6Muvke9R5uN6LL3tUrmIhJeyZjOwJtKEYoRz6Fo5Tq3RRK0VHYZkdpqRQ86rr9U
 yNbRNlBLlANstq8xjtiwm22ImPGIsvvhs0AvFft0H72zhf3Tfy0VcTLlDJYwkdKN
 Bz59v4N4jg1ajXuAsj4/BQdj5uRkzJNZ3Yq1yvMmGI+2UV1tInHEMeZi63+4gs8F
 0FYCziA+j6/08u8v8CrwdDOpJ9Iq/NMV6359bt5ySu3rM6QnPRGnHGkv4Ptk9Oyh
 sSPiuHS8YxpRBOB8xXNp5xFipTZ4+65VqCz253pAsbbp5aZ/o9Jw/bbBFFWcSsRZ
 tV2xhELVzPiC7dowfYkQhcNZOLlCaJ/Mvo2nMOtjbUwzaXkcjIRcYEy7+dTlfXyr
 3h5sStjWAKEmtLEvCYSI+lAT544tbV1x+lCMJGuvztapsTMtAP4GpQKblXl5qqnV
 L+VWIPJV1elI26H/p/Max9Z1s48NtfwoJSRL7Rnaya6uJ3iWG/BtajHlFbvgOkJf
 ObPZ//Yr9e91gg==
 =zDwL
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support using Zkr to seed KASLR

 - Support IPI-triggered CPU backtracing

 - Support for generic CPU vulnerabilities reporting to userspace

 - A few cleanups for missing licenses

 - The size limit on the XIP kernel has been removed

 - Support for tracing userspace stacks

 - Support for the Svvptc extension

 - Various cleanups and fixes throughout the tree

* tag 'riscv-for-linus-6.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (47 commits)
  crash: Fix riscv64 crash memory reserve dead loop
  perf/riscv-sbi: Add platform specific firmware event handling
  tools: Optimize ring buffer for riscv
  tools: Add riscv barrier implementation
  RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t
  ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
  riscv: Enable bitops instrumentation
  riscv: Omit optimized string routines when using KASAN
  ACPI: RISCV: Make acpi_numa_get_nid() to be static
  riscv: Randomize lower bits of stack address
  selftests: riscv: Allow mmap test to compile on 32-bit
  riscv: Make riscv_isa_vendor_ext_andes array static
  riscv: Use LIST_HEAD() to simplify code
  riscv: defconfig: Disable RZ/Five peripheral support
  RISC-V: Implement kgdb_roundup_cpus() to enable future NMI Roundup
  riscv: avoid Imbalance in RAS
  riscv: cacheinfo: Add back init_cache_level() function
  riscv: Remove unused _TIF_WORK_MASK
  drivers/perf: riscv: Remove redundant macro check
  riscv: define ILLEGAL_POINTER_VALUE for 64bit
  ...
2024-09-24 10:59:17 -07:00
Linus Torvalds
4e2c9cd7dc i2c-for-6.12-rc1
I2C core
 ========
 
 After 15 years of deprecation, the I2C_COMPAT symbol has finally been
 removed. Also client addresses are now locked during initialization to
 prevent race conditions between different kinds of instantiation. Scoped
 foreach OF child loops are now used. And the testunit has received some
 cleanups and documentation improvements as well as two new tests, one
 for repeated start and one for triggering SMBusAlert interrupts.
 
 I2C host drivers
 ================
 
 The DesignWare and the Renesas I2C drivers have received most of
 the changes in this pull request.
 
 The first has has undergone through a series of cleanups that
 have been sent to the mailing list a year ago for the first time
 and finally get merged in this pull request. They are many, from
 typos (e.g. i2/i2c), to cosmetics, to refactoring (e.g. move
 inline functions to librarieas) and many others.
 
 Besides that, all the DesignWare Kconfig options have been
 grouped under the I2C_DESIGNWARE_CORE and this required some
 adaptation in many of the kernel configuration files for
 different arm and mips boards.
 
 Follows the list of the rest of the changes grouped by type of
 change.
 
 Cleanups
 --------
 The Qualcomm Geni platform improves the exit path in the runtime
 resume function.
 
 The Intel LJCA driver loses "target_addr" parameter in
 ljca_i2c_stop() because it was unused.
 
 The MediaTek controller intializes the restart_flag in the
 transfer function using the ternary conditional operator ("? :")
 instead of initializing it in different parts.
 
 Constified a few global data structures in the virtio driver.
 
 The Renesas driver simplifies the bus speed handling in the init
 function making it more readable.
 
 Improved an if/else statement in probe function of the Renesas
 R-Car driver.
 
 The iMX/MXC driver switches to using the RUNTIME_PM_OPS() instead
 of SET_RUNTIME_PM_OPS().
 
 Still in the iMX/MXC driver a comma ',' has been replaced by a
 semicolon ';', while in different drivers the ',' has been
 removed from the '{ }' delimiters.
 
 Finally three devm_clk_get_enabled() have been used to simplify
 the devm_clk_get/clk_prepare_enable tuple in the Renesas EMEV2,
 Ingenic and MPC drivers.
 
 Refactors
 ---------
 The Nuvoton fixes a potential out of boundary array access. This
 is not a bug fix because the issue could never occur due to
 hardware not having the properties listed in the array. The
 change makes the driver more future proof and, at the same time,
 silences code analyzers.
 
 Improvements
 ------------
 The Renesas I2C (riic) driver undergoes several patches improving
 the runtime power management handling.
 
 The Intel i801 driver uses a more descriptive adapter's name to
 show the presence of the IDF feature.
 
 In the Intel Denverton (ismt) adapter the pending transactions
 are killed when irq's can't complete their handling, triggering a
 timeout. This could have been considered as a bug fix, but
 because, standing to Vasily, it's very sporadic, I preferred
 considering the patch rather as an improvement.
 
 New Feature
 -----------
 The Renesas I2C (riic) driver now supports the fast mode plus.
 
 New support
 -----------
 Added support for:
 
     - Renesas R9A08G045
     - Rockchip RK3576
     - KEBA I2C
     - Theobroma Systems Mule Multiplexer.
 
 The Keba comes with a new driver, i2c-keba.c.
 The Mule is an i2c multiplexer and it also comes with a new
 driver, mux/i2c-mux-mule.c.
 
 Core patch
 ----------
 This pull request includes also a patch in the I2C framework, in
 i2c-core-base.c where the runtime PM functions have been replaced
 in order to allow to be accessed during the device add.
 
 Devicetree
 ----------
 Some cleanups in the devicetree, as well. nVidia and Qualcomm
 bindings improve their "if:then:" blocks. While the aspeed
 binding loses the "multi-master" property because it was
 redundant.
 
 The i2c-sprd binding has been converted to YAML.
 
 AT24 updates
 ============
 
 - document a new model from giantec in DT bindings
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEOZGx6rniZ1Gk92RdFA3kzBSgKbYFAmbxGRsACgkQFA3kzBSg
 KbY2IhAApRRZHNmxxcMRmxbDnbNweJdRGbtADEdkLMDui8oeK9SJHgScD0bR308p
 pkHEvyOg7UiW0N4wkaDK0YpORIrGSHys2DqIC05/OLPLONz2Ry/qKuNkisZBxo6l
 oP9uVKDMuCHgZq7xsxqupmpefMJ0m8XBGoMKPLbyBMu7ga4vB8o4uEQZfQLKs3YP
 mFm4plZvECCVPgJ5/bp43cFFmhfPLTd088k9/XzFwB730uXPO6VsBuaYzQ7tMOR3
 NQCmh/8sFJmVlJvkTnQ5QRNTo2zn+hNmjV1avFJwo5lqz35TmfpVR/+TjYPwi9v6
 7H5KjHrIxQHmeaLwm94wOuJSriFzQ3DUQkxvH7vRxXDef+6nTRdD6xC+zxePKLXo
 R4dYslP+5yXvtPYHonJUTXXZkfug58iO7W6Isc/5ody1y4FD22daTG5HXuWRlaAP
 7O0kiyQmrwy5IZCqpwVPBJ7f+dpZzpCVP0OyXeHVXyK61rZT4zG9FvEiLQYjmYOn
 MOSbddFm5yQRu+OB8GVmYKlVlCG0S+Y11fFMCO/yJZQJqLXZm2AbonwB8sz0OqZu
 4zDgXg+z3Xy/Go6/FFfjltoWq/9dYzzFzUi0oB6rm0U/pFawtCtAYhfCodzJDZI4
 QvzALJuFWwQZjGNmqDVfYABcX8wFxE8zAteQy+htu0Fn7qSV7GU=
 =DTNY
 -----END PGP SIGNATURE-----

Merge tag 'i2c-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c updates from Wolfram Sang:
 "I2C core:

   - finally remove the I2C_COMPAT symbol after 15 years of deprecation

   - lock client addresses during initialization to prevent race
     conditions between different kinds of instantiation

   - use scoped foreach OF child loops

   - testunit cleanups and documentation improvements, as well as two
     new tests, one for repeated start and one for triggering SMBusAlert
     interrupts

  I2C host drivers:

   - DesignWare and Renesas I2C driver updates.

     The first has has undergone through a series of cleanups that have
     been sent to the mailing list a year ago for the first time and
     finally get merged in this pull request. They are many, from typos
     (e.g. i2/i2c), to cosmetics, to refactoring (e.g. move inline
     functions to librarieas) and many others.

   - all the DesignWare Kconfig options have been grouped under the
     I2C_DESIGNWARE_CORE and this required some adaptation in many of
     the kernel configuration files for different arm and mips boards

  Cleanups:

   - improve the exit path in the runtime resume function for the
     Qualcomm Geni platform

   - get rid of the unused "target_addr" parameter in the Intel LJCA
     driver

   - intialize the restart_flag in the MediaTek controller in one single
     place

   - constify a few global data structures in the virtio driver

   - simplify the bus speed handling in the Renesas driver init function
     making it more readable

   - improved probe function of the Renesas R-Car driver

   - switch the iMX/MXC driver to use RUNTIME_PM_OPS() instead of
     SET_RUNTIME_PM_OPS()

   - iMX/MXC driver cleanups

   - use devm_clk_get_enabled() to simplify the Renesas EMEV2, Ingenic
     and MPC drivers

  Refactoring:

   - Fix a potential out of boundary array access in the Nuvoton driver.

     This is not a bug fix because the issue could never occur due to
     hardware not having the properties listed in the array. The change
     makes the driver more future proof and, at the same time, silences
     code analyzers.

  Improvements:

   - several patches improving the runtime power management handling of
     the Renesas I2C (riic) driver

   - use a more descriptive adapter name in the Intel i801 driver to
     show the presence of the IDF feature

   - kill pending transactions when irq's can't complete their handling
     in the Intel Denverton (ismt) driver, triggering a timeout

  New Feature:

   - support fast mode plus in the Renesas I2C (riic) driver

  New support:

   - Added support for:
      - Renesas R9A08G045
      - Rockchip RK3576
      - KEBA I2C
      - Theobroma Systems Mule Multiplexer.

   - new i2c-keba.c driver

   - new driver for The Mule i2c multiplexer

  Core I2C framework:

   - move runtime PM functions in order to allow them to be accessed
     during device add

  Devicetree:

   - nVidia and Qualcomm binding improvements

   - get rid of redundant "multi-master" property in the aspeed binding

   - convert i2c-sprd binding to YAML

  AT24 updates:

  - document a new model from giantec in DT bindings"

* tag 'i2c-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (69 commits)
  i2c: designware: Use pci_get_drvdata()
  i2c: designware: Propagate firmware node
  i2c: designware: Uninline i2c_dw_probe()
  i2c: ljca: Remove unused "target_addr" parameter
  i2c: keba: Add KEBA I2C controller support
  i2c: i801: Use a different adapter-name for IDF adapters
  i2c: core: Setup i2c_adapter runtime-pm before calling device_add()
  dt-bindings: i2c: i2c-sprd: convert to YAML
  i2c: ismt: kill transaction in hardware on timeout
  i2c: designware: Group all DesignWare drivers under a single option
  net: txgbe: Fix I2C Kconfig dependencies
  RISC-V: configs: enable I2C_DESIGNWARE_CORE with I2C_DESIGNWARE_PLATFORM
  mips: configs: enable I2C_DESIGNWARE_CORE with I2C_DESIGNWARE_PLATFORM
  arm64: defconfig: enable I2C_DESIGNWARE_CORE with I2C_DESIGNWARE_PLATFORM
  ARM: configs: enable I2C_DESIGNWARE_CORE with I2C_DESIGNWARE_PLATFORM
  ARC: configs: enable I2C_DESIGNWARE_CORE with I2C_DESIGNWARE_PLATFORM
  i2c: virtio: Constify struct i2c_algorithm and struct virtio_device_id
  i2c: rcar: tidyup priv->devtype handling on rcar_i2c_probe()
  i2c: imx: Convert comma to semicolon
  i2c: jz4780: Use devm_clk_get_enabled() helpers
  ...
2024-09-23 14:34:19 -07:00
Linus Torvalds
7856a56541 Many singleton patches - please see the various changelogs for details.
Quite a lot of nilfs2 work this time around.
 
 Notable patch series in this pull request are:
 
 "mul_u64_u64_div_u64: new implementation" by Nicolas Pitre, with
 assistance from Uwe Kleine-König.  Reimplement mul_u64_u64_div_u64() to
 provide (much) more accurate results.  The current implementation was
 causing Uwe some issues in the PWM drivers.
 
 "xz: Updates to license, filters, and compression options" from Lasse
 Collin.  Miscellaneous maintenance and kinor feature work to the xz
 decompressor.
 
 "Fix some GDB command error and add some GDB commands" from Kuan-Ying Lee.
 Fixes and enhancements to the gdb scripts.
 
 "treewide: add missing MODULE_DESCRIPTION() macros" from Jeff Johnson.
 Adds lots of MODULE_DESCRIPTIONs, thus fixing lots of warnings about this.
 
 "nilfs2: add support for some common ioctls" from Ryusuke Konishi.  Adds
 various commonly-available ioctls to nilfs2.
 
 "This series fixes a number of formatting issues in kernel doc comments"
 from Ryusuke Konishi does that.
 
 "nilfs2: prevent unexpected ENOENT propagation" from Ryusuke Konishi.  Fix
 issues where -ENOENT was being unintentionally and inappropriately
 returned to userspace.
 
 "nilfs2: assorted cleanups" from Huang Xiaojia.
 
 "nilfs2: fix potential issues with empty b-tree nodes" from Ryusuke
 Konishi fixes some issues which can occur on corrupted nilfs2 filesystems.
 
 "scripts/decode_stacktrace.sh: improve error reporting and usability" from
 Luca Ceresoli does those things.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZu7dpAAKCRDdBJ7gKXxA
 jsPqAPwMDEZyKlfSw7QioEHNHDkmkbP7VYCYR0CbUnppbztwpAD8D37aVbWQ+UzM
 3nnOq3W2Pc2o/20zqi8Upf1mnvUrygQ=
 =/NWE
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-09-21-07-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "Many singleton patches - please see the various changelogs for
  details.

  Quite a lot of nilfs2 work this time around.

  Notable patch series in this pull request are:

   - "mul_u64_u64_div_u64: new implementation" by Nicolas Pitre, with
     assistance from Uwe Kleine-König. Reimplement mul_u64_u64_div_u64()
     to provide (much) more accurate results. The current implementation
     was causing Uwe some issues in the PWM drivers.

   - "xz: Updates to license, filters, and compression options" from
     Lasse Collin. Miscellaneous maintenance and kinor feature work to
     the xz decompressor.

   - "Fix some GDB command error and add some GDB commands" from
     Kuan-Ying Lee. Fixes and enhancements to the gdb scripts.

   - "treewide: add missing MODULE_DESCRIPTION() macros" from Jeff
     Johnson. Adds lots of MODULE_DESCRIPTIONs, thus fixing lots of
     warnings about this.

   - "nilfs2: add support for some common ioctls" from Ryusuke Konishi.
     Adds various commonly-available ioctls to nilfs2.

   - "This series fixes a number of formatting issues in kernel doc
     comments" from Ryusuke Konishi does that.

   - "nilfs2: prevent unexpected ENOENT propagation" from Ryusuke
     Konishi. Fix issues where -ENOENT was being unintentionally and
     inappropriately returned to userspace.

   - "nilfs2: assorted cleanups" from Huang Xiaojia.

   - "nilfs2: fix potential issues with empty b-tree nodes" from Ryusuke
     Konishi fixes some issues which can occur on corrupted nilfs2
     filesystems.

   - "scripts/decode_stacktrace.sh: improve error reporting and
     usability" from Luca Ceresoli does those things"

* tag 'mm-nonmm-stable-2024-09-21-07-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (103 commits)
  list: test: increase coverage of list_test_list_replace*()
  list: test: fix tests for list_cut_position()
  proc: use __auto_type more
  treewide: correct the typo 'retun'
  ocfs2: cleanup return value and mlog in ocfs2_global_read_info()
  nilfs2: remove duplicate 'unlikely()' usage
  nilfs2: fix potential oob read in nilfs_btree_check_delete()
  nilfs2: determine empty node blocks as corrupted
  nilfs2: fix potential null-ptr-deref in nilfs_btree_insert()
  user_namespace: use kmemdup_array() instead of kmemdup() for multiple allocation
  tools/mm: rm thp_swap_allocator_test when make clean
  squashfs: fix percpu address space issues in decompressor_multi_percpu.c
  lib: glob.c: added null check for character class
  nilfs2: refactor nilfs_segctor_thread()
  nilfs2: use kthread_create and kthread_stop for the log writer thread
  nilfs2: remove sc_timer_task
  nilfs2: do not repair reserved inode bitmap in nilfs_new_inode()
  nilfs2: eliminate the shared counter and spinlock for i_generation
  nilfs2: separate inode type information from i_state field
  nilfs2: use the BITS_PER_LONG macro
  ...
2024-09-21 08:20:50 -07:00
Linus Torvalds
617a814f14 ALong with the usual shower of singleton patches, notable patch series in
this pull request are:
 
 "Align kvrealloc() with krealloc()" from Danilo Krummrich.  Adds
 consistency to the APIs and behaviour of these two core allocation
 functions.  This also simplifies/enables Rustification.
 
 "Some cleanups for shmem" from Baolin Wang.  No functional changes - mode
 code reuse, better function naming, logic simplifications.
 
 "mm: some small page fault cleanups" from Josef Bacik.  No functional
 changes - code cleanups only.
 
 "Various memory tiering fixes" from Zi Yan.  A small fix and a little
 cleanup.
 
 "mm/swap: remove boilerplate" from Yu Zhao.  Code cleanups and
 simplifications and .text shrinkage.
 
 "Kernel stack usage histogram" from Pasha Tatashin and Shakeel Butt.  This
 is a feature, it adds new feilds to /proc/vmstat such as
 
     $ grep kstack /proc/vmstat
     kstack_1k 3
     kstack_2k 188
     kstack_4k 11391
     kstack_8k 243
     kstack_16k 0
 
 which tells us that 11391 processes used 4k of stack while none at all
 used 16k.  Useful for some system tuning things, but partivularly useful
 for "the dynamic kernel stack project".
 
 "kmemleak: support for percpu memory leak detect" from Pavel Tikhomirov.
 Teaches kmemleak to detect leaksage of percpu memory.
 
 "mm: memcg: page counters optimizations" from Roman Gushchin.  "3
 independent small optimizations of page counters".
 
 "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from David
 Hildenbrand.  Improves PTE/PMD splitlock detection, makes powerpc/8xx work
 correctly by design rather than by accident.
 
 "mm: remove arch_make_page_accessible()" from David Hildenbrand.  Some
 folio conversions which make arch_make_page_accessible() unneeded.
 
 "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David Finkel.
 Cleans up and fixes our handling of the resetting of the cgroup/process
 peak-memory-use detector.
 
 "Make core VMA operations internal and testable" from Lorenzo Stoakes.
 Rationalizaion and encapsulation of the VMA manipulation APIs.  With a
 view to better enable testing of the VMA functions, even from a
 userspace-only harness.
 
 "mm: zswap: fixes for global shrinker" from Takero Funaki.  Fix issues in
 the zswap global shrinker, resulting in improved performance.
 
 "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao.  Fill in
 some missing info in /proc/zoneinfo.
 
 "mm: replace follow_page() by folio_walk" from David Hildenbrand.  Code
 cleanups and rationalizations (conversion to folio_walk()) resulting in
 the removal of follow_page().
 
 "improving dynamic zswap shrinker protection scheme" from Nhat Pham.  Some
 tuning to improve zswap's dynamic shrinker.  Significant reductions in
 swapin and improvements in performance are shown.
 
 "mm: Fix several issues with unaccepted memory" from Kirill Shutemov.
 Improvements to the new unaccepted memory feature,
 
 "mm/mprotect: Fix dax puds" from Peter Xu.  Implements mprotect on DAX
 PUDs.  This was missing, although nobody seems to have notied yet.
 
 "Introduce a store type enum for the Maple tree" from Sidhartha Kumar.
 Cleanups and modest performance improvements for the maple tree library
 code.
 
 "memcg: further decouple v1 code from v2" from Shakeel Butt.  Move more
 cgroup v1 remnants away from the v2 memcg code.
 
 "memcg: initiate deprecation of v1 features" from Shakeel Butt.  Adds
 various warnings telling users that memcg v1 features are deprecated.
 
 "mm: swap: mTHP swap allocator base on swap cluster order" from Chris Li.
 Greatly improves the success rate of the mTHP swap allocation.
 
 "mm: introduce numa_memblks" from Mike Rapoport.  Moves various disparate
 per-arch implementations of numa_memblk code into generic code.
 
 "mm: batch free swaps for zap_pte_range()" from Barry Song.  Greatly
 improves the performance of munmap() of swap-filled ptes.
 
 "support large folio swap-out and swap-in for shmem" from Baolin Wang.
 With this series we no longer split shmem large folios into simgle-page
 folios when swapping out shmem.
 
 "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao.  Nice performance
 improvements and code reductions for gigantic folios.
 
 "support shmem mTHP collapse" from Baolin Wang.  Adds support for
 khugepaged's collapsing of shmem mTHP folios.
 
 "mm: Optimize mseal checks" from Pedro Falcato.  Fixes an mprotect()
 performance regression due to the addition of mseal().
 
 "Increase the number of bits available in page_type" from Matthew Wilcox.
 Increases the number of bits available in page_type!
 
 "Simplify the page flags a little" from Matthew Wilcox.  Many legacy page
 flags are now folio flags, so the page-based flags and their
 accessors/mutators can be removed.
 
 "mm: store zero pages to be swapped out in a bitmap" from Usama Arif.  An
 optimization which permits us to avoid writing/reading zero-filled zswap
 pages to backing store.
 
 "Avoid MAP_FIXED gap exposure" from Liam Howlett.  Fixes a race window
 which occurs when a MAP_FIXED operqtion is occurring during an unrelated
 vma tree walk.
 
 "mm: remove vma_merge()" from Lorenzo Stoakes.  Major rotorooting of the
 vma_merge() functionality, making ot cleaner, more testable and better
 tested.
 
 "misc fixups for DAMON {self,kunit} tests" from SeongJae Park.  Minor
 fixups of DAMON selftests and kunit tests.
 
 "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang.  Code
 cleanups and folio conversions.
 
 "Shmem mTHP controls and stats improvements" from Ryan Roberts.  Cleanups
 for shmem controls and stats.
 
 "mm: count the number of anonymous THPs per size" from Barry Song.  Expose
 additional anon THP stats to userspace for improved tuning.
 
 "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more folio
 conversions and removal of now-unused page-based APIs.
 
 "replace per-quota region priorities histogram buffer with per-context
 one" from SeongJae Park.  DAMON histogram rationalization.
 
 "Docs/damon: update GitHub repo URLs and maintainer-profile" from SeongJae
 Park.  DAMON documentation updates.
 
 "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve
 related doc and warn" from Jason Wang: fixes usage of page allocator
 __GFP_NOFAIL and GFP_ATOMIC flags.
 
 "mm: split underused THPs" from Yu Zhao.  Improve THP=always policy - this
 was overprovisioning THPs in sparsely accessed memory areas.
 
 "zram: introduce custom comp backends API" frm Sergey Senozhatsky.  Add
 support for zram run-time compression algorithm tuning.
 
 "mm: Care about shadow stack guard gap when getting an unmapped area" from
 Mark Brown.  Fix up the various arch_get_unmapped_area() implementations
 to better respect guard areas.
 
 "Improve mem_cgroup_iter()" from Kinsey Ho.  Improve the reliability of
 mem_cgroup_iter() and various code cleanups.
 
 "mm: Support huge pfnmaps" from Peter Xu.  Extends the usage of huge
 pfnmap support.
 
 "resource: Fix region_intersects() vs add_memory_driver_managed()" from
 Huang Ying.  Fix a bug in region_intersects() for systems with CXL memory.
 
 "mm: hwpoison: two more poison recovery" from Kefeng Wang.  Teaches a
 couple more code paths to correctly recover from the encountering of
 poisoned memry.
 
 "mm: enable large folios swap-in support" from Barry Song.  Support the
 swapin of mTHP memory into appropriately-sized folios, rather than into
 single-page folios.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZu1BBwAKCRDdBJ7gKXxA
 jlWNAQDYlqQLun7bgsAN4sSvi27VUuWv1q70jlMXTfmjJAvQqwD/fBFVR6IOOiw7
 AkDbKWP2k0hWPiNJBGwoqxdHHx09Xgo=
 =s0T+
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Along with the usual shower of singleton patches, notable patch series
  in this pull request are:

   - "Align kvrealloc() with krealloc()" from Danilo Krummrich. Adds
     consistency to the APIs and behaviour of these two core allocation
     functions. This also simplifies/enables Rustification.

   - "Some cleanups for shmem" from Baolin Wang. No functional changes -
     mode code reuse, better function naming, logic simplifications.

   - "mm: some small page fault cleanups" from Josef Bacik. No
     functional changes - code cleanups only.

   - "Various memory tiering fixes" from Zi Yan. A small fix and a
     little cleanup.

   - "mm/swap: remove boilerplate" from Yu Zhao. Code cleanups and
     simplifications and .text shrinkage.

   - "Kernel stack usage histogram" from Pasha Tatashin and Shakeel
     Butt. This is a feature, it adds new feilds to /proc/vmstat such as

       $ grep kstack /proc/vmstat
       kstack_1k 3
       kstack_2k 188
       kstack_4k 11391
       kstack_8k 243
       kstack_16k 0

     which tells us that 11391 processes used 4k of stack while none at
     all used 16k. Useful for some system tuning things, but
     partivularly useful for "the dynamic kernel stack project".

   - "kmemleak: support for percpu memory leak detect" from Pavel
     Tikhomirov. Teaches kmemleak to detect leaksage of percpu memory.

   - "mm: memcg: page counters optimizations" from Roman Gushchin. "3
     independent small optimizations of page counters".

   - "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from
     David Hildenbrand. Improves PTE/PMD splitlock detection, makes
     powerpc/8xx work correctly by design rather than by accident.

   - "mm: remove arch_make_page_accessible()" from David Hildenbrand.
     Some folio conversions which make arch_make_page_accessible()
     unneeded.

   - "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David
     Finkel. Cleans up and fixes our handling of the resetting of the
     cgroup/process peak-memory-use detector.

   - "Make core VMA operations internal and testable" from Lorenzo
     Stoakes. Rationalizaion and encapsulation of the VMA manipulation
     APIs. With a view to better enable testing of the VMA functions,
     even from a userspace-only harness.

   - "mm: zswap: fixes for global shrinker" from Takero Funaki. Fix
     issues in the zswap global shrinker, resulting in improved
     performance.

   - "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao. Fill
     in some missing info in /proc/zoneinfo.

   - "mm: replace follow_page() by folio_walk" from David Hildenbrand.
     Code cleanups and rationalizations (conversion to folio_walk())
     resulting in the removal of follow_page().

   - "improving dynamic zswap shrinker protection scheme" from Nhat
     Pham. Some tuning to improve zswap's dynamic shrinker. Significant
     reductions in swapin and improvements in performance are shown.

   - "mm: Fix several issues with unaccepted memory" from Kirill
     Shutemov. Improvements to the new unaccepted memory feature,

   - "mm/mprotect: Fix dax puds" from Peter Xu. Implements mprotect on
     DAX PUDs. This was missing, although nobody seems to have notied
     yet.

   - "Introduce a store type enum for the Maple tree" from Sidhartha
     Kumar. Cleanups and modest performance improvements for the maple
     tree library code.

   - "memcg: further decouple v1 code from v2" from Shakeel Butt. Move
     more cgroup v1 remnants away from the v2 memcg code.

   - "memcg: initiate deprecation of v1 features" from Shakeel Butt.
     Adds various warnings telling users that memcg v1 features are
     deprecated.

   - "mm: swap: mTHP swap allocator base on swap cluster order" from
     Chris Li. Greatly improves the success rate of the mTHP swap
     allocation.

   - "mm: introduce numa_memblks" from Mike Rapoport. Moves various
     disparate per-arch implementations of numa_memblk code into generic
     code.

   - "mm: batch free swaps for zap_pte_range()" from Barry Song. Greatly
     improves the performance of munmap() of swap-filled ptes.

   - "support large folio swap-out and swap-in for shmem" from Baolin
     Wang. With this series we no longer split shmem large folios into
     simgle-page folios when swapping out shmem.

   - "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao. Nice
     performance improvements and code reductions for gigantic folios.

   - "support shmem mTHP collapse" from Baolin Wang. Adds support for
     khugepaged's collapsing of shmem mTHP folios.

   - "mm: Optimize mseal checks" from Pedro Falcato. Fixes an mprotect()
     performance regression due to the addition of mseal().

   - "Increase the number of bits available in page_type" from Matthew
     Wilcox. Increases the number of bits available in page_type!

   - "Simplify the page flags a little" from Matthew Wilcox. Many legacy
     page flags are now folio flags, so the page-based flags and their
     accessors/mutators can be removed.

   - "mm: store zero pages to be swapped out in a bitmap" from Usama
     Arif. An optimization which permits us to avoid writing/reading
     zero-filled zswap pages to backing store.

   - "Avoid MAP_FIXED gap exposure" from Liam Howlett. Fixes a race
     window which occurs when a MAP_FIXED operqtion is occurring during
     an unrelated vma tree walk.

   - "mm: remove vma_merge()" from Lorenzo Stoakes. Major rotorooting of
     the vma_merge() functionality, making ot cleaner, more testable and
     better tested.

   - "misc fixups for DAMON {self,kunit} tests" from SeongJae Park.
     Minor fixups of DAMON selftests and kunit tests.

   - "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang.
     Code cleanups and folio conversions.

   - "Shmem mTHP controls and stats improvements" from Ryan Roberts.
     Cleanups for shmem controls and stats.

   - "mm: count the number of anonymous THPs per size" from Barry Song.
     Expose additional anon THP stats to userspace for improved tuning.

   - "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more
     folio conversions and removal of now-unused page-based APIs.

   - "replace per-quota region priorities histogram buffer with
     per-context one" from SeongJae Park. DAMON histogram
     rationalization.

   - "Docs/damon: update GitHub repo URLs and maintainer-profile" from
     SeongJae Park. DAMON documentation updates.

   - "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and
     improve related doc and warn" from Jason Wang: fixes usage of page
     allocator __GFP_NOFAIL and GFP_ATOMIC flags.

   - "mm: split underused THPs" from Yu Zhao. Improve THP=always policy.
     This was overprovisioning THPs in sparsely accessed memory areas.

   - "zram: introduce custom comp backends API" frm Sergey Senozhatsky.
     Add support for zram run-time compression algorithm tuning.

   - "mm: Care about shadow stack guard gap when getting an unmapped
     area" from Mark Brown. Fix up the various arch_get_unmapped_area()
     implementations to better respect guard areas.

   - "Improve mem_cgroup_iter()" from Kinsey Ho. Improve the reliability
     of mem_cgroup_iter() and various code cleanups.

   - "mm: Support huge pfnmaps" from Peter Xu. Extends the usage of huge
     pfnmap support.

   - "resource: Fix region_intersects() vs add_memory_driver_managed()"
     from Huang Ying. Fix a bug in region_intersects() for systems with
     CXL memory.

   - "mm: hwpoison: two more poison recovery" from Kefeng Wang. Teaches
     a couple more code paths to correctly recover from the encountering
     of poisoned memry.

   - "mm: enable large folios swap-in support" from Barry Song. Support
     the swapin of mTHP memory into appropriately-sized folios, rather
     than into single-page folios"

* tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (416 commits)
  zram: free secondary algorithms names
  uprobes: turn xol_area->pages[2] into xol_area->page
  uprobes: introduce the global struct vm_special_mapping xol_mapping
  Revert "uprobes: use vm_special_mapping close() functionality"
  mm: support large folios swap-in for sync io devices
  mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to support large folios
  mm: fix swap_read_folio_zeromap() for large folios with partial zeromap
  mm/debug_vm_pgtable: Use pxdp_get() for accessing page table entries
  set_memory: add __must_check to generic stubs
  mm/vma: return the exact errno in vms_gather_munmap_vmas()
  memcg: cleanup with !CONFIG_MEMCG_V1
  mm/show_mem.c: report alloc tags in human readable units
  mm: support poison recovery from copy_present_page()
  mm: support poison recovery from do_cow_fault()
  resource, kunit: add test case for region_intersects()
  resource: make alloc_free_mem_region() works for iomem_resource
  mm: z3fold: deprecate CONFIG_Z3FOLD
  vfio/pci: implement huge_fault support
  mm/arm64: support large pfn mappings
  mm/x86: support large pfn mappings
  ...
2024-09-21 07:29:05 -07:00
Mayuresh Chitale
f0c9363db2
perf/riscv-sbi: Add platform specific firmware event handling
The SBI v2.0 specification pointed to by the link below reserves the
event code 0xffff for platform specific firmware events. Update the driver
to be able to parse and program such events. The platform specific
firmware events must now be specified in the perf command as below:
perf stat -e rCxxx ...
where bits[63:62] = 0x3 of the event config indicate a platform specific
firmware event and xxx indicate the actual event code which is passed
as the event data.

Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Link: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/v2.0/riscv-sbi.pdf
Link: https://lore.kernel.org/r/20240812051109.6496-1-mchitale@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-20 05:58:11 -07:00
Palmer Dabbelt
ad380f6a0a
RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t
I recently ended up with a warning on some compilers along the lines of

      CC      kernel/resource.o
    In file included from include/linux/ioport.h:16,
                     from kernel/resource.c:15:
    kernel/resource.c: In function 'gfr_start':
    include/linux/minmax.h:49:37: error: conversion from 'long long unsigned int' to 'resource_size_t' {aka 'unsigned int'} changes value from '17179869183' to '4294967295' [-Werror=overflow]
       49 |         ({ type ux = (x); type uy = (y); __cmp(op, ux, uy); })
          |                                     ^
    include/linux/minmax.h:52:9: note: in expansion of macro '__cmp_once_unique'
       52 |         __cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_))
          |         ^~~~~~~~~~~~~~~~~
    include/linux/minmax.h:161:27: note: in expansion of macro '__cmp_once'
      161 | #define min_t(type, x, y) __cmp_once(min, type, x, y)
          |                           ^~~~~~~~~~
    kernel/resource.c:1829:23: note: in expansion of macro 'min_t'
     1829 |                 end = min_t(resource_size_t, base->end,
          |                       ^~~~~
    kernel/resource.c: In function 'gfr_continue':
    include/linux/minmax.h:49:37: error: conversion from 'long long unsigned int' to 'resource_size_t' {aka 'unsigned int'} changes value from '17179869183' to '4294967295' [-Werror=overflow]
       49 |         ({ type ux = (x); type uy = (y); __cmp(op, ux, uy); })
          |                                     ^
    include/linux/minmax.h:52:9: note: in expansion of macro '__cmp_once_unique'
       52 |         __cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_))
          |         ^~~~~~~~~~~~~~~~~
    include/linux/minmax.h:161:27: note: in expansion of macro '__cmp_once'
      161 | #define min_t(type, x, y) __cmp_once(min, type, x, y)
          |                           ^~~~~~~~~~
    kernel/resource.c:1847:24: note: in expansion of macro 'min_t'
     1847 |                addr <= min_t(resource_size_t, base->end,
          |                        ^~~~~
    cc1: all warnings being treated as errors

which looks like a real problem: our phys_addr_t is only 32 bits now, so
having 34-bit masks is just going to result in overflows.

Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240731162159.9235-2-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-20 01:32:39 -07:00
Haibo Xu
732b177663
ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
Currently, only acpi_early_node_map[0] was initialized to NUMA_NO_NODE.
To ensure all the values were properly initialized, switch to initialize
all of them to NUMA_NO_NODE.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> (arm64 platform)
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240729035958.1957185-1-haibo1.xu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-20 01:31:26 -07:00
Linus Torvalds
baeb9a7d8b Enable PREEMPT_RT on supported architectures:
After twenty years of development we finally reached the point to enable
   PREEMPT_RT support in the mainline kernel.
 
   All prerequisites are merged, so enable it on the supported architectures
   ARM64, RISCV and X86(32/64-bit).
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbpR28THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoZXyD/972yR+oUuGNu8mb02H4kEcOCrsMOuM
 gMAneOjJY7+m4P2ois1gPoK6aPCT6AUlQj6DxOwjOhlmpWV6TaznymiVkKVZMhiT
 hRm5mWGxjXFtak/cnNixzAy/FsVGzBkQ/urPRfjb8MOGN9+9AOwmenXWsKUziaq9
 b/XrovT1nkr5DA7fTbKa8Mw4+9PpZC5HacAfZwtbtPhKX7CbbCQugjMcrzN3h17/
 g2EOPBLORfaEdWtnce6ZW+LZJ7y9dLdodoE6S2vZg/PfobsHqKhw7Kkw/Wr/iB1/
 dHWps2b55X+3Oo410vm7Q4sEHY2Z4n0a51mHR7N2pqsEZLke+70SQuQ9MU7JRAKv
 ospsscPKCnbG4T4XYk8k3g56bbuu1xHnfGYFA6vhE48IrHMB/601lkH5Z5Xl2a3W
 x7wrXRuAwkPuLxiTRSp3MH3asq8cwBZKXMVelC7ctr6QqQbF3DSJFbyWezIvP+kz
 IyI7L3CcRYtExW8wY5ocXvMmwCDzz7XaQL9cqegLtkyxPd3CifzcDc9T8vWd6Zec
 PLMHBOFEaBWy+AsiOevvpSmy1kE8Ncm29xqafP06MyECAPQRzaexwVVBA5zalXIG
 zHyd0KdrVE1vix82JKGPn7ngDxdZPR6AEXc1NE1135fBCzSYM15T3JYrGXTzhFR1
 c+Qo+hqOoKztbQ==
 =+vJ6
 -----END PGP SIGNATURE-----

Merge tag 'sched-rt-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RT enablement from Thomas Gleixner:
 "Enable PREEMPT_RT on supported architectures:

  After twenty years of development we finally reached the point to
  enable PREEMPT_RT support in the mainline kernel.

  All prerequisites are merged, so enable it on the supported
  architectures ARM64, RISCV and X86(32/64-bit)"

* tag 'sched-rt-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  riscv: Allow to enable PREEMPT_RT.
  arm64: Allow to enable PREEMPT_RT.
  x86: Allow to enable PREEMPT_RT.
2024-09-20 06:04:27 +02:00
Palmer Dabbelt
5835437609
Merge patch series "riscv: Improve KASAN coverage to fix unit tests"
Samuel Holland <samuel.holland@sifive.com> says:

This series fixes two areas where uninstrumented assembly routines
caused gaps in KASAN coverage on RISC-V, which were caught by KUnit
tests. The KASAN KUnit test suite passes after applying this series.

This series fixes the following test failures:
  # kasan_strings: EXPECTATION FAILED at mm/kasan/kasan_test.c:1520
  KASAN failure expected in "kasan_int_result = strcmp(ptr, "2")", but none occurred
  # kasan_strings: EXPECTATION FAILED at mm/kasan/kasan_test.c:1524
  KASAN failure expected in "kasan_int_result = strlen(ptr)", but none occurred
  not ok 60 kasan_strings
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1531
  KASAN failure expected in "set_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1533
  KASAN failure expected in "clear_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1535
  KASAN failure expected in "clear_bit_unlock(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1536
  KASAN failure expected in "__clear_bit_unlock(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1537
  KASAN failure expected in "change_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1543
  KASAN failure expected in "test_and_set_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1545
  KASAN failure expected in "test_and_set_bit_lock(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1546
  KASAN failure expected in "test_and_clear_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1548
  KASAN failure expected in "test_and_change_bit(nr, addr)", but none occurred
  not ok 61 kasan_bitops_generic

Samuel Holland (2):
  riscv: Omit optimized string routines when using KASAN
  riscv: Enable bitops instrumentation

arch/riscv/include/asm/bitops.h | 43 ++++++++++++++++++---------------
 arch/riscv/include/asm/string.h |  2 ++
 arch/riscv/kernel/riscv_ksyms.c |  3 ---
 arch/riscv/lib/Makefile         |  2 ++
 arch/riscv/lib/strcmp.S         |  1 +
 arch/riscv/lib/strlen.S         |  1 +
 arch/riscv/lib/strncmp.S        |  1 +
 arch/riscv/purgatory/Makefile   |  2 ++
 8 files changed, 32 insertions(+), 23 deletions(-)

* b4-shazam-merge:
  riscv: Enable bitops instrumentation
  riscv: Omit optimized string routines when using KASAN

Link: https://lore.kernel.org/r/20240801033725.28816-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-19 01:10:44 -07:00
Samuel Holland
77514915b7
riscv: Enable bitops instrumentation
Instead of implementing the bitops functions directly in assembly,
provide the arch_-prefixed versions and use the wrappers from
asm-generic to add instrumentation. This improves KASAN coverage and
fixes the kasan_bitops_generic() unit test.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240801033725.28816-3-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-19 01:10:04 -07:00
Samuel Holland
58ff537109
riscv: Omit optimized string routines when using KASAN
The optimized string routines are implemented in assembly, so they are
not instrumented for use with KASAN. Fall back to the C version of the
routines in order to improve KASAN coverage. This fixes the
kasan_strings() unit test.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240801033725.28816-2-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-19 01:10:00 -07:00
Hanjun Guo
21d98d658f
ACPI: RISCV: Make acpi_numa_get_nid() to be static
acpi_numa_get_nid() is only called in acpi_numa.c for riscv,
no need to add it in head file, so make it static and remove
related functions in the asm/acpi.h.

Spotted by doing some cleanup for arm64 ACPI.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Haibo Xu <haibo1.xu@intel.com>
Link: https://lore.kernel.org/r/20240811031804.3347298-1-guohanjun@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 12:02:48 -07:00
Paolo Bonzini
c09dd2bb57 Merge branch 'kvm-redo-enable-virt' into HEAD
Register KVM's cpuhp and syscore callbacks when enabling virtualization in
hardware, as the sole purpose of said callbacks is to disable and re-enable
virtualization as needed.

The primary motivation for this series is to simplify dealing with enabling
virtualization for Intel's TDX, which needs to enable virtualization
when kvm-intel.ko is loaded, i.e. long before the first VM is created.

That said, this is a nice cleanup on its own.  By registering the callbacks
on-demand, the callbacks themselves don't need to check kvm_usage_count,
because their very existence implies a non-zero count.

Patch 1 (re)adds a dedicated lock for kvm_usage_count.  This avoids a
lock ordering issue between cpus_read_lock() and kvm_lock.  The lock
ordering issue still exist in very rare cases, and will be fixed for
good by switching vm_list to an (S)RCU-protected list.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-09-17 11:38:20 -04:00
Yunhui Cui
048e2906d4
riscv: Randomize lower bits of stack address
Implement arch_align_stack() to randomize the lower bits
of the stack address.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240625030502.68988-1-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 08:05:10 -07:00
Charlie Jenkins
594ffcf4ef
riscv: Make riscv_isa_vendor_ext_andes array static
Since this array is only used in this file, it should be static.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202407241530.ej5SVgX1-lkp@intel.com/
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240807-make_andes_static-v1-1-b64bf4c3d941@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 08:05:08 -07:00
Jinjie Ruan
3cc754c237
riscv: Use LIST_HEAD() to simplify code
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD().

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240904013344.2026738-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 06:26:07 -07:00
Geert Uytterhoeven
e36ddf3226
riscv: defconfig: Disable RZ/Five peripheral support
There is not much point in keeping support for RZ/Five peripherals
enabled, as the RZ/Five platform option (ARCH_R9A07G043) is gated behind
NONPORTABLE.  Hence drop all config options that enable built-in or
modular support for peripherals found on RZ/Five SoCs.

Disable USB_XHCI_RCAR explicitly, as its value defaults to the value of
ARCH_RENESAS, which is still enabled.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/89ad70c7d6e8078208fecfd41dc03f6028531729.1722353710.git.geert+renesas@glider.be
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 06:13:21 -07:00
Jinjie Ruan
983f121499
RISC-V: Implement kgdb_roundup_cpus() to enable future NMI Roundup
Until now, the generic weak kgdb_roundup_cpus() has been used for kgdb on
RISCV. A custom one allows to debug CPUs that are stuck with interrupts
disabled with NMI support in the future. And using an IPI is better than
the generic one since it avoids the potential situation described in the
generic kgdb_call_nmi_hook(). As Andrew pointed out, once there is NMI
support, we can easily extend this and the CPU backtrace support
to use NMIs.

After this patch, the kgdb test show that:
	# echo g > /proc/sysrq-trigger
	[2]kdb> btc
	btc: cpu status: Currently on cpu 2
	Available cpus: 0-1(-), 2, 3(-)
	Stack traceback for pid 0
	0xffffffff81c13a40        0        0  1    0   -  0xffffffff81c14510  swapper/0
	CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.10.0-g3120273055b6-dirty #51
	Hardware name: riscv-virtio,qemu (DT)
	Call Trace:
	[<ffffffff80006c48>] dump_backtrace+0x28/0x30
	[<ffffffff80fceb38>] show_stack+0x38/0x44
	[<ffffffff80fe6a04>] dump_stack_lvl+0x58/0x7a
	[<ffffffff80fe6a3e>] dump_stack+0x18/0x20
	[<ffffffff801143fa>] kgdb_cpu_enter+0x682/0x6b2
	[<ffffffff801144ca>] kgdb_nmicallback+0xa0/0xac
	[<ffffffff8000a392>] handle_IPI+0x9c/0x120
	[<ffffffff800a2baa>] handle_percpu_devid_irq+0xa4/0x1e4
	[<ffffffff8009cca8>] generic_handle_domain_irq+0x28/0x36
	[<ffffffff800a9e5c>] ipi_mux_process+0xe8/0x110
	[<ffffffff806e1e30>] imsic_handle_irq+0xf8/0x13a
	[<ffffffff8009cca8>] generic_handle_domain_irq+0x28/0x36
	[<ffffffff806dff12>] riscv_intc_aia_irq+0x2e/0x40
	[<ffffffff80fe6ab0>] handle_riscv_irq+0x54/0x86
	[<ffffffff80ff2e4a>] call_on_irq_stack+0x32/0x40

Rebased on Ryo Takakura's "RISC-V: Enable IPI CPU Backtrace" patch.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240727063438.886155-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 05:52:44 -07:00
Sebastian Andrzej Siewior
2638e4e6b1 riscv: Allow to enable PREEMPT_RT.
It is really time.

riscv has all the required architecture related changes, that have been
identified over time, in order to enable PREEMPT_RT. With the recent
printk changes, the last known road block has been addressed.

Allow to enable PREEMPT_RT on riscv.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nam Cao <namcao@linutronix.de> # Visionfive 2
Link: https://lore.kernel.org/all/20240906111841.562402-4-bigeasy@linutronix.de
2024-09-17 11:06:08 +02:00
Linus Torvalds
38ea77ab07 soc: defconfig updates for 6.12
The updates to the defconfig files are fairly small, enabling
 drivers for eight of the arm and riscv based platforms.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmboHkgACgkQYKtH/8kJ
 Uiedsw//ee8uHQj003iuqawzMn0oBKcjBO0MpDSanv3qL0Lm5yLZ4s1Nw2jNlsEd
 uiQytgkduiYwQ9hM4u19cAPi3xab1ruoj9wAp3kcx1dmEbjNP5QTwPkNiNwZIcMu
 f6Mz7SOTWwp1YWxAXqVU6usR6H3N7VR3P5XdNP/TBSS7MNW6bRPqiO6kAc4AT1H2
 RNik3m1kCQIPpX1pdYK2bnCFRIl+TIr2jrHV9B9zYKc1f/il4MuIjVP2U1WDQ10U
 r5xKCb8MrIQtju+aCfmYwYqUYQ95ptjSas1MwV0xURzxDL/x9B8wL+NDeNsomwJq
 vlk6vr3HtIsJduXylIxqQ5KY6q/uxJ9whmtU943mPdEZgp3nZeXbK+EHXT5KZbB3
 GAGaf//Pt/0+3jQA/qThD9+GF2OMxKUiyi4ZIezI4pj9VoB6uhittmHQlmPc0KvP
 8UpkaHiyztgO9R08XGvmQ03A6B0UdJMPFAy1onYZkyL+MhRF2a04D16nkLBGa2jU
 DvNo9ojqUfSEKiLpQtCwVFc0lQkAdnoVsywcrAp4+KXqzNC51I1Rj59SvaebvCcg
 eTat1cSVQRvHnSbG/TOiWBtVZ9GqisLf9vAIwYV05b7jT5fWff69olXDXTcQma1P
 8aYIGtUIeeDmr+qE6U31aToj/BXpOuI6MvY0tkBfdyqRYuq6lN8=
 =A/M4
 -----END PGP SIGNATURE-----

Merge tag 'soc-defconfig-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC defconfig updates from Arnd Bergmann:
 "The updates to the defconfig files are fairly small, enabling drivers
  for eight of the arm and riscv based platforms"

* tag 'soc-defconfig-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: defconfig: enable mt8365 sound
  riscv: defconfig: Enable pinctrl support for CV18XX Series SoC
  arm64: defconfig: Enable ADP5585 GPIO and PWM drivers
  arm64: defconfig: Enable Tegra194 PCIe Endpoint
  arm64: defconfig: Enable E5010 JPEG Encoder
  riscv: defconfig: sophgo: enable clks for sg2042
  arm64: defconfig: build CONFIG_REGULATOR_QCOM_REFGEN as module
  ARM: configs: at91: enable config flags for sam9x7 SoC family
  arm64: defconfig: Enable R-Car Ethernet-TSN support
  ARM: shmobile: defconfig: Enable slab hardening and kmalloc buckets
  arm64: defconfig: Enable AK4619 codec support
2024-09-17 10:53:21 +02:00
Linus Torvalds
7b17f5ebd5 soc: devicetree updates for 6.12
New SoC support for Broadcom bcm2712 (Raspberry Pi 5) and Renesas
 R9A09G057 (RZ/V2H(P)) and Qualcomm Snapdragon 414 (MSM8929), all three
 of these are variants of already supported chips, in particular the last
 one is almost identical to MSM8939.
 
 Lots of updates to Mediatek, ASpeed, Rockchips, Amlogic, Qualcomm,
 STM32, NXP i.MX, Sophgo, TI K3, Renesas, Microchip at91, NVIDIA Tegra,
 and T-HEAD.
 
 The added Qualcomm platform support once again dominates the changes,
 with seven phones and three laptops getting added in addition to
 many new features on existing machines. The Snapdragon X1E support
 specifically keeps improving.
 
 The other new machines are:
 
  - eight new machines using various 64-bit Rockchips SoCs, both
    on the consumer/gaming side and developer boards
  - three industrial boards with 64-bit i.MX, which is a very
    low number for them.
  - four more servers using a 32-bit Speed BMC
  - three boards using STM32MP1 SoCs
  - one new machine each using allwinner, amlogic, broadcom
    and renesas chips.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmboLzkACgkQYKtH/8kJ
 Uid+1g/+J8rQQxIjLxxbx+TkhECt5X1u5mQZTZBIeCZmJQz2rNvmo3bm89ZAR32Z
 FnjSN0fXw7eZqnxImwNAIU7g7RBhj5zs1gKXsB2lb0vv7722KyQ1xz2Fh1NQWQ09
 OMCVjI1+19zBZYCB0C1Y2WTsFRUl5ISE3H3Wx8MJT1GWDDao/D2ULkEda0uTSu3i
 CBYBNwCtBJU7TsGe5a04P7rGKvOlDdVj+2VvMKaX6bFa+MDxoMtlABWLZRJCwOy8
 04+Oz9AO0r6HpsrAKOgxxNod7Jkw13UUG22PoTS4+B2Bc7/9oXTcJM8e+44BEe4J
 nyJButDCAf7IsqOuB0S/4J0YxtcDGnzJXNQrUg11owwVXC+uzVvkUExOneRBXqUc
 179OlY5tCXaaRtmoeUTOH9C4rk5x6o5jHCLs2DJNf9TsOwD2VjzUvUWp5WBhDDG4
 qxIUvflGm2pXhF9OeK+7fPllTc1pUmA2/LZ9LXc/13Zn3eZKGn/Kql1SNFC0CIi0
 8kQnIcV0dOh7E+zPcYENR+NGuTUU2GH3iQM9frHIaPc+KcaXPRVJDqREe/RNYRqN
 qDY7yIGkeqmH9mKhdV+WQGBjJ6z3ElOMYVST6Kq3JBDiF12UaCPEhG2t8inmvEsA
 t7nL84iWpeC1Gh+AT8UJBlRSFzQoafIrVav26pqwCvOrK7UHMZk=
 =r07W
 -----END PGP SIGNATURE-----

Merge tag 'soc-dt-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC devicetree updates from Arnd Bergmann:
 "New SoC support for Broadcom bcm2712 (Raspberry Pi 5) and Renesas
  R9A09G057 (RZ/V2H(P)) and Qualcomm Snapdragon 414 (MSM8929), all three
  of these are variants of already supported chips, in particular the
  last one is almost identical to MSM8939.

  Lots of updates to Mediatek, ASpeed, Rockchips, Amlogic, Qualcomm,
  STM32, NXP i.MX, Sophgo, TI K3, Renesas, Microchip at91, NVIDIA Tegra,
  and T-HEAD.

  The added Qualcomm platform support once again dominates the changes,
  with seven phones and three laptops getting added in addition to many
  new features on existing machines. The Snapdragon X1E support
  specifically keeps improving.

  The other new machines are:

   - eight new machines using various 64-bit Rockchips SoCs, both on the
     consumer/gaming side and developer boards

   - three industrial boards with 64-bit i.MX, which is a very low
     number for them.

   - four more servers using a 32-bit Speed BMC

   - three boards using STM32MP1 SoCs

   - one new machine each using allwinner, amlogic, broadcom and renesas
     chips"

* tag 'soc-dt-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (672 commits)
  arm64: dts: allwinner: h5: NanoPi NEO Plus2: Use regulators for pio
  arm64: dts: mediatek: add audio support for mt8365-evk
  arm64: dts: mediatek: add afe support for mt8365 SoC
  arm64: dts: mediatek: mt8186-corsola: Disable DPI display interface
  arm64: dts: mediatek: mt8186: Add svs node
  arm64: dts: mediatek: mt8186: Add power domain for DPI
  arm64: dts: mediatek: mt8195: Correct clock order for dp_intf*
  arm64: dts: mt8183: add dpi node to mt8183
  arm64: dts: allwinner: h5: NanoPi Neo Plus2: Fix regulators
  arm64: dts: rockchip: add CAN0 and CAN1 interfaces to mecsbc board
  arm64: dts: rockchip: add CAN-FD controller nodes to rk3568
  arm64: dts: nuvoton: ma35d1: Add uart pinctrl settings
  arm64: dts: nuvoton: ma35d1: Add pinctrl and gpio nodes
  arm64: dts: nuvoton: Add syscon to the system-management node
  ARM: dts: Fix undocumented LM75 compatible nodes
  arm64: dts: toshiba: Fix pl011 and pl022 clocks
  ARM: dts: stm32: Use SAI to generate bit and frame clock on STM32MP15xx DHCOM PDK2
  ARM: dts: stm32: Switch bitclock/frame-master to flag on STM32MP15xx DHCOM PDK2
  ARM: dts: stm32: Sort properties in audio endpoints on STM32MP15xx DHCOM PDK2
  ARM: dts: stm32: Add MECIO1 and MECT1S board variants
  ...
2024-09-17 10:41:21 +02:00
Linus Torvalds
11b3125073 ACPI updates for 6.12-rc1
- Check return value in acpi_db_convert_to_package() (Pei Xiao).
 
  - Detect FACS and allow setting the waking vector on reduced-hardware
    ACPI platforms (Jiaqing Zhao).
 
  - Allow ACPICA to represent semaphores as integers (Adrien Destugues).
 
  - Complete CXL 3.0 CXIMS structures support in ACPICA (Zhang Rui).
 
  - Make ACPICA support SPCR version 4 and add RISC-V SBI Subtype to
    DBG2 (Sia Jee Heng).
 
  - Implement the Dword_PCC Resource Descriptor Macro in ACPICA (Jose
    Marinho).
 
  - Correct the typo in struct acpi_mpam_msc_node member (Punit Agrawal).
 
  - Implement ACPI_WARNING_ONCE() and ACPI_ERROR_ONCE() and use them to
    prevent a Stall() violation warning from being printed every time
    this takes place (Vasily Khoruzhick).
 
  - Allow PCC Data Type in MCTP resource (Adam Young).
 
  - Fix memory leaks on acpi_ps_get_next_namepath()
    and acpi_ps_get_next_field() failures  (Armin Wolf).
 
  - Add support for supressing leading zeros in hex strings when
    converting them to integers and update integer-to-hex-string
    conversions in ACPICA (Armin Wolf).
 
  - Add support for Windows 11 22H2 _OSI string (Armin Wolf).
 
  - Avoid warning for Dump Functions in ACPICA (Adam Lackorzynski).
 
  - Add extended linear address mode to HMAT MSCIS in ACPICA (Dave
    Jiang).
 
  - Handle empty connection_node in iasl (Aleksandrs Vinarskis).
 
  - Allow for more flexibility in _DSM args (Saket Dumbre).
 
  - Setup for ACPICA release 20240827 (Saket Dumbre).
 
  - Add ACPI device enumeration support for interrupt controller probing
    including taking dependencies into account (Sunil V L).
 
  - Implement ACPI-based interrupt controller probing on RISC-V (Sunil V L).
 
  - Add ACPI support for AIA in riscv-intc and add ACPI support to
    riscv-imsic, riscv-aplic, and sifive-plic (Sunil V L).
 
  - Do not release locks during operation region accesses in the ACPI EC
    driver (Rafael Wysocki).
 
  - Fix up the _STR handling in the ACPI device object sysfs interface,
    make it represent the device object attributes as an attribute group
    and make it rely on driver core functionality for sysfs attrubute
    management (Thomas Weißschuh).
 
  - Extend error messages printed to the kernel log when acpi_evaluate_dsm()
    fails to include revision and function number (David Wang).
 
  - Add a new AMDI0015 platform device ID to the ACPi APD driver for AMD
    SoCs (Shyam Sundar S K).
 
  - Use the driver core for the async probing management in the ACPI
    battery driver (Thomas Weißschuh).
 
  - Remove redundant initalizations of a local variable to NULL from the
    ACPI battery driver (Ilpo Järvinen).
 
  - Remove unneeded check in tps68470_pmic_opregion_probe() (Aleksandr
    Mishin).
 
  - Add support for setting the EPP register through the ACPI CPPC sysfs
    interface if it is in FFH (Mario Limonciello).
 
  - Fix MASK_VAL() usage in the ACPI CPPC library (Clément Léger).
 
  - Reduce the log level of a per-CPU message about idle states in the
    ACPI processor driver (Li RongQing).
 
  - Fix crash in exit_round_robin() in the ACPI processor aggregator
    device (PAD) driver (Seiji Nishikawa).
 
  - Add force_vendor quirk for Panasonic Toughbook CF-18 in the ACPI
    backlight driver (Hans de Goede).
 
  - Make the DMI checks related to backlight handling on Lenovo Yoga
    Tab 3 X90F less strict (Hans de Goede).
 
  - Enforce native backlight handling on Apple MacbookPro9,2 (Esther
    Shimanovich).
 
  - Add IRQ override quirks for Asus Vivobook Go E1404GAB and MECHREV
    GM7XG0M, and refine the TongFang GMxXGxx quirk (Li Chen, Tamim Khan,
    Werner Sembach).
 
  - Quirk ASUS ROG M16 to default to S3 sleep (Luke D. Jones).
 
  - Define and use symbols for device and class name lengths in the ACPI
    bus type code and make the code use strscpy() instead of strcpy() in
    several places (Muhammad Qasim Abdul Majeed).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmbjJ9kSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxhfMP/3i4Nrkmf2HpiSJ/zFMSISNbAEmLSqQQ
 gSo0Mmj1OHN9W9rBiIVDgJjeakyLg2IHB1sFZ9ABtU1JvO9mMchU7OlDKIt8Q8sf
 VJa+q0tcA4kny5BZa47fPjZaaM6f9boVTm5WRn9T7KSLA+EGBAxE+UXQ2ibxiPCc
 ZWX8obeYe78Zv2i5U8LiO4mQlB2viGEgO/5vKywmNKYVpurOMAv4zGjvDfRxK3ZQ
 GXIZLUCh0inu8VomrbI5B1bpqNTxUrLoEAExKpyAyIiRYay+nyv8Vm2sSw9roe3a
 C9pux4pojT0zfkmCVJmXET0982GcMSDaB0Rb1ypwbC2EdTtEoauC/HTyTixNBxBa
 MnHntDe/l6Z9gLhbj8dcfB0ZVUkahqFzndWA9EBwroor2S7woZNtA3jL9VNHbM1J
 kKNPQ2YCQi1ObQcftZDC9UYYx62KVvWNZCTS1+ZjnpKNH8hcEEBwMlnmE1VTYeHf
 TN0vbB6QJSDu26qOyiWMCgLAR45TW/YzA3CrJi7/zGMSUyEQvHQAe5wh5H3ygbAR
 GbDau0AVSvCO7lTRpqkzS6aeTLIbp1oqGwnnSXQDy30biI2FeQyg76Nq8Rgj5Lun
 8+GvmkuVSjbjTXYbLqjt/gW97O/HUdygfL7hhjS10TB+3C34mQm/pwFxNYxJdFyO
 mhMeKq4DdOJ+
 =XHaD
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "These update the ACPICA code in the kernel to upstream version
  20240827, add support for ACPI-based enumeration of interrupt
  controllers on RISC-V along with some related irqchip updates, clean
  up the ACPI device object sysfs interface, add some quirks for
  backlight handling and IRQ overrides, fix assorted issues and clean up
  code.

  Specifics:

   - Check return value in acpi_db_convert_to_package() (Pei Xiao)

   - Detect FACS and allow setting the waking vector on reduced-hardware
     ACPI platforms (Jiaqing Zhao)

   - Allow ACPICA to represent semaphores as integers (Adrien Destugues)

   - Complete CXL 3.0 CXIMS structures support in ACPICA (Zhang Rui)

   - Make ACPICA support SPCR version 4 and add RISC-V SBI Subtype to
     DBG2 (Sia Jee Heng)

   - Implement the Dword_PCC Resource Descriptor Macro in ACPICA (Jose
     Marinho)

   - Correct the typo in struct acpi_mpam_msc_node member (Punit
     Agrawal)

   - Implement ACPI_WARNING_ONCE() and ACPI_ERROR_ONCE() and use them to
     prevent a Stall() violation warning from being printed every time
     this takes place (Vasily Khoruzhick)

   - Allow PCC Data Type in MCTP resource (Adam Young)

   - Fix memory leaks on acpi_ps_get_next_namepath() and
     acpi_ps_get_next_field() failures (Armin Wolf)

   - Add support for supressing leading zeros in hex strings when
     converting them to integers and update integer-to-hex-string
     conversions in ACPICA (Armin Wolf)

   - Add support for Windows 11 22H2 _OSI string (Armin Wolf)

   - Avoid warning for Dump Functions in ACPICA (Adam Lackorzynski)

   - Add extended linear address mode to HMAT MSCIS in ACPICA (Dave
     Jiang)

   - Handle empty connection_node in iasl (Aleksandrs Vinarskis)

   - Allow for more flexibility in _DSM args (Saket Dumbre)

   - Setup for ACPICA release 20240827 (Saket Dumbre)

   - Add ACPI device enumeration support for interrupt controller
     probing including taking dependencies into account (Sunil V L)

   - Implement ACPI-based interrupt controller probing on RISC-V
     (Sunil V L)

   - Add ACPI support for AIA in riscv-intc and add ACPI support to
     riscv-imsic, riscv-aplic, and sifive-plic (Sunil V L)

   - Do not release locks during operation region accesses in the ACPI
     EC driver (Rafael Wysocki)

   - Fix up the _STR handling in the ACPI device object sysfs interface,
     make it represent the device object attributes as an attribute
     group and make it rely on driver core functionality for sysfs
     attrubute management (Thomas Weißschuh)

   - Extend error messages printed to the kernel log when
     acpi_evaluate_dsm() fails to include revision and function number
     (David Wang)

   - Add a new AMDI0015 platform device ID to the ACPi APD driver for
     AMD SoCs (Shyam Sundar S K)

   - Use the driver core for the async probing management in the ACPI
     battery driver (Thomas Weißschuh)

   - Remove redundant initalizations of a local variable to NULL from
     the ACPI battery driver (Ilpo Järvinen)

   - Remove unneeded check in tps68470_pmic_opregion_probe() (Aleksandr
     Mishin)

   - Add support for setting the EPP register through the ACPI CPPC
     sysfs interface if it is in FFH (Mario Limonciello)

   - Fix MASK_VAL() usage in the ACPI CPPC library (Clément Léger)

   - Reduce the log level of a per-CPU message about idle states in the
     ACPI processor driver (Li RongQing)

   - Fix crash in exit_round_robin() in the ACPI processor aggregator
     device (PAD) driver (Seiji Nishikawa)

   - Add force_vendor quirk for Panasonic Toughbook CF-18 in the ACPI
     backlight driver (Hans de Goede)

   - Make the DMI checks related to backlight handling on Lenovo Yoga
     Tab 3 X90F less strict (Hans de Goede)

   - Enforce native backlight handling on Apple MacbookPro9,2 (Esther
     Shimanovich)

   - Add IRQ override quirks for Asus Vivobook Go E1404GAB and MECHREV
     GM7XG0M, and refine the TongFang GMxXGxx quirk (Li Chen, Tamim
     Khan, Werner Sembach)

   - Quirk ASUS ROG M16 to default to S3 sleep (Luke D. Jones)

   - Define and use symbols for device and class name lengths in the
     ACPI bus type code and make the code use strscpy() instead of
     strcpy() in several places (Muhammad Qasim Abdul Majeed)"

* tag 'acpi-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (70 commits)
  ACPI: resource: Add another DMI match for the TongFang GMxXGxx
  ACPI: CPPC: Add support for setting EPP register in FFH
  ACPI: PM: Quirk ASUS ROG M16 to default to S3 sleep
  ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18
  ACPI: battery: use driver core managed async probing
  ACPI: button: Use strscpy() instead of strcpy()
  ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB
  ACPI: CPPC: Fix MASK_VAL() usage
  irqchip/sifive-plic: Add ACPI support
  ACPICA: Setup for ACPICA release 20240827
  ACPICA: Allow for more flexibility in _DSM args
  ACPICA: iasl: handle empty connection_node
  ACPICA: HMAT: Add extended linear address mode to MSCIS
  ACPICA: Avoid warning for Dump Functions
  ACPICA: Add support for Windows 11 22H2 _OSI string
  ACPICA: Update integer-to-hex-string conversions
  ACPICA: Add support for supressing leading zeros in hex strings
  ACPICA: Allow for supressing leading zeros when using acpi_ex_convert_to_ascii()
  ACPICA: Fix memory leak if acpi_ps_get_next_field() fails
  ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails
  ...
2024-09-16 07:41:48 +02:00
Linus Torvalds
64dd3b6a79 ARM:
* New Stage-2 page table dumper, reusing the main ptdump infrastructure
 
 * FP8 support
 
 * Nested virtualization now supports the address translation (FEAT_ATS1A)
   family of instructions
 
 * Add selftest checks for a bunch of timer emulation corner cases
 
 * Fix multiple cases where KVM/arm64 doesn't correctly handle the guest
   trying to use a GICv3 that wasn't advertised
 
 * Remove REG_HIDDEN_USER from the sysreg infrastructure, making
   things little simpler
 
 * Prevent MTE tags being restored by userspace if we are actively
   logging writes, as that's a recipe for disaster
 
 * Correct the refcount on a page that is not considered for MTE tag
   copying (such as a device)
 
 * When walking a page table to split block mappings, synchronize only
   at the end the walk rather than on every store
 
 * Fix boundary check when transfering memory using FFA
 
 * Fix pKVM TLB invalidation, only affecting currently out of tree
   code but worth addressing for peace of mind
 
 LoongArch:
 
 * Revert qspinlock to test-and-set simple lock on VM.
 
 * Add Loongson Binary Translation extension support.
 
 * Add PMU support for guest.
 
 * Enable paravirt feature control from VMM.
 
 * Implement function kvm_para_has_feature().
 
 RISC-V:
 
 * Fix sbiret init before forwarding to userspace
 
 * Don't zero-out PMU snapshot area before freeing data
 
 * Allow legacy PMU access from guest
 
 * Fix to allow hpmcounter31 from the guest
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmbmghAUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPFQgf+Ijeqlx90BGy96pyzo/NkYKPeEc8G
 gKhlm8PdtdZYaRdJ53MVRLLpzbLuzqbwrn0ZX2tvoDRLzuAqTt2GTFoT6e2HtY5B
 Sf7KQMFwHWGtGklC1EmZ1fXsCocswpuAcexCLKLRBoWUcKABlgwV3N3vJo5gx/Ag
 8XXhYpcLTh+p7bjMdJShQy019pTwEDE68pPVnL2NPzla1G6Qox7ZJIdOEMZXuyJA
 MJ4jbFWE/T8vLFUf/8MGQ/+bo+4140kzB8N9wkazNcBRoodY6Hx+Lm1LiZjNudO1
 ilIdB4P3Ht+D8UuBv2DO5XTakfJz9T9YsoRcPlwrOWi/8xBRbt236gFB3Q==
 =sHTI
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-non-x86' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "These are the non-x86 changes (mostly ARM, as is usually the case).
  The generic and x86 changes will come later"

  ARM:

   - New Stage-2 page table dumper, reusing the main ptdump
     infrastructure

   - FP8 support

   - Nested virtualization now supports the address translation
     (FEAT_ATS1A) family of instructions

   - Add selftest checks for a bunch of timer emulation corner cases

   - Fix multiple cases where KVM/arm64 doesn't correctly handle the
     guest trying to use a GICv3 that wasn't advertised

   - Remove REG_HIDDEN_USER from the sysreg infrastructure, making
     things little simpler

   - Prevent MTE tags being restored by userspace if we are actively
     logging writes, as that's a recipe for disaster

   - Correct the refcount on a page that is not considered for MTE tag
     copying (such as a device)

   - When walking a page table to split block mappings, synchronize only
     at the end the walk rather than on every store

   - Fix boundary check when transfering memory using FFA

   - Fix pKVM TLB invalidation, only affecting currently out of tree
     code but worth addressing for peace of mind

  LoongArch:

   - Revert qspinlock to test-and-set simple lock on VM.

   - Add Loongson Binary Translation extension support.

   - Add PMU support for guest.

   - Enable paravirt feature control from VMM.

   - Implement function kvm_para_has_feature().

  RISC-V:

   - Fix sbiret init before forwarding to userspace

   - Don't zero-out PMU snapshot area before freeing data

   - Allow legacy PMU access from guest

   - Fix to allow hpmcounter31 from the guest"

* tag 'for-linus-non-x86' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (64 commits)
  LoongArch: KVM: Implement function kvm_para_has_feature()
  LoongArch: KVM: Enable paravirt feature control from VMM
  LoongArch: KVM: Add PMU support for guest
  KVM: arm64: Get rid of REG_HIDDEN_USER visibility qualifier
  KVM: arm64: Simplify visibility handling of AArch32 SPSR_*
  KVM: arm64: Simplify handling of CNTKCTL_EL12
  LoongArch: KVM: Add vm migration support for LBT registers
  LoongArch: KVM: Add Binary Translation extension support
  LoongArch: KVM: Add VM feature detection function
  LoongArch: Revert qspinlock to test-and-set simple lock on VM
  KVM: arm64: Register ptdump with debugfs on guest creation
  arm64: ptdump: Don't override the level when operating on the stage-2 tables
  arm64: ptdump: Use the ptdump description from a local context
  arm64: ptdump: Expose the attribute parsing functionality
  KVM: arm64: Add memory length checks and remove inline in do_ffa_mem_xfer
  KVM: arm64: Move pagetable definitions to common header
  KVM: arm64: nv: Add support for FEAT_ATS1A
  KVM: arm64: nv: Plumb handling of AT S1* traps from EL2
  KVM: arm64: nv: Make AT+PAN instructions aware of FEAT_PAN3
  KVM: arm64: nv: Sanitise SCTLR_EL1.EPAN according to VM configuration
  ...
2024-09-16 07:38:18 +02:00
Jisheng Zhang
8f1534e744
riscv: avoid Imbalance in RAS
Inspired by[1], modify the code to remove the code of modifying ra to
avoid imbalance RAS (return address stack) which may lead to incorret
predictions on return.

Link: https://lore.kernel.org/linux-riscv/20240607061335.2197383-1-cyrilbur@tenstorrent.com/ [1]
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com>
Link: https://lore.kernel.org/r/20240720170659.1522-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:58:25 -07:00
Palmer Dabbelt
7e340f4fad
Merge patch series "Svvptc extension to remove preventive sfence.vma"
Alexandre Ghiti <alexghiti@rivosinc.com> says:

In RISC-V, after a new mapping is established, a sfence.vma needs to be
emitted for different reasons:

- if the uarch caches invalid entries, we need to invalidate it otherwise
  we would trap on this invalid entry,
- if the uarch does not cache invalid entries, a reordered access could fail
  to see the new mapping and then trap (sfence.vma acts as a fence).

We can actually avoid emitting those (mostly) useless and costly sfence.vma
by handling the traps instead:

- for new kernel mappings: only vmalloc mappings need to be taken care of,
  other new mapping are rare and already emit the required sfence.vma if
  needed.
  That must be achieved very early in the exception path as explained in
  patch 3, and this also fixes our fragile way of dealing with vmalloc faults.

- for new user mappings: Svvptc makes update_mmu_cache() a no-op but we can
  take some gratuitous page faults (which are very unlikely though).

Patch 1 and 2 introduce Svvptc extension probing.

On our uarch that does not cache invalid entries and a 6.5 kernel, the
gains are measurable:

* Kernel boot:                  6%
* ltp - mmapstress01:           8%
* lmbench - lat_pagefault:      20%
* lmbench - lat_mmap:           5%

Here are the corresponding numbers of sfence.vma emitted:

* Ubuntu boot to login:
Before: ~630k sfence.vma
After:  ~200k sfence.vma

* ltp - mmapstress01
Before: ~45k
After:  ~6.3k

* lmbench - lat_pagefault
Before: ~665k
After:   832 (!)

* lmbench - lat_mmap
Before: ~546k
After:   718 (!)

Thanks to Ved and Matt Evans for triggering the discussion that led to
this patchset!

* b4-shazam-merge:
  riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc
  riscv: Stop emitting preventive sfence.vma for new vmalloc mappings
  dt-bindings: riscv: Add Svvptc ISA extension description
  riscv: Add ISA extension parsing for Svvptc

Link: https://lore.kernel.org/r/20240717060125.139416-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:58:24 -07:00
Steffen Persvold
1845d381f2
riscv: cacheinfo: Add back init_cache_level() function
commit 5944ce092b (arch_topology: Build cacheinfo from primary CPU)
removed the init_cache_level() function from arch/riscv/kernel/cacheinfo.c
and relies on the init_cpu_topology() function in drivers/base/arch_topology.c
to call fetch_cache_info() which in turn calls init_of_cache_level() to
populate the cache hierarchy information. However, init_cpu_topology() is only
called from smpboot.c:smp_prepare_cpus() and thus only available when
CONFIG_SMP is defined.

To support non-SMP enabled kernels to still detect cache hierarchy, we add back
the init_cache_level() function. The init_level_allocate_ci() function handles
this gracefully on SMP-enabled kernels anyway where fetch_cache_info() is
called from init_cpu_topology() earlier in the boot phase.

Signed-off-by: Steffen Persvold <spersvold@gmail.com>
Link: https://lore.kernel.org/r/20240707003515.5058-1-spersvold@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:50 -07:00
Jinjie Ruan
cea9d27705
riscv: Remove unused _TIF_WORK_MASK
Since commit f0bddf5058 ("riscv: entry: Convert to generic entry"),
_TIF_WORK_MASK is no longer used, so remove it.

Fixes: f0bddf5058 ("riscv: entry: Convert to generic entry")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Link: https://lore.kernel.org/r/20240711111508.1373322-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:50 -07:00
Palmer Dabbelt
9b2863e2cc
Merge patch series "riscv: select ARCH_USE_SYM_ANNOTATIONS"
Jisheng Zhang <jszhang@kernel.org> says:

commit 76329c6939 ("riscv: Use SYM_*() assembly macros instead
of deprecated ones"), most riscv has been to converted the new style
SYM_ assembler annotations. The remaining one is sifive's
errata_cip_453.S, so convert to new style SYM_ annotations as well.
After that select ARCH_USE_SYM_ANNOTATIONS.

* b4-shazam-merge:
  riscv: select ARCH_USE_SYM_ANNOTATIONS
  riscv: errata: sifive: Use SYM_*() assembly macros

Link: https://lore.kernel.org/r/20240709160536.3690-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:49 -07:00
Palmer Dabbelt
f25170a053
Merge patch series "riscv: stacktrace: Add USER_STACKTRACE support"
Jinjie Ruan <ruanjinjie@huawei.com> says:

Add RISC-V USER_STACKTRACE support, and fix the fp alignment bug
in perf_callchain_user() by the way as Björn pointed out.

* b4-shazam-merge:
  riscv: stacktrace: Add USER_STACKTRACE support
  riscv: Fix fp alignment bug in perf_callchain_user()

Link: https://lore.kernel.org/r/20240708032847.2998158-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:47 -07:00
Jisheng Zhang
5c178472af
riscv: define ILLEGAL_POINTER_VALUE for 64bit
This is used in poison.h for poison pointer offset. Based on current
SV39, SV48 and SV57 vm layout, 0xdead000000000000 is a proper value
that is not mappable, this can avoid potentially turning an oops to
an expolit.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: fbe934d69e ("RISC-V: Build Infrastructure")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240705170210.3236-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:40 -07:00
Alexandre Ghiti
7a21b2e370
riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc
The preventive sfence.vma were emitted because new mappings must be made
visible to the page table walker but Svvptc guarantees that it will
happen within a bounded timeframe, so no need to sfence.vma for the uarchs
that implement this extension, we will then take gratuitous (but very
unlikely) page faults, similarly to x86 and arm64.

This allows to drastically reduce the number of sfence.vma emitted:

* Ubuntu boot to login:
Before: ~630k sfence.vma
After:  ~200k sfence.vma

* ltp - mmapstress01
Before: ~45k
After:  ~6.3k

* lmbench - lat_pagefault
Before: ~665k
After:   832 (!)

* lmbench - lat_mmap
Before: ~546k
After:   718 (!)

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240717060125.139416-5-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:11:05 -07:00
Alexandre Ghiti
503638e0ba
riscv: Stop emitting preventive sfence.vma for new vmalloc mappings
In 6.5, we removed the vmalloc fault path because that can't work (see
[1] [2]). Then in order to make sure that new page table entries were
seen by the page table walker, we had to preventively emit a sfence.vma
on all harts [3] but this solution is very costly since it relies on IPI.

And even there, we could end up in a loop of vmalloc faults if a vmalloc
allocation is done in the IPI path (for example if it is traced, see
[4]), which could result in a kernel stack overflow.

Those preventive sfence.vma needed to be emitted because:

- if the uarch caches invalid entries, the new mapping may not be
  observed by the page table walker and an invalidation may be needed.
- if the uarch does not cache invalid entries, a reordered access
  could "miss" the new mapping and traps: in that case, we would actually
  only need to retry the access, no sfence.vma is required.

So this patch removes those preventive sfence.vma and actually handles
the possible (and unlikely) exceptions. And since the kernel stacks
mappings lie in the vmalloc area, this handling must be done very early
when the trap is taken, at the very beginning of handle_exception: this
also rules out the vmalloc allocations in the fault path.

Link: https://lore.kernel.org/linux-riscv/20230531093817.665799-1-bjorn@kernel.org/ [1]
Link: https://lore.kernel.org/linux-riscv/20230801090927.2018653-1-dylan@andestech.com [2]
Link: https://lore.kernel.org/linux-riscv/20230725132246.817726-1-alexghiti@rivosinc.com/ [3]
Link: https://lore.kernel.org/lkml/20200508144043.13893-1-joro@8bytes.org/ [4]
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240717060125.139416-4-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:11:04 -07:00
Alexandre Ghiti
a6efe33cc5
riscv: Add ISA extension parsing for Svvptc
Add support to parse the Svvptc string in the riscv,isa string.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240717060125.139416-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:11:02 -07:00
Jisheng Zhang
7c9d980e46
riscv: select ARCH_USE_SYM_ANNOTATIONS
Now, riscv has been converted to the new style SYM_ assembler
annotations. So select ARCH_USE_SYM_ANNOTATIONS to ensure the
deprecated macros such as ENTRY(), END(), WEAK() and so on are not
available and we don't regress.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-By: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20240709160536.3690-3-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:03:23 -07:00
Jisheng Zhang
6868d12e02
riscv: errata: sifive: Use SYM_*() assembly macros
ENTRY()/END() macros are deprecated and we should make use of the
new SYM_*() macros [1] for better annotation of symbols. Replace the
deprecated ones with the new ones.

[1] https://docs.kernel.org/core-api/asm-annotations.html

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-By: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20240709160536.3690-2-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:03:22 -07:00
Jinjie Ruan
1a74833182
riscv: stacktrace: Add USER_STACKTRACE support
Currently, userstacktrace is unsupported for riscv. So use the
perf_callchain_user() code as blueprint to implement the
arch_stack_walk_user() which add userstacktrace support on riscv.
Meanwhile, we can use arch_stack_walk_user() to simplify the implementation
of perf_callchain_user().

A ftrace test case is shown as below:

	# cd /sys/kernel/debug/tracing
	# echo 1 > options/userstacktrace
	# echo 1 > options/sym-userobj
	# echo 1 > events/sched/sched_process_fork/enable
	# cat trace
	......
	            bash-178     [000] ...1.    97.968395: sched_process_fork: comm=bash pid=178 child_comm=bash child_pid=231
	            bash-178     [000] ...1.    97.970075: <user stack trace>
	 => /lib/libc.so.6[+0xb5090]

Also a simple perf test is ok as below:

	# perf record -e cpu-clock --call-graph fp top
	# perf report --call-graph

	.....
	[[31m  66.54%[[m     0.00%  top      [kernel.kallsyms]            [k] ret_from_exception
            |
            ---ret_from_exception
               |
               |--[[31m58.97%[[m--do_trap_ecall_u
               |          |
               |          |--[[31m17.34%[[m--__riscv_sys_read
               |          |          ksys_read
               |          |          |
               |          |           --[[31m16.88%[[m--vfs_read
               |          |                     |
               |          |                     |--[[31m10.90%[[m--seq_read

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Tested-by: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240708032847.2998158-3-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 23:57:16 -07:00
Jinjie Ruan
22ab08955e
riscv: Fix fp alignment bug in perf_callchain_user()
The standard RISC-V calling convention said:
	"The stack grows downward and the stack pointer is always
	kept 16-byte aligned".

So perf_callchain_user() should check whether 16-byte aligned for fp.

Link: https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf

Fixes: dbeb90b0c1 ("riscv: Add perf callchain support")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240708032847.2998158-2-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 23:57:15 -07:00
Paolo Bonzini
0cdcc99eea Merge tag 'kvm-riscv-6.12-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 6.12

- Fix sbiret init before forwarding to userspace
- Don't zero-out PMU snapshot area before freeing data
- Allow legacy PMU access from guest
- Fix to allow hpmcounter31 from the guest
2024-09-15 02:43:17 -04:00
Stuart Menefy
d6a1928134
riscv: Remove redundant restriction on memory size
The original reason for reserving the top 4GiB of the direct map
(space for modules/BPF/kernel) hasn't applied since the address
map was reworked for KASAN.

Signed-off-by: Stuart Menefy <stuart.menefy@codasip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240624121723.2186279-1-stuart.menefy@codasip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 01:08:56 -07:00
Changbin Du
7587a3602b
riscv: vdso: do not strip debugging info for vdso.so.dbg
The vdso.so.dbg is a debug version of vdso and could be used for debugging
purpose. For example, perf-annotate requires debugging info to show source
lines. So let's keep its debugging info.

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240611040947.3024710-1-changbin.du@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 01:02:30 -07:00
Alice Ryhl
d077242d68 rust: support for shadow call stack sanitizer
Add all of the flags that are needed to support the shadow call stack
(SCS) sanitizer with Rust, and updates Kconfig to allow only
configurations that work.

The -Zfixed-x18 flag is required to use SCS on arm64, and requires rustc
version 1.80.0 or greater. This restriction is reflected in Kconfig.

When CONFIG_DYNAMIC_SCS is enabled, the build will be configured to
include unwind tables in the build artifacts. Dynamic SCS uses the
unwind tables at boot to find all places that need to be patched. The
-Cforce-unwind-tables=y flag ensures that unwind tables are available
for Rust code.

In non-dynamic mode, the -Zsanitizer=shadow-call-stack flag is what
enables the SCS sanitizer. Using this flag requires rustc version 1.82.0
or greater on the targets used by Rust in the kernel. This restriction
is reflected in Kconfig.

It is possible to avoid the requirement of rustc 1.80.0 by using
-Ctarget-feature=+reserve-x18 instead of -Zfixed-x18. However, this flag
emits a warning during the build, so this patch does not add support for
using it and instead requires 1.80.0 or greater.

The dependency is placed on `select HAVE_RUST` to avoid a situation
where enabling Rust silently turns off the sanitizer. Instead, turning
on the sanitizer results in Rust being disabled. We generally do not
want changes to CONFIG_RUST to result in any mitigations being changed
or turned off.

At the time of writing, rustc 1.82.0 only exists via the nightly release
channel. There is a chance that the -Zsanitizer=shadow-call-stack flag
will end up needing 1.83.0 instead, but I think it is small.

Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20240829-shadow-call-stack-v7-1-2f62a4432abf@google.com
[ Fixed indentation using spaces. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-13 00:03:14 +02:00
Linus Torvalds
8581ae1ea0 RISC-V Fixes for 6.11-rc8
* Two fixes for smp_processor_id() calls in preemptible sections: one if
   the perf driver, and one in the fence.i prctl.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmbjDTkTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiRsKD/428yAcDT6JKjG+jzYnMARDxl+OS0hJ
 KKwfeglvpPXWqZoZlyhPgJMJ/fNalrH/dVwl01a3Aobjy8/JV6W6L2Cc/lwRBNXv
 O+nfCzzJoZs+v7D0MR2Ad101vfToafDaiZ9Q6jH4/9yi/m5zAaIHdEnjYQQW5rLc
 FMnfGDHXo40WkrnjEjefeh7n88d5dtyD3szwS8khSnJulGFRHHWW0XL54wVeY3wE
 NvLKdD5PDR7kzOE6BvofPQ1tIJxpWnE+jqSgHmS1eYKlwwzb1kvtGXiktUl/gcwb
 /+jFgK49gqWMpx1ji+utw4R357uIZY0DPIiMvY7K5PjeY+TXtltTBgfyvjklpLWm
 YrZ1snpMiMl/VTxFYVFMCvowK/DvylvSQ737zwMaxTq/DGwvr8GPuRO1Pnwc/3QZ
 wCLy3rsK56LE46+9IrYeauYNiOl1bd6znuASzrPhr8Ecexm2DW+JZqRRZCMl3UTK
 p9hrr3CxUT4hi0bksjkSOBks7cLNp3Rqa/fEBtM6YKKzSU6NlEoNpuiH7sjh7ETE
 Zk4qUjRPiRcje0XL2NTKfpKSGuxK/xP4188aYA5jZ1QxO/zditk8BrS5WRfgEoPr
 gjNapO25nrnJbgi9PaClMdeYNV//Fi/0vjpDjw6pYfJaZSE0sWMqp5/bWt5Bsy2A
 e/OyJ+U4TPY4vQ==
 =pEDD
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - Two fixes for smp_processor_id() calls in preemptible sections: one
   if the perf driver, and one in the fence.i prctl.

* tag 'riscv-for-linus-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF
  drivers: perf: Fix smp_processor_id() use in preemptible code
2024-09-12 13:03:45 -07:00
Palmer Dabbelt
9ea7b92b77
Merge patch series "remove size limit on XIP kernel"
Nam Cao <namcao@linutronix.de> says:

Hi,

For XIP kernel, the writable data section is always at offset specified in
XIP_OFFSET, which is hard-coded to 32MB.

Unfortunately, this means the read-only section (placed before the
writable section) is restricted in size. This causes build failure if the
kernel gets too large.

This series remove the use of XIP_OFFSET one by one, then remove this
macro entirely at the end, with the goal of lifting this size restriction.

Also some cleanup and documentation along the way.

* b4-shazam-merge
  riscv: remove limit on the size of read-only section for XIP kernel
  riscv: drop the use of XIP_OFFSET in create_kernel_page_table()
  riscv: drop the use of XIP_OFFSET in kernel_mapping_va_to_pa()
  riscv: drop the use of XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET
  riscv: drop the use of XIP_OFFSET in XIP_FIXUP_OFFSET
  riscv: replace misleading va_kernel_pa_offset on XIP kernel
  riscv: don't export va_kernel_pa_offset in vmcoreinfo for XIP kernel
  riscv: cleanup XIP_FIXUP macro
  riscv: change XIP's kernel_map.size to be size of the entire kernel
  ...

Link: https://lore.kernel.org/r/cover.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:23:05 -07:00
Nam Cao
b635a84bde
riscv: remove limit on the size of read-only section for XIP kernel
XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size. This causes
build failures if the kernel gets too big [1].

Remove this limit.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404211031.J6l2AfJk-lkp@intel.com [1]
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/3bf3a77be10ebb0d8086c028500baa16e7a8e648.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:23:02 -07:00