Commit Graph

Jinjie Ruan
b3f835cd73
crash: Fix riscv64 crash memory reserve dead loop
On a RISCV64 QEMU machine with 512MB of memory, the cmdline
"crashkernel=500M,high" will cause the system to stall as below:

	 Zone ranges:
	   DMA32    [mem 0x0000000080000000-0x000000009fffffff]
	   Normal   empty
	 Movable zone start for each node
	 Early memory node ranges
	   node   0: [mem 0x0000000080000000-0x000000008005ffff]
	   node   0: [mem 0x0000000080060000-0x000000009fffffff]
	 Initmem setup node 0 [mem 0x0000000080000000-0x000000009fffffff]
	(stall here)

commit 5d99cadf1568 ("crash: fix x86_32 crash memory reserve dead loop
bug") fixed this on 32-bit architectures. However, the problem is not
completely solved. If `CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX` on a
64-bit architecture, for example when system memory is equal to
CRASH_ADDR_LOW_MAX on RISCV64, the following infinite loop will also occur:

	-> reserve_crashkernel_generic() and high is true
	   -> alloc at [CRASH_ADDR_LOW_MAX, CRASH_ADDR_HIGH_MAX] fail
	      -> alloc at [0, CRASH_ADDR_LOW_MAX] fail and repeatedly
	         (because CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX).

As Catalin suggested, do not remove the ",high" reservation fallback to
",low" logic, which would change arm64's kdump behavior; instead, fix it
by skipping the above situation, similar to commit d2f32f23190b ("crash:
fix x86_32 crash memory reserve dead loop").
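
A minimal sketch of the shape of the fix, with names borrowed from
kernel/crash_reserve.c (the exact condition in the merged patch may
differ):

	retry:
		crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
						       search_base, search_end);
		if (!crash_base) {
			/*
			 * Only fall back to [0, CRASH_ADDR_LOW_MAX] when that
			 * range differs from the one just tried; otherwise
			 * the retry would loop forever.
			 */
			if (high && search_end == CRASH_ADDR_HIGH_MAX &&
			    CRASH_ADDR_LOW_MAX < CRASH_ADDR_HIGH_MAX) {
				search_end = CRASH_ADDR_LOW_MAX;
				search_base = 0;
				goto retry;
			}
		}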

After this patch, it prints:
	cannot allocate crashkernel (size:0x1f400000)

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Link: https://lore.kernel.org/r/20240812062017.2674441-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-20 05:58:12 -07:00
Mayuresh Chitale
f0c9363db2
perf/riscv-sbi: Add platform specific firmware event handling
The SBI v2.0 specification pointed to by the link below reserves the
event code 0xffff for platform-specific firmware events. Update the driver
to be able to parse and program such events. Platform-specific firmware
events must now be specified in the perf command as below:

	perf stat -e rCxxx ...

where bits[63:62] = 0x3 of the event config indicate a platform-specific
firmware event and xxx indicates the actual event code, which is passed
as the event data.
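
A hedged sketch of the decoding this implies (the helper names here are
illustrative, not the driver's actual identifiers):

	/* bits[63:62] == 0x3 mark a platform-specific firmware event;
	 * the remaining bits carry the event data. */
	static bool is_platform_fw_event(u64 config)
	{
		return (config >> 62) == 0x3;
	}

	static u64 platform_fw_event_data(u64 config)
	{
		return config & GENMASK_ULL(61, 0);
	}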

Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Link: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/v2.0/riscv-sbi.pdf
Link: https://lore.kernel.org/r/20240812051109.6496-1-mchitale@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-20 05:58:11 -07:00
Palmer Dabbelt
47b9533ccd
Merge patch series "tools: Add barrier implementations for riscv"
Charlie Jenkins <charlie@rivosinc.com> says:

Add support for riscv-specific barrier implementations in the tools
tree, so that fence instructions can be emitted for synchronization.

* b4-shazam-merge:
  tools: Optimize ring buffer for riscv
  tools: Add riscv barrier implementation

Link: https://lore.kernel.org/r/20240806-optimize_ring_buffer_read_riscv-v2-0-ca7e193ae198@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-20 05:58:08 -07:00
Charlie Jenkins
aa5736dc7a
tools: Optimize ring buffer for riscv
Now that the riscv tools tree supports optimized barriers, use them in
the ring buffer.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20240806-optimize_ring_buffer_read_riscv-v2-2-ca7e193ae198@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-20 01:46:47 -07:00
Charlie Jenkins
6d74d178fe
tools: Add riscv barrier implementation
Many of the other architectures have their own custom barrier
implementations. Use the barrier code from the kernel sources to optimize
barriers in tools.
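
For reference, a sketch of the riscv fences involved, mirroring the
kernel's asm/fence.h (the tools-side header layout may differ):

	#define RISCV_FENCE(p, s) \
		({ __asm__ __volatile__ ("fence " #p "," #s : : : "memory"); })

	/* Full barriers order both device (io) and normal (rw) accesses. */
	#define mb()		RISCV_FENCE(iorw, iorw)
	#define rmb()		RISCV_FENCE(ir, ir)
	#define wmb()		RISCV_FENCE(ow, ow)

	/* SMP barriers only need to order normal memory accesses. */
	#define smp_mb()	RISCV_FENCE(rw, rw)
	#define smp_rmb()	RISCV_FENCE(r, r)
	#define smp_wmb()	RISCV_FENCE(w, w)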

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20240806-optimize_ring_buffer_read_riscv-v2-1-ca7e193ae198@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-20 01:46:46 -07:00
Palmer Dabbelt
ad380f6a0a
RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t
I recently ended up with a warning on some compilers along the lines of

      CC      kernel/resource.o
    In file included from include/linux/ioport.h:16,
                     from kernel/resource.c:15:
    kernel/resource.c: In function 'gfr_start':
    include/linux/minmax.h:49:37: error: conversion from 'long long unsigned int' to 'resource_size_t' {aka 'unsigned int'} changes value from '17179869183' to '4294967295' [-Werror=overflow]
       49 |         ({ type ux = (x); type uy = (y); __cmp(op, ux, uy); })
          |                                     ^
    include/linux/minmax.h:52:9: note: in expansion of macro '__cmp_once_unique'
       52 |         __cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_))
          |         ^~~~~~~~~~~~~~~~~
    include/linux/minmax.h:161:27: note: in expansion of macro '__cmp_once'
      161 | #define min_t(type, x, y) __cmp_once(min, type, x, y)
          |                           ^~~~~~~~~~
    kernel/resource.c:1829:23: note: in expansion of macro 'min_t'
     1829 |                 end = min_t(resource_size_t, base->end,
          |                       ^~~~~
    kernel/resource.c: In function 'gfr_continue':
    include/linux/minmax.h:49:37: error: conversion from 'long long unsigned int' to 'resource_size_t' {aka 'unsigned int'} changes value from '17179869183' to '4294967295' [-Werror=overflow]
       49 |         ({ type ux = (x); type uy = (y); __cmp(op, ux, uy); })
          |                                     ^
    include/linux/minmax.h:52:9: note: in expansion of macro '__cmp_once_unique'
       52 |         __cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_))
          |         ^~~~~~~~~~~~~~~~~
    include/linux/minmax.h:161:27: note: in expansion of macro '__cmp_once'
      161 | #define min_t(type, x, y) __cmp_once(min, type, x, y)
          |                           ^~~~~~~~~~
    kernel/resource.c:1847:24: note: in expansion of macro 'min_t'
     1847 |                addr <= min_t(resource_size_t, base->end,
          |                        ^~~~~
    cc1: all warnings being treated as errors

which looks like a real problem: our phys_addr_t is only 32 bits now, so
having 34-bit masks is just going to result in overflows.
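
A sketch of the resulting definition, assuming the rv32 value is simply
capped at 32 bits (see the patch itself for the authoritative change):

	/* arch/riscv/include/asm/sparsemem.h */
	#ifdef CONFIG_64BIT
	#define MAX_PHYSMEM_BITS	56
	#else
	/* phys_addr_t is 32-bit on rv32, so don't claim 34 bits. */
	#define MAX_PHYSMEM_BITS	32
	#endif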

Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240731162159.9235-2-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-20 01:32:39 -07:00
Haibo Xu
732b177663
ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
Currently, only acpi_early_node_map[0] is initialized to NUMA_NO_NODE.
To ensure all the values are properly initialized, switch to initializing
all of them to NUMA_NO_NODE.
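
A sketch of the change, using a designated range initializer:

	/* Before: only element 0 is NUMA_NO_NODE; the rest default to 0,
	 * which is a valid node id. */
	static int acpi_early_node_map[NR_CPUS] __initdata = { NUMA_NO_NODE };

	/* After: every element starts out as NUMA_NO_NODE. */
	static int acpi_early_node_map[NR_CPUS] __initdata = {
		[0 ... NR_CPUS - 1] = NUMA_NO_NODE
	};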

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> (arm64 platform)
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240729035958.1957185-1-haibo1.xu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-20 01:31:26 -07:00
Palmer Dabbelt
5835437609
Merge patch series "riscv: Improve KASAN coverage to fix unit tests"
Samuel Holland <samuel.holland@sifive.com> says:

This series fixes two areas where uninstrumented assembly routines
caused gaps in KASAN coverage on RISC-V, which were caught by KUnit
tests. The KASAN KUnit test suite passes after applying this series.

This series fixes the following test failures:
  # kasan_strings: EXPECTATION FAILED at mm/kasan/kasan_test.c:1520
  KASAN failure expected in "kasan_int_result = strcmp(ptr, "2")", but none occurred
  # kasan_strings: EXPECTATION FAILED at mm/kasan/kasan_test.c:1524
  KASAN failure expected in "kasan_int_result = strlen(ptr)", but none occurred
  not ok 60 kasan_strings
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1531
  KASAN failure expected in "set_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1533
  KASAN failure expected in "clear_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1535
  KASAN failure expected in "clear_bit_unlock(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1536
  KASAN failure expected in "__clear_bit_unlock(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1537
  KASAN failure expected in "change_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1543
  KASAN failure expected in "test_and_set_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1545
  KASAN failure expected in "test_and_set_bit_lock(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1546
  KASAN failure expected in "test_and_clear_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1548
  KASAN failure expected in "test_and_change_bit(nr, addr)", but none occurred
  not ok 61 kasan_bitops_generic

Samuel Holland (2):
  riscv: Omit optimized string routines when using KASAN
  riscv: Enable bitops instrumentation

 arch/riscv/include/asm/bitops.h | 43 ++++++++++++++++++---------------
 arch/riscv/include/asm/string.h |  2 ++
 arch/riscv/kernel/riscv_ksyms.c |  3 ---
 arch/riscv/lib/Makefile         |  2 ++
 arch/riscv/lib/strcmp.S         |  1 +
 arch/riscv/lib/strlen.S         |  1 +
 arch/riscv/lib/strncmp.S        |  1 +
 arch/riscv/purgatory/Makefile   |  2 ++
 8 files changed, 32 insertions(+), 23 deletions(-)

* b4-shazam-merge:
  riscv: Enable bitops instrumentation
  riscv: Omit optimized string routines when using KASAN

Link: https://lore.kernel.org/r/20240801033725.28816-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-19 01:10:44 -07:00
Samuel Holland
77514915b7
riscv: Enable bitops instrumentation
Instead of implementing the bitops functions directly in assembly,
provide the arch_-prefixed versions and use the wrappers from
asm-generic to add instrumentation. This improves KASAN coverage and
fixes the kasan_bitops_generic() unit test.
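
The pattern, sketched (the instrumented wrappers are real asm-generic
headers; the arch_set_bit() body is shown schematically):

	/* Keep the raw AMO implementation under the arch_ prefix... */
	static __always_inline void arch_set_bit(int nr, volatile unsigned long *addr)
	{
		__op_bit(or, __NOP, nr, addr);	/* existing riscv AMO helper */
	}

	/* ...and let asm-generic provide set_bit() etc. with KASAN checks. */
	#include <asm-generic/bitops/instrumented-atomic.h>
	#include <asm-generic/bitops/instrumented-lock.h>
	#include <asm-generic/bitops/instrumented-non-atomic.h>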

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240801033725.28816-3-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-19 01:10:04 -07:00
Samuel Holland
58ff537109
riscv: Omit optimized string routines when using KASAN
The optimized string routines are implemented in assembly, so they are
not instrumented for use with KASAN. Fall back to the C version of the
routines in order to improve KASAN coverage. This fixes the
kasan_strings() unit test.
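
One hedged sketch of what such a fallback can look like (the actual
guards in the patch may differ):

	/* arch/riscv/include/asm/string.h: only advertise the asm routines
	 * when KASAN is off, so the instrumented C versions are used. */
	#ifndef CONFIG_KASAN
	#define __HAVE_ARCH_STRCMP
	extern int strcmp(const char *cs, const char *ct);
	#endif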

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240801033725.28816-2-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-19 01:10:00 -07:00
Hanjun Guo
21d98d658f
ACPI: RISCV: Make acpi_numa_get_nid() to be static
acpi_numa_get_nid() is only called in acpi_numa.c for riscv, so there is
no need to declare it in a header file; make it static and remove the
related declarations from asm/acpi.h.

Spotted by doing some cleanup for arm64 ACPI.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Haibo Xu <haibo1.xu@intel.com>
Link: https://lore.kernel.org/r/20240811031804.3347298-1-guohanjun@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 12:02:48 -07:00
Yunhui Cui
048e2906d4
riscv: Randomize lower bits of stack address
Implement arch_align_stack() to randomize the lower bits
of the stack address.
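
A minimal sketch of such an implementation, modeled on what other
architectures do (the merged patch may choose a different entropy range):

	unsigned long arch_align_stack(unsigned long sp)
	{
		/* Randomize the lower bits of sp within one page. */
		if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
			sp -= get_random_u32_below(PAGE_SIZE);
		return sp & ~0xf;	/* keep the psABI's 16-byte alignment */
	}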

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240625030502.68988-1-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 08:05:10 -07:00
Charlie Jenkins
11c2dbd7f2
selftests: riscv: Allow mmap test to compile on 32-bit
Macros needed for 32-bit compilation were hidden behind 64-bit riscv
ifdefs. Fix the 32-bit build by moving the macros so that the
memory_layout test can run on 32-bit.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: 73d05262a2 ("selftests: riscv: Generalize mm selftests")
Link: https://lore.kernel.org/r/20240808-mmap_tests__fixes-v1-1-b1344b642a84@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 08:05:09 -07:00
Charlie Jenkins
594ffcf4ef
riscv: Make riscv_isa_vendor_ext_andes array static
Since this array is only used in this file, it should be static.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202407241530.ej5SVgX1-lkp@intel.com/
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240807-make_andes_static-v1-1-b64bf4c3d941@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 08:05:08 -07:00
Jinjie Ruan
3cc754c237
riscv: Use LIST_HEAD() to simplify code
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD().
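
For illustration:

	/* Before: declare, then initialize at runtime. */
	struct list_head pages;
	INIT_LIST_HEAD(&pages);

	/* After: a declaration that is already initialized. */
	LIST_HEAD(pages);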

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240904013344.2026738-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 06:26:07 -07:00
Geert Uytterhoeven
e36ddf3226
riscv: defconfig: Disable RZ/Five peripheral support
There is not much point in keeping support for RZ/Five peripherals
enabled, as the RZ/Five platform option (ARCH_R9A07G043) is gated behind
NONPORTABLE.  Hence drop all config options that enable built-in or
modular support for peripherals found on RZ/Five SoCs.

Disable USB_XHCI_RCAR explicitly, as its value defaults to the value of
ARCH_RENESAS, which is still enabled.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/89ad70c7d6e8078208fecfd41dc03f6028531729.1722353710.git.geert+renesas@glider.be
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 06:13:21 -07:00
Jinjie Ruan
983f121499
RISC-V: Implement kgdb_roundup_cpus() to enable future NMI Roundup
Until now, the generic weak kgdb_roundup_cpus() has been used for kgdb on
RISCV. A custom implementation will allow debugging CPUs that are stuck
with interrupts disabled once NMI support lands, and using an IPI is
better than the generic approach since it avoids the potential situation
described in the generic kgdb_call_nmi_hook(). As Andrew pointed out, once
there is NMI support, we can easily extend this and the CPU backtrace
support to use NMIs.
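
A sketch of the idea (the function and IPI names here are hypothetical;
see the patch for the real ones):

	void kgdb_roundup_cpus(void)
	{
		struct cpumask mask;

		/* Unlike the generic smp_call_function() round-up, a raw IPI
		 * can reach CPUs that currently have interrupts disabled. */
		cpumask_copy(&mask, cpu_online_mask);
		cpumask_clear_cpu(raw_smp_processor_id(), &mask);
		riscv_send_debug_ipi(&mask);	/* hypothetical helper */
	}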

After this patch, the kgdb test shows:
	# echo g > /proc/sysrq-trigger
	[2]kdb> btc
	btc: cpu status: Currently on cpu 2
	Available cpus: 0-1(-), 2, 3(-)
	Stack traceback for pid 0
	0xffffffff81c13a40        0        0  1    0   -  0xffffffff81c14510  swapper/0
	CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.10.0-g3120273055b6-dirty #51
	Hardware name: riscv-virtio,qemu (DT)
	Call Trace:
	[<ffffffff80006c48>] dump_backtrace+0x28/0x30
	[<ffffffff80fceb38>] show_stack+0x38/0x44
	[<ffffffff80fe6a04>] dump_stack_lvl+0x58/0x7a
	[<ffffffff80fe6a3e>] dump_stack+0x18/0x20
	[<ffffffff801143fa>] kgdb_cpu_enter+0x682/0x6b2
	[<ffffffff801144ca>] kgdb_nmicallback+0xa0/0xac
	[<ffffffff8000a392>] handle_IPI+0x9c/0x120
	[<ffffffff800a2baa>] handle_percpu_devid_irq+0xa4/0x1e4
	[<ffffffff8009cca8>] generic_handle_domain_irq+0x28/0x36
	[<ffffffff800a9e5c>] ipi_mux_process+0xe8/0x110
	[<ffffffff806e1e30>] imsic_handle_irq+0xf8/0x13a
	[<ffffffff8009cca8>] generic_handle_domain_irq+0x28/0x36
	[<ffffffff806dff12>] riscv_intc_aia_irq+0x2e/0x40
	[<ffffffff80fe6ab0>] handle_riscv_irq+0x54/0x86
	[<ffffffff80ff2e4a>] call_on_irq_stack+0x32/0x40

Rebased on Ryo Takakura's "RISC-V: Enable IPI CPU Backtrace" patch.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240727063438.886155-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 05:52:44 -07:00
Jisheng Zhang
8f1534e744
riscv: avoid imbalance in RAS
Inspired by [1], remove the code that modifies ra, to avoid an imbalanced
RAS (return address stack), which may lead to incorrect predictions on
return.

Link: https://lore.kernel.org/linux-riscv/20240607061335.2197383-1-cyrilbur@tenstorrent.com/ [1]
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com>
Link: https://lore.kernel.org/r/20240720170659.1522-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:58:25 -07:00
Palmer Dabbelt
7e340f4fad
Merge patch series "Svvptc extension to remove preventive sfence.vma"
Alexandre Ghiti <alexghiti@rivosinc.com> says:

In RISC-V, after a new mapping is established, a sfence.vma needs to be
emitted for different reasons:

- if the uarch caches invalid entries, we need to invalidate it otherwise
  we would trap on this invalid entry,
- if the uarch does not cache invalid entries, a reordered access could fail
  to see the new mapping and then trap (sfence.vma acts as a fence).

We can actually avoid emitting those (mostly) useless and costly sfence.vma
by handling the traps instead:

- for new kernel mappings: only vmalloc mappings need to be taken care of,
  other new mappings are rare and already emit the required sfence.vma if
  needed.
  That must be achieved very early in the exception path as explained in
  patch 3, and this also fixes our fragile way of dealing with vmalloc faults.

- for new user mappings: Svvptc makes update_mmu_cache() a no-op but we can
  take some gratuitous page faults (which are very unlikely though).

Patch 1 and 2 introduce Svvptc extension probing.

On our uarch that does not cache invalid entries and a 6.5 kernel, the
gains are measurable:

* Kernel boot:                  6%
* ltp - mmapstress01:           8%
* lmbench - lat_pagefault:      20%
* lmbench - lat_mmap:           5%

Here are the corresponding numbers of sfence.vma emitted:

* Ubuntu boot to login:
Before: ~630k sfence.vma
After:  ~200k sfence.vma

* ltp - mmapstress01
Before: ~45k
After:  ~6.3k

* lmbench - lat_pagefault
Before: ~665k
After:   832 (!)

* lmbench - lat_mmap
Before: ~546k
After:   718 (!)

Thanks to Ved and Matt Evans for triggering the discussion that led to
this patchset!

* b4-shazam-merge:
  riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc
  riscv: Stop emitting preventive sfence.vma for new vmalloc mappings
  dt-bindings: riscv: Add Svvptc ISA extension description
  riscv: Add ISA extension parsing for Svvptc

Link: https://lore.kernel.org/r/20240717060125.139416-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:58:24 -07:00
Steffen Persvold
1845d381f2
riscv: cacheinfo: Add back init_cache_level() function
commit 5944ce092b ("arch_topology: Build cacheinfo from primary CPU")
removed the init_cache_level() function from arch/riscv/kernel/cacheinfo.c
and relies on the init_cpu_topology() function in drivers/base/arch_topology.c
to call fetch_cache_info(), which in turn calls init_of_cache_level() to
populate the cache hierarchy information. However, init_cpu_topology() is only
called from smpboot.c:smp_prepare_cpus() and is thus only available when
CONFIG_SMP is defined.

To let non-SMP kernels still detect the cache hierarchy, add back the
init_cache_level() function. The init_level_allocate_ci() function handles
this gracefully on SMP-enabled kernels anyway, where fetch_cache_info() is
called from init_cpu_topology() earlier in the boot phase.
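
A sketch of the restored hook, assuming it simply defers to the generic
DT parser:

	int init_cache_level(unsigned int cpu)
	{
		return init_of_cache_level(cpu);
	}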

Signed-off-by: Steffen Persvold <spersvold@gmail.com>
Link: https://lore.kernel.org/r/20240707003515.5058-1-spersvold@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:50 -07:00
Jinjie Ruan
cea9d27705
riscv: Remove unused _TIF_WORK_MASK
Since commit f0bddf5058 ("riscv: entry: Convert to generic entry"),
_TIF_WORK_MASK is no longer used, so remove it.

Fixes: f0bddf5058 ("riscv: entry: Convert to generic entry")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Link: https://lore.kernel.org/r/20240711111508.1373322-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:50 -07:00
Palmer Dabbelt
9b2863e2cc
Merge patch series "riscv: select ARCH_USE_SYM_ANNOTATIONS"
Jisheng Zhang <jszhang@kernel.org> says:

Since commit 76329c6939 ("riscv: Use SYM_*() assembly macros instead
of deprecated ones"), most of riscv has been converted to the new-style
SYM_ assembler annotations. The remaining file is SiFive's
errata_cip_453.S, so convert it to the new-style SYM_ annotations as well.
After that, select ARCH_USE_SYM_ANNOTATIONS.

* b4-shazam-merge:
  riscv: select ARCH_USE_SYM_ANNOTATIONS
  riscv: errata: sifive: Use SYM_*() assembly macros

Link: https://lore.kernel.org/r/20240709160536.3690-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:49 -07:00
Xiao Wang
1e206fad76
drivers/perf: riscv: Remove redundant macro check
The macro CONFIG_RISCV_PMU must have been defined when riscv_pmu.c gets
compiled, so this patch removes the redundant check.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240708121224.1148154-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:48 -07:00
Palmer Dabbelt
f25170a053
Merge patch series "riscv: stacktrace: Add USER_STACKTRACE support"
Jinjie Ruan <ruanjinjie@huawei.com> says:

Add RISC-V USER_STACKTRACE support, and, as Björn pointed out, fix the fp
alignment bug in perf_callchain_user() along the way.

* b4-shazam-merge:
  riscv: stacktrace: Add USER_STACKTRACE support
  riscv: Fix fp alignment bug in perf_callchain_user()

Link: https://lore.kernel.org/r/20240708032847.2998158-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:47 -07:00
Jisheng Zhang
5c178472af
riscv: define ILLEGAL_POINTER_VALUE for 64bit
This is used in poison.h as the poison pointer offset. Based on the
current Sv39, Sv48 and Sv57 VM layouts, 0xdead000000000000 is a suitable
value that is not mappable; this can avoid potentially turning an oops
into an exploit.
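
A sketch of the Kconfig shape, mirroring arm64's definition (poison.h
turns this into POISON_POINTER_DELTA):

	config ILLEGAL_POINTER_VALUE
		hex
		default 0 if 32BIT
		default 0xdead000000000000 if 64BIT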

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: fbe934d69e ("RISC-V: Build Infrastructure")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240705170210.3236-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:40 -07:00
Alexandre Ghiti
7a21b2e370
riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc
The preventive sfence.vma instructions were emitted because new mappings
must be made visible to the page table walker. Svvptc guarantees that this
happens within a bounded timeframe, so there is no need for sfence.vma on
uarchs that implement this extension; we instead take gratuitous (but very
unlikely) page faults, similarly to x86 and arm64.
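
A simplified sketch of the userspace side (the real code patches the
branch with ALTERNATIVE rather than testing at runtime):

	static inline void update_mmu_cache(struct vm_area_struct *vma,
					    unsigned long address, pte_t *ptep)
	{
		if (riscv_has_extension_unlikely(RISCV_ISA_EXT_SVVPTC))
			return;	/* visibility is bounded: accept a rare spurious fault */

		local_flush_tlb_page(address);
	}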

This drastically reduces the number of sfence.vma emitted:

* Ubuntu boot to login:
Before: ~630k sfence.vma
After:  ~200k sfence.vma

* ltp - mmapstress01
Before: ~45k
After:  ~6.3k

* lmbench - lat_pagefault
Before: ~665k
After:   832 (!)

* lmbench - lat_mmap
Before: ~546k
After:   718 (!)

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240717060125.139416-5-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:11:05 -07:00
Alexandre Ghiti
503638e0ba
riscv: Stop emitting preventive sfence.vma for new vmalloc mappings
In 6.5, we removed the vmalloc fault path because that can't work (see
[1] [2]). Then in order to make sure that new page table entries were
seen by the page table walker, we had to preventively emit a sfence.vma
on all harts [3], but this solution is very costly since it relies on IPIs.

And even there, we could end up in a loop of vmalloc faults if a vmalloc
allocation is done in the IPI path (for example if it is traced, see
[4]), which could result in a kernel stack overflow.

Those preventive sfence.vma needed to be emitted because:

- if the uarch caches invalid entries, the new mapping may not be
  observed by the page table walker and an invalidation may be needed.
- if the uarch does not cache invalid entries, a reordered access
  could "miss" the new mapping and traps: in that case, we would actually
  only need to retry the access, no sfence.vma is required.

So this patch removes those preventive sfence.vma and actually handles
the possible (and unlikely) exceptions. And since the kernel stack
mappings lie in the vmalloc area, this handling must be done very early
when the trap is taken, at the very beginning of handle_exception: this
also rules out the vmalloc allocations in the fault path.
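
A minimal C sketch of the idea (the real check sits in assembly at the
top of handle_exception; the names here are illustrative):

	/* On a page fault against a vmalloc address, the mapping may exist
	 * but not yet be visible to this hart: fence locally and retry the
	 * faulting instruction instead of entering the fault slow path. */
	static bool vmalloc_fault_fixup(struct pt_regs *regs)
	{
		if (!is_vmalloc_or_module_addr((void *)regs->badaddr))
			return false;

		local_flush_tlb_all();	/* local sfence.vma, no IPI */
		return true;		/* resume and retry the access */
	}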

Link: https://lore.kernel.org/linux-riscv/20230531093817.665799-1-bjorn@kernel.org/ [1]
Link: https://lore.kernel.org/linux-riscv/20230801090927.2018653-1-dylan@andestech.com [2]
Link: https://lore.kernel.org/linux-riscv/20230725132246.817726-1-alexghiti@rivosinc.com/ [3]
Link: https://lore.kernel.org/lkml/20200508144043.13893-1-joro@8bytes.org/ [4]
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240717060125.139416-4-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:11:04 -07:00
Alexandre Ghiti
d25599b593
dt-bindings: riscv: Add Svvptc ISA extension description
Add a description for the Svvptc ISA extension, which was ratified recently.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240717060125.139416-3-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:11:03 -07:00
Alexandre Ghiti
a6efe33cc5
riscv: Add ISA extension parsing for Svvptc
Add support to parse the Svvptc string in the riscv,isa string.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240717060125.139416-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:11:02 -07:00
Jisheng Zhang
7c9d980e46
riscv: select ARCH_USE_SYM_ANNOTATIONS
Now, riscv has been converted to the new style SYM_ assembler
annotations. So select ARCH_USE_SYM_ANNOTATIONS to ensure the
deprecated macros such as ENTRY(), END(), WEAK() and so on are not
available and we don't regress.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-By: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20240709160536.3690-3-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:03:23 -07:00
Jisheng Zhang
6868d12e02
riscv: errata: sifive: Use SYM_*() assembly macros
ENTRY()/END() macros are deprecated and we should make use of the
new SYM_*() macros [1] for better annotation of symbols. Replace the
deprecated ones with the new ones.

[1] https://docs.kernel.org/core-api/asm-annotations.html

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-By: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20240709160536.3690-2-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:03:22 -07:00
Jinjie Ruan
1a74833182
riscv: stacktrace: Add USER_STACKTRACE support
Currently, userstacktrace is unsupported on riscv. Use the
perf_callchain_user() code as a blueprint to implement
arch_stack_walk_user(), which adds userstacktrace support on riscv.
Meanwhile, arch_stack_walk_user() can be used to simplify the
implementation of perf_callchain_user().
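
A condensed sketch of the walker's shape (assumed; the real code also
reports regs->epc first and guards its user accesses):

	struct stackframe {
		unsigned long fp;
		unsigned long ra;
	};

	void arch_stack_walk_user(stack_trace_consume_fn consume_entry, void *cookie,
				  const struct pt_regs *regs)
	{
		unsigned long fp = regs->s0;	/* riscv frame pointer */

		while (fp && !(fp & 0xf)) {	/* psABI: 16-byte aligned */
			struct stackframe frame;

			/* The {fp, ra} pair sits just below the saved fp. */
			if (copy_from_user(&frame, (void __user *)(fp - sizeof(frame)),
					   sizeof(frame)))
				break;
			if (!consume_entry(cookie, frame.ra))
				break;
			fp = frame.fp;
		}
	}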

An ftrace test case is shown below:

	# cd /sys/kernel/debug/tracing
	# echo 1 > options/userstacktrace
	# echo 1 > options/sym-userobj
	# echo 1 > events/sched/sched_process_fork/enable
	# cat trace
	......
	            bash-178     [000] ...1.    97.968395: sched_process_fork: comm=bash pid=178 child_comm=bash child_pid=231
	            bash-178     [000] ...1.    97.970075: <user stack trace>
	 => /lib/libc.so.6[+0xb5090]

A simple perf test is also OK, as below:

	# perf record -e cpu-clock --call-graph fp top
	# perf report --call-graph

	.....
	  66.54%     0.00%  top      [kernel.kallsyms]            [k] ret_from_exception
            |
            ---ret_from_exception
               |
               |--58.97%--do_trap_ecall_u
               |          |
               |          |--17.34%--__riscv_sys_read
               |          |          ksys_read
               |          |          |
               |          |           --16.88%--vfs_read
               |          |                     |
               |          |                     |--10.90%--seq_read

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Tested-by: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240708032847.2998158-3-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 23:57:16 -07:00
Jinjie Ruan
22ab08955e
riscv: Fix fp alignment bug in perf_callchain_user()
The standard RISC-V calling convention says:
	"The stack grows downward and the stack pointer is always
	kept 16-byte aligned".

So perf_callchain_user() should check whether fp is 16-byte aligned.
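
Sketched, the guard in the frame-pointer walk becomes (illustrative form):

	/* Stop unwinding if fp is not 16-byte aligned per the psABI. */
	if (fp & 0xf)
		return;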

Link: https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf

Fixes: dbeb90b0c1 ("riscv: Add perf callchain support")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240708032847.2998158-2-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 23:57:15 -07:00
Stuart Menefy
d6a1928134
riscv: Remove redundant restriction on memory size
The original reason for reserving the top 4GiB of the direct map
(space for modules/BPF/kernel) hasn't applied since the address
map was reworked for KASAN.

Signed-off-by: Stuart Menefy <stuart.menefy@codasip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240624121723.2186279-1-stuart.menefy@codasip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 01:08:56 -07:00
Changbin Du
7587a3602b
riscv: vdso: do not strip debugging info for vdso.so.dbg
The vdso.so.dbg is a debug version of the vdso and can be used for
debugging purposes. For example, perf-annotate requires debugging info to
show source lines. So let's keep its debugging info.

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240611040947.3024710-1-changbin.du@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 01:02:30 -07:00
Palmer Dabbelt
9ea7b92b77
Merge patch series "remove size limit on XIP kernel"
Nam Cao <namcao@linutronix.de> says:

Hi,

For an XIP kernel, the writable data section is always at the offset
specified in XIP_OFFSET, which is hard-coded to 32MB.

Unfortunately, this means the read-only section (placed before the
writable section) is restricted in size. This causes build failures if the
kernel gets too large.

This series removes the uses of XIP_OFFSET one by one, then removes the
macro entirely at the end, with the goal of lifting this size restriction.

Also some cleanup and documentation along the way.

* b4-shazam-merge
  riscv: remove limit on the size of read-only section for XIP kernel
  riscv: drop the use of XIP_OFFSET in create_kernel_page_table()
  riscv: drop the use of XIP_OFFSET in kernel_mapping_va_to_pa()
  riscv: drop the use of XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET
  riscv: drop the use of XIP_OFFSET in XIP_FIXUP_OFFSET
  riscv: replace misleading va_kernel_pa_offset on XIP kernel
  riscv: don't export va_kernel_pa_offset in vmcoreinfo for XIP kernel
  riscv: cleanup XIP_FIXUP macro
  riscv: change XIP's kernel_map.size to be size of the entire kernel
  ...

Link: https://lore.kernel.org/r/cover.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:23:05 -07:00
Nam Cao
b635a84bde
riscv: remove limit on the size of read-only section for XIP kernel
XIP_OFFSET is the hard-coded offset of the writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size. This causes
build failures if the kernel gets too big [1].

Remove this limit.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404211031.J6l2AfJk-lkp@intel.com [1]
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/3bf3a77be10ebb0d8086c028500baa16e7a8e648.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:23:02 -07:00
Nam Cao
a7cfb99943
riscv: drop the use of XIP_OFFSET in create_kernel_page_table()
XIP_OFFSET is the hard-coded offset of the writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded value entirely, stop using
XIP_OFFSET in create_kernel_page_table(). Instead use _sdata and _start to
do the same thing.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/4ea3f222a7eb9f91c04b155ff2e4d3ef19158acc.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:23:01 -07:00
Nam Cao
75fdf791df
riscv: drop the use of XIP_OFFSET in kernel_mapping_va_to_pa()
XIP_OFFSET is the hard-coded offset of the writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded macro XIP_OFFSET entirely,
remove the use of XIP_OFFSET in kernel_mapping_va_to_pa(). The macro
XIP_OFFSET is used in this case to check if the virtual address is mapped
to Flash or to RAM. The same check can be done with kernel_map.xiprom_sz.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/644c13d9467525a06f5d63d157875a35b2edb4bc.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:23:00 -07:00
Nam Cao
23311f57ee
riscv: drop the use of XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET
XIP_OFFSET is the hard-coded offset of the writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded macro XIP_OFFSET entirely, stop
using XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET. Instead, use __data_loc and
_sdata to do the same thing.

While at it, also add a description for XIP_FIXUP_FLASH_OFFSET.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/7b3319657edd1822f3457e7e7c07aaa326cc2f87.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:22:59 -07:00
Nam Cao
e4eac34fed
riscv: drop the use of XIP_OFFSET in XIP_FIXUP_OFFSET
XIP_OFFSET is the hard-coded offset of the writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded macro XIP_OFFSET entirely, stop
using XIP_OFFSET in XIP_FIXUP_OFFSET. Instead, use CONFIG_PHYS_RAM_BASE and
_sdata to do the same thing.

While at it, also add a description for XIP_FIXUP_OFFSET.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/dba0409518b14ee83b346e099b1f7f934daf7b74.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:22:58 -07:00
Nam Cao
5cf0896721
riscv: replace misleading va_kernel_pa_offset on XIP kernel
On an XIP kernel, the name "va_kernel_pa_offset" is misleading: unlike on
a "normal" kernel, it is not the virtual-physical address offset of the
kernel mapping; it is the offset of the kernel mapping's first virtual
address from the first physical address in DRAM, which is not meaningful
because the kernel's first physical address is not in DRAM.

For XIP kernel, there are 2 different offsets because the read-only part of
the kernel resides in ROM while the rest is in RAM. The offset to ROM is in
kernel_map.va_kernel_xip_pa_offset, while the offset to RAM is not stored
anywhere: it is calculated on-the-fly.

Remove this confusing "va_kernel_pa_offset" and add
"va_kernel_xip_data_pa_offset" as its replacement. This new variable is the
offset of the virtual mapping of the kernel's data portion from the
corresponding physical addresses.

With the introduction of this new variable, also rename
va_kernel_xip_pa_offset -> va_kernel_xip_text_pa_offset to make it clear
that this one is about the .text section.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/84e5d005c1386d88d7b2531e0b6707ec5352ee54.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:22:57 -07:00
Nam Cao
f2df5b4fdd
riscv: don't export va_kernel_pa_offset in vmcoreinfo for XIP kernel
The crash utility uses va_kernel_pa_offset to translate virtual addresses.
This is incorrect in the case of an XIP kernel, because va_kernel_pa_offset is
not the virtual-physical address offset (yes, the name is misleading; this
variable will be removed for XIP in a following commit).

Stop exporting this variable for XIP kernels. The replacement is to be
determined; note it as a TODO for now.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/8f8760d3f9a11af4ea0acbc247e4f49ff5d317e9.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:22:56 -07:00
Nam Cao
aa3457f22f
riscv: cleanup XIP_FIXUP macro
The XIP_FIXUP macro is used to fix up addresses early during boot, before
the MMU is enabled: generated code "thinks" the data section is in ROM
while it is actually in RAM. So this macro corrects the addresses in the
data section.

This macro determines if the address needs to be fixed by checking if it is
within the range starting from ROM address up to the size of (2 *
XIP_OFFSET).

This means if the kernel size is bigger than (2 * XIP_OFFSET), some
addresses would not be fixed up.

An XIP kernel can still work if the above scenario does not happen, but
this macro is obviously incorrect.

Rewrite this macro to only fix up addresses within the data section.
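
For reference, the shape of the old check (a sketch of the macro in
asm/pgtable.h):

	/* Old: fix up anything within 2 * XIP_OFFSET of the ROM base. */
	#define XIP_FIXUP(addr) ({						\
		uintptr_t __a = (uintptr_t)(addr);				\
		(__a >= CONFIG_XIP_PHYS_ADDR &&					\
		 __a < CONFIG_XIP_PHYS_ADDR + XIP_OFFSET * 2) ?			\
			__a - CONFIG_XIP_PHYS_ADDR + CONFIG_PHYS_RAM_BASE	\
				- XIP_OFFSET : __a;				\
		})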

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/95f50a4ec8204ec4fcbf2a80c9addea0e0609e3b.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:22:55 -07:00
Charlie Jenkins
4ffc8a3422
riscv: Add license to vmalloc.h
Add a missing license to vmalloc.h.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240729-riscv_fence_license-v1-2-7d5648069640@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-03 07:18:34 -07:00
Charlie Jenkins
097c72e1f2
riscv: Add license to fence.h
Add a missing license to fence.h.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240729-riscv_fence_license-v1-1-7d5648069640@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-03 07:18:33 -07:00
Jinjie Ruan
0e3f3649d4
riscv: Enable generic CPU vulnerabilities support
Currently x86, ARM and ARM64 support generic CPU vulnerabilities
reporting, but RISC-V does not, as shown below:

	# cd /sys/devices/system/cpu/vulnerabilities/
x86:
	# cat spec_store_bypass
		Mitigation: Speculative Store Bypass disabled via prctl and seccomp
	# cat meltdown
		Not affected

ARM64:

	# cat spec_store_bypass
		Mitigation: Speculative Store Bypass disabled via prctl and seccomp
	# cat meltdown
		Mitigation: PTI

RISC-V:

	# cat /sys/devices/system/cpu/vulnerabilities
	# ... No such file or directory

As SiFive RISC-V Core IP offerings are not affected by Meltdown and
Spectre, RISC-V can use the default weak functions, as below:
	# cat spec_store_bypass
		Not affected
	# cat meltdown
		Not affected

Link: https://www.sifive.cn/blog/sifive-statement-on-meltdown-and-spectre
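
The enablement itself amounts to a one-line Kconfig change (sketch):

	# arch/riscv/Kconfig: opt in to the generic sysfs interface.
	config RISCV
		select GENERIC_CPU_VULNERABILITIES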

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/20240703022732.2068316-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 17:44:34 -07:00
Ying Sun
c6ebf2c528
riscv/kexec_file: Fix relocation type R_RISCV_ADD16 and R_RISCV_SUB16 unknown
Running on a kernel with CONFIG_RISCV_ALTERNATIVE enabled:
  kexec -sl vmlinux

Error:
  kexec_image: Unknown rela relocation: 34
  kexec_image: Error loading purgatory ret=-8
and
  kexec_image: Unknown rela relocation: 38
  kexec_image: Error loading purgatory ret=-8

The purgatory code uses the 16-bit addition and subtraction relocation
types, but they are not handled, resulting in kexec_file_load failure.
So add handling for them to arch_kexec_apply_relocations_add().
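
A sketch of the added cases inside arch_kexec_apply_relocations_add()'s
relocation switch, mirroring how the module loader applies these types
(rela type 34 is R_RISCV_ADD16, 38 is R_RISCV_SUB16):

	case R_RISCV_ADD16:
		*(u16 *)loc += val;
		break;
	case R_RISCV_SUB16:
		*(u16 *)loc -= val;
		break;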

Tested on RISC-V64 QEMU virt; issue fixed.

Co-developed-by: Petr Tesarik <petr@tesarici.cz>
Signed-off-by: Petr Tesarik <petr@tesarici.cz>
Signed-off-by: Ying Sun <sunying@isrc.iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240711083236.2859632-1-sunying@isrc.iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 17:44:33 -07:00
Nam Cao
57d76bc51f
riscv: change XIP's kernel_map.size to be size of the entire kernel
With an XIP kernel, kernel_map.size is set to only the size of the data
part of the kernel. This is inconsistent with a "normal" kernel, which
sets it to the size of the entire kernel.

More importantly, XIP kernel fails to boot if CONFIG_DEBUG_VIRTUAL is
enabled, because there are checks on virtual addresses with the assumption
that kernel_map.size is the size of the entire kernel (these checks are in
arch/riscv/mm/physaddr.c).

Change XIP's kernel_map.size to be the size of the entire kernel.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: <stable@vger.kernel.org> # v6.1+
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240508191917.2892064-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:12:33 -07:00
Celeste Liu
6111939463
riscv: entry: always initialize regs->a0 to -ENOSYS
Otherwise, when the tracer changes the syscall number to -1, the kernel
fails to initialize a0 with -ENOSYS and subsequently fails to return the
error code of the failed syscall to userspace. For example, this breaks
strace syscall tampering.
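
A sketch of the intended ordering in the syscall entry path (simplified
from do_trap_ecall_u; not the verbatim diff):

	syscall = syscall_enter_from_user_mode(regs, syscall);

	/* Set the default return value unconditionally, so a syscall
	 * number rewritten to -1 by a tracer still returns -ENOSYS. */
	regs->a0 = -ENOSYS;

	if (syscall >= 0 && syscall < NR_syscalls)
		syscall_handler(regs, syscall);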

Fixes: 52449c17bd ("riscv: entry: set a0 = -ENOSYS only when syscall != -1")
Reported-by: "Dmitry V. Levin" <ldv@strace.io>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
Link: https://lore.kernel.org/r/20240627142338.5114-2-CoelacanthusHex@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:12:22 -07:00