Commit Graph

1795 Commits

Author SHA1 Message Date
Peter Zijlstra
9406415f46 sched/debug: Rename the sched_debug parameter to sched_verbose
CONFIG_SCHED_DEBUG is the build-time Kconfig knob, the boot param
sched_debug and the /debug/sched/debug_enabled knobs control the
sched_debug_enabled variable, but what they really do is make
SCHED_DEBUG more verbose, so rename the lot.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2021-04-17 13:22:44 +02:00
Catalin Marinas
a1e1eddef2 Merge branches 'for-next/misc', 'for-next/kselftest', 'for-next/xntable', 'for-next/vdso', 'for-next/fiq', 'for-next/epan', 'for-next/kasan-vmalloc', 'for-next/fgt-boot-init', 'for-next/vhe-only' and 'for-next/neon-softirqs-disabled', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* for-next/misc:
  : Miscellaneous patches
  arm64/sve: Add compile time checks for SVE hooks in generic functions
  arm64/kernel/probes: Use BUG_ON instead of if condition followed by BUG.
  arm64/sve: Remove redundant system_supports_sve() tests
  arm64: mte: Remove unused mte_assign_mem_tag_range()
  arm64: Add __init section marker to some functions
  arm64/sve: Rework SVE access trap to convert state in registers
  docs: arm64: Fix a grammar error
  arm64: smp: Add missing prototype for some smp.c functions
  arm64: setup: name `tcr` register
  arm64: setup: name `mair` register
  arm64: stacktrace: Move start_backtrace() out of the header
  arm64: barrier: Remove spec_bar() macro
  arm64: entry: remove test_irqs_unmasked macro
  ARM64: enable GENERIC_FIND_FIRST_BIT
  arm64: defconfig: Use DEBUG_INFO_REDUCED

* for-next/kselftest:
  : Various kselftests for arm64
  kselftest: arm64: Add BTI tests
  kselftest/arm64: mte: Report filename on failing temp file creation
  kselftest/arm64: mte: Fix clang warning
  kselftest/arm64: mte: Makefile: Fix clang compilation
  kselftest/arm64: mte: Output warning about failing compiler
  kselftest/arm64: mte: Use cross-compiler if specified
  kselftest/arm64: mte: Fix MTE feature detection
  kselftest/arm64: mte: common: Fix write() warnings
  kselftest/arm64: mte: user_mem: Fix write() warning
  kselftest/arm64: mte: ksm_options: Fix fscanf warning
  kselftest/arm64: mte: Fix pthread linking
  kselftest/arm64: mte: Fix compilation with native compiler

* for-next/xntable:
  : Add hierarchical XN permissions for all page tables
  arm64: mm: use XN table mapping attributes for user/kernel mappings
  arm64: mm: use XN table mapping attributes for the linear region
  arm64: mm: add missing P4D definitions and use them consistently

* for-next/vdso:
  : Minor improvements to the compat vdso and sigpage
  arm64: compat: Poison the compat sigpage
  arm64: vdso: Avoid ISB after reading from cntvct_el0
  arm64: compat: Allow signal page to be remapped
  arm64: vdso: Remove redundant calls to flush_dcache_page()
  arm64: vdso: Use GFP_KERNEL for allocating compat vdso and signal pages

* for-next/fiq:
  : Support arm64 FIQ controller registration
  arm64: irq: allow FIQs to be handled
  arm64: Always keep DAIF.[IF] in sync
  arm64: entry: factor irq triage logic into macros
  arm64: irq: rework root IRQ handler registration
  arm64: don't use GENERIC_IRQ_MULTI_HANDLER
  genirq: Allow architectures to override set_handle_irq() fallback

* for-next/epan:
  : Support for Enhanced PAN (execute-only permissions)
  arm64: Support execute-only permissions with Enhanced PAN

* for-next/kasan-vmalloc:
  : Support CONFIG_KASAN_VMALLOC on arm64
  arm64: Kconfig: select KASAN_VMALLOC if KANSAN_GENERIC is enabled
  arm64: kaslr: support randomized module area with KASAN_VMALLOC
  arm64: Kconfig: support CONFIG_KASAN_VMALLOC
  arm64: kasan: abstract _text and _end to KERNEL_START/END
  arm64: kasan: don't populate vmalloc area for CONFIG_KASAN_VMALLOC

* for-next/fgt-boot-init:
  : Booting clarifications and fine grained traps setup
  arm64: Require that system registers at all visible ELs be initialized
  arm64: Disable fine grained traps on boot
  arm64: Document requirements for fine grained traps at boot

* for-next/vhe-only:
  : Dealing with VHE-only CPUs (a.k.a. M1)
  arm64: Get rid of CONFIG_ARM64_VHE
  arm64: Cope with CPUs stuck in VHE mode
  arm64: cpufeature: Allow early filtering of feature override

* arm64/for-next/perf:
  arm64: perf: Remove redundant initialization in perf_event.c
  perf/arm_pmu_platform: Clean up with dev_printk
  perf/arm_pmu_platform: Fix error handling
  perf/arm_pmu_platform: Use dev_err_probe() for IRQ errors
  docs: perf: Address some html build warnings
  docs: perf: Add new description on HiSilicon uncore PMU v2
  drivers/perf: hisi: Add support for HiSilicon PA PMU driver
  drivers/perf: hisi: Add support for HiSilicon SLLC PMU driver
  drivers/perf: hisi: Update DDRC PMU for programmable counter
  drivers/perf: hisi: Add new functions for HHA PMU
  drivers/perf: hisi: Add new functions for L3C PMU
  drivers/perf: hisi: Add PMU version for uncore PMU drivers.
  drivers/perf: hisi: Refactor code for more uncore PMUs
  drivers/perf: hisi: Remove unnecessary check of counter index
  drivers/perf: Simplify the SMMUv3 PMU event attributes
  drivers/perf: convert sysfs sprintf family to sysfs_emit
  drivers/perf: convert sysfs scnprintf family to sysfs_emit_at() and sysfs_emit()
  drivers/perf: convert sysfs snprintf family to sysfs_emit

* for-next/neon-softirqs-disabled:
  : Run kernel mode SIMD with softirqs disabled
  arm64: fpsimd: run kernel mode NEON with softirqs disabled
  arm64: assembler: introduce wxN aliases for wN registers
  arm64: assembler: remove conditional NEON yield macros
2021-04-15 14:00:38 +01:00
Sumit Garg
5d0682be31 KEYS: trusted: Add generic trusted keys framework
Current trusted keys framework is tightly coupled to use TPM device as
an underlying implementation which makes it difficult for implementations
like Trusted Execution Environment (TEE) etc. to provide trusted keys
support in case platform doesn't posses a TPM device.

Add a generic trusted keys framework where underlying implementations
can be easily plugged in. Create struct trusted_key_ops to achieve this,
which contains necessary functions of a backend.

Also, define a module parameter in order to select a particular trust
source in case a platform support multiple trust sources. In case its
not specified then implementation itetrates through trust sources list
starting with TPM and assign the first trust source as a backend which
has initiazed successfully during iteration.

Note that current implementation only supports a single trust source at
runtime which is either selectable at compile time or during boot via
aforementioned module parameter.

Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14 16:30:30 +03:00
Thorsten Leemhuis
6161a4b18a docs: reporting-issues: make people CC the regressions list
Make people CC the recently created mailing list dedicated to Linux
kernel regressions when reporting one. Some paragraphs had to be
reshuffled and slightly rewritten during the process, as the text
otherwise would have gotten unnecessarily hard to follow.

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/ac28089d710d5d41f295221bc726555ba32f4984.1617967127.git.linux@leemhuis.info
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-04-13 15:13:27 -06:00
Marc Zyngier
2d726d0db6 arm64: Get rid of CONFIG_ARM64_VHE
CONFIG_ARM64_VHE was introduced with ARMv8.1 (some 7 years ago),
and has been enabled by default for almost all that time.

Given that newer systems that are VHE capable are finally becoming
available, and that some systems are even incapable of not running VHE,
drop the configuration altogether.

Anyone willing to stick to non-VHE on VHE hardware for obscure
reasons should use the 'kvm-arm.mode=nvhe' command-line option.

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210408131010.1109027-4-maz@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-08 18:45:16 +01:00
Kees Cook
39218ff4c6 stack: Optionally randomize kernel stack offset each syscall
This provides the ability for architectures to enable kernel stack base
address offset randomization. This feature is controlled by the boot
param "randomize_kstack_offset=on/off", with its default value set by
CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT.

This feature is based on the original idea from the last public release
of PaX's RANDKSTACK feature: https://pax.grsecurity.net/docs/randkstack.txt
All the credit for the original idea goes to the PaX team. Note that
the design and implementation of this upstream randomize_kstack_offset
feature differs greatly from the RANDKSTACK feature (see below).

Reasoning for the feature:

This feature aims to make harder the various stack-based attacks that
rely on deterministic stack structure. We have had many such attacks in
past (just to name few):

https://jon.oberheide.org/files/infiltrate12-thestackisback.pdf
https://jon.oberheide.org/files/stackjacking-infiltrate11.pdf
https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html

As Linux kernel stack protections have been constantly improving
(vmap-based stack allocation with guard pages, removal of thread_info,
STACKLEAK), attackers have had to find new ways for their exploits
to work. They have done so, continuing to rely on the kernel's stack
determinism, in situations where VMAP_STACK and THREAD_INFO_IN_TASK_STRUCT
were not relevant. For example, the following recent attacks would have
been hampered if the stack offset was non-deterministic between syscalls:

https://repositorio-aberto.up.pt/bitstream/10216/125357/2/374717.pdf
(page 70: targeting the pt_regs copy with linear stack overflow)

https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
(leaked stack address from one syscall as a target during next syscall)

The main idea is that since the stack offset is randomized on each system
call, it is harder for an attack to reliably land in any particular place
on the thread stack, even with address exposures, as the stack base will
change on the next syscall. Also, since randomization is performed after
placing pt_regs, the ptrace-based approach[1] to discover the randomized
offset during a long-running syscall should not be possible.

Design description:

During most of the kernel's execution, it runs on the "thread stack",
which is pretty deterministic in its structure: it is fixed in size,
and on every entry from userspace to kernel on a syscall the thread
stack starts construction from an address fetched from the per-cpu
cpu_current_top_of_stack variable. The first element to be pushed to the
thread stack is the pt_regs struct that stores all required CPU registers
and syscall parameters. Finally the specific syscall function is called,
with the stack being used as the kernel executes the resulting request.

The goal of randomize_kstack_offset feature is to add a random offset
after the pt_regs has been pushed to the stack and before the rest of the
thread stack is used during the syscall processing, and to change it every
time a process issues a syscall. The source of randomness is currently
architecture-defined (but x86 is using the low byte of rdtsc()). Future
improvements for different entropy sources is possible, but out of scope
for this patch. Further more, to add more unpredictability, new offsets
are chosen at the end of syscalls (the timing of which should be less
easy to measure from userspace than at syscall entry time), and stored
in a per-CPU variable, so that the life of the value does not stay
explicitly tied to a single task.

As suggested by Andy Lutomirski, the offset is added using alloca()
and an empty asm() statement with an output constraint, since it avoids
changes to assembly syscall entry code, to the unwinder, and provides
correct stack alignment as defined by the compiler.

In order to make this available by default with zero performance impact
for those that don't want it, it is boot-time selectable with static
branches. This way, if the overhead is not wanted, it can just be
left turned off with no performance impact.

The generated assembly for x86_64 with GCC looks like this:

...
ffffffff81003977: 65 8b 05 02 ea 00 7f  mov %gs:0x7f00ea02(%rip),%eax
					    # 12380 <kstack_offset>
ffffffff8100397e: 25 ff 03 00 00        and $0x3ff,%eax
ffffffff81003983: 48 83 c0 0f           add $0xf,%rax
ffffffff81003987: 25 f8 07 00 00        and $0x7f8,%eax
ffffffff8100398c: 48 29 c4              sub %rax,%rsp
ffffffff8100398f: 48 8d 44 24 0f        lea 0xf(%rsp),%rax
ffffffff81003994: 48 83 e0 f0           and $0xfffffffffffffff0,%rax
...

As a result of the above stack alignment, this patch introduces about
5 bits of randomness after pt_regs is spilled to the thread stack on
x86_64, and 6 bits on x86_32 (since its has 1 fewer bit required for
stack alignment). The amount of entropy could be adjusted based on how
much of the stack space we wish to trade for security.

My measure of syscall performance overhead (on x86_64):

lmbench: /usr/lib/lmbench/bin/x86_64-linux-gnu/lat_syscall -N 10000 null
    randomize_kstack_offset=y	Simple syscall: 0.7082 microseconds
    randomize_kstack_offset=n	Simple syscall: 0.7016 microseconds

So, roughly 0.9% overhead growth for a no-op syscall, which is very
manageable. And for people that don't want this, it's off by default.

There are two gotchas with using the alloca() trick. First,
compilers that have Stack Clash protection (-fstack-clash-protection)
enabled by default (e.g. Ubuntu[3]) add pagesize stack probes to
any dynamic stack allocations. While the randomization offset is
always less than a page, the resulting assembly would still contain
(unreachable!) probing routines, bloating the resulting assembly. To
avoid this, -fno-stack-clash-protection is unconditionally added to
the kernel Makefile since this is the only dynamic stack allocation in
the kernel (now that VLAs have been removed) and it is provably safe
from Stack Clash style attacks.

The second gotcha with alloca() is a negative interaction with
-fstack-protector*, in that it sees the alloca() as an array allocation,
which triggers the unconditional addition of the stack canary function
pre/post-amble which slows down syscalls regardless of the static
branch. In order to avoid adding this unneeded check and its associated
performance impact, architectures need to carefully remove uses of
-fstack-protector-strong (or -fstack-protector) in the compilation units
that use the add_random_kstack() macro and to audit the resulting stack
mitigation coverage (to make sure no desired coverage disappears). No
change is visible for this on x86 because the stack protector is already
unconditionally disabled for the compilation unit, but the change is
required on arm64. There is, unfortunately, no attribute that can be
used to disable stack protector for specific functions.

Comparison to PaX RANDKSTACK feature:

The RANDKSTACK feature randomizes the location of the stack start
(cpu_current_top_of_stack), i.e. including the location of pt_regs
structure itself on the stack. Initially this patch followed the same
approach, but during the recent discussions[2], it has been determined
to be of a little value since, if ptrace functionality is available for
an attacker, they can use PTRACE_PEEKUSR/PTRACE_POKEUSR to read/write
different offsets in the pt_regs struct, observe the cache behavior of
the pt_regs accesses, and figure out the random stack offset. Another
difference is that the random offset is stored in a per-cpu variable,
rather than having it be per-thread. As a result, these implementations
differ a fair bit in their implementation details and results, though
obviously the intent is similar.

[1] https://lore.kernel.org/kernel-hardening/2236FBA76BA1254E88B949DDB74E612BA4BC57C1@IRSMSX102.ger.corp.intel.com/
[2] https://lore.kernel.org/kernel-hardening/20190329081358.30497-1-elena.reshetova@intel.com/
[3] https://lists.ubuntu.com/archives/ubuntu-devel/2019-June/040741.html

Co-developed-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210401232347.2791257-4-keescook@chromium.org
2021-04-08 14:05:19 +02:00
Maciej W. Rozycki
7d33004d24 pata_legacy: Add `probe_mask' parameter like with ide-generic
Carry the `probe_mask' parameter over from ide-generic to pata_legacy so
that there is a way to prevent random poking at ISA port I/O locations
in attempt to discover adapter option cards with libata like with the
old IDE driver.  By default all enabled locations are tried, however it
may interfere with a different kind of hardware responding there.

For example with a plain (E)ISA system the driver tries all the six
possible locations:

scsi host0: pata_legacy
ata1: PATA max PIO4 cmd 0x1f0 ctl 0x3f6 irq 14
ata1.00: ATA-4: ST310211A, 3.54, max UDMA/100
ata1.00: 19541088 sectors, multi 16: LBA
ata1.00: configured for PIO
scsi 0:0:0:0: Direct-Access     ATA      ST310211A        3.54 PQ: 0 ANSI: 5
scsi 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:0:0: [sda] 19541088 512-byte logical blocks: (10.0 GB/9.32 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3
sd 0:0:0:0: [sda] Attached SCSI disk
scsi host1: pata_legacy
ata2: PATA max PIO4 cmd 0x170 ctl 0x376 irq 15
scsi host1: pata_legacy
ata3: PATA max PIO4 cmd 0x1e8 ctl 0x3ee irq 11
scsi host1: pata_legacy
ata4: PATA max PIO4 cmd 0x168 ctl 0x36e irq 10
scsi host1: pata_legacy
ata5: PATA max PIO4 cmd 0x1e0 ctl 0x3e6 irq 8
scsi host1: pata_legacy
ata6: PATA max PIO4 cmd 0x160 ctl 0x366 irq 12

however giving the kernel "pata_legacy.probe_mask=21" makes it try every
other location only:

scsi host0: pata_legacy
ata1: PATA max PIO4 cmd 0x1f0 ctl 0x3f6 irq 14
ata1.00: ATA-4: ST310211A, 3.54, max UDMA/100
ata1.00: 19541088 sectors, multi 16: LBA
ata1.00: configured for PIO
scsi 0:0:0:0: Direct-Access     ATA      ST310211A        3.54 PQ: 0 ANSI: 5
scsi 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:0:0: [sda] 19541088 512-byte logical blocks: (10.0 GB/9.32 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3
sd 0:0:0:0: [sda] Attached SCSI disk
scsi host1: pata_legacy
ata2: PATA max PIO4 cmd 0x1e8 ctl 0x3ee irq 11
scsi host1: pata_legacy
ata3: PATA max PIO4 cmd 0x1e0 ctl 0x3e6 irq 8

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2103211800110.21463@angie.orcam.me.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-06 09:27:30 -06:00
Maciej W. Rozycki
6ddcec9547 pata_platform: Document `pio_mask' module parameter
Add MODULE_PARM_DESC documentation and a kernel-parameters.txt entry.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2103212023190.21463@angie.orcam.me.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-06 09:27:30 -06:00
Maciej W. Rozycki
426e2c6a2c pata_legacy: Properly document module parameters
Most pata_legacy module parameters lack MODULE_PARM_DESC documentation
and none is described in kernel-parameters.txt.  Also several comments
are inaccurate or wrong.

Add the missing documentation pieces then and reorder parameters into a
consistent block.  Remove inaccuracies as follows:

- `all' affects primary and secondary port ranges only rather than all,

- `probe_all' affects tertiary and further port ranges rather than all,

- `ht6560b' is for HT 6560B rather than HT 6560A.

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2103211909560.21463@angie.orcam.me.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-06 09:27:30 -06:00
Vipin Sharma
25259fc914 cgroup: Miscellaneous cgroup documentation.
Documentation of miscellaneous cgroup controller. This new controller is
used to track and limit the usage of scalar resources.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Rientjes <rientjes@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2021-04-04 13:34:46 -04:00
Christophe Leroy
51c66ad849 powerpc/bpf: Implement extended BPF on PPC32
Implement Extended Berkeley Packet Filter on Powerpc 32

Test result with test_bpf module:

	test_bpf: Summary: 378 PASSED, 0 FAILED, [354/366 JIT'ed]

Registers mapping:

	[BPF_REG_0] = r11-r12
	/* function arguments */
	[BPF_REG_1] = r3-r4
	[BPF_REG_2] = r5-r6
	[BPF_REG_3] = r7-r8
	[BPF_REG_4] = r9-r10
	[BPF_REG_5] = r21-r22 (Args 9 and 10 come in via the stack)
	/* non volatile registers */
	[BPF_REG_6] = r23-r24
	[BPF_REG_7] = r25-r26
	[BPF_REG_8] = r27-r28
	[BPF_REG_9] = r29-r30
	/* frame pointer aka BPF_REG_10 */
	[BPF_REG_FP] = r17-r18
	/* eBPF jit internal registers */
	[BPF_REG_AX] = r19-r20
	[TMP_REG] = r31

As PPC32 doesn't have a redzone in the stack, a stack frame must always
be set in order to host at least the tail count counter.

The stack frame remains for tail calls, it is set by the first callee
and freed by the last callee.

r0 is used as temporary register as much as possible. It is referenced
directly in the code in order to avoid misusing it, because some
instructions interpret it as value 0 instead of register r0
(ex: addi, addis, stw, lwz, ...)

The following operations are not implemented:

		case BPF_ALU64 | BPF_DIV | BPF_X: /* dst /= src */
		case BPF_ALU64 | BPF_MOD | BPF_X: /* dst %= src */
		case BPF_STX | BPF_XADD | BPF_DW: /* *(u64 *)(dst + off) += src */

The following operations are only implemented for power of two constants:

		case BPF_ALU64 | BPF_MOD | BPF_K: /* dst %= imm */
		case BPF_ALU64 | BPF_DIV | BPF_K: /* dst /= imm */

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/61d8b149176ddf99e7d5cef0b6dc1598583ca202.1616430991.git.christophe.leroy@csgroup.eu
2021-04-03 21:22:21 +11:00
Ismael Luceno
0e5e0a5553 docs: reporting-issues: Remove reference to oldnoconfig
Replace it with olddefconfig. oldnoconfig didn't do what the document
suggests (it aliased to olddefconfig), and isn't available since 4.19.

Ref: 04c459d204 ("kconfig: remove oldnoconfig target")
Ref: 312ee68752 ("kconfig: announce removal of oldnoconfig if used")
Signed-off-by: Ismael Luceno <ismael@iodev.co.uk>
Link: https://lore.kernel.org/r/20210331163541.28356-1-ismael@iodev.co.uk
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-31 14:33:45 -06:00
Wang Qing
c5c1c700e2 doc: admin-guide: remove explanation of "watchdog/%u"
"watchdog/%u" threads has be replaced by cpu_stop_work,
which will mislead the reader.

Signed-off-by: Wang Qing <wangqing@vivo.com>
Link: https://lore.kernel.org/r/1615801744-31548-1-git-send-email-wangqing@vivo.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-31 14:32:02 -06:00
Mark O'Donovan
abb9c07885 Documentation: Add leading slash to some paths
Change multiple sys/xyz to /sys/xyz

Signed-off-by: Mark O'Donovan <shiftee@posteo.net>
Link: https://lore.kernel.org/r/20210328152837.73347-1-shiftee@posteo.net
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-31 13:49:19 -06:00
Thorsten Leemhuis
58c539453b docs: reporting-issues: reduce quoting and assorted fixes
A pile of small fixes:

- don't quote terms like vanilla, mainline, and stable, unless in they
  occur in places where readers new to the kernel might see them for the
  first time

- make people rule out that vendor patches are interfering if they face
  a regression in a stable or longterm kernel they saw in a vendor
  kernel for the first time

- s/bugs/issues/ in a selected spots

- exchange two headlines that got mixed up somehow

- add a few links to some of the steps in the guide

- Greg mentioned sending reports to the stable mailing list is
  sufficient, so remove the "CC stable maintainers" bits

- fix a few typos and mistakes in the text, with a few very small
  improvements along the way

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/07bca15d8465b8e234537feb8841dd2ff20243bc.1617113469.git.linux@leemhuis.info
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-31 13:47:07 -06:00
Thorsten Leemhuis
4d2f46a8cd docs: reporting-issues.rst: reshuffle and improve TLDR
Make the TLDR a bit shorter while improving it at the same time by going
straight to the aspects readers are more interested it. The change makes
the process especially more straight-forward for people that hit a
regression in a stable or longterm kernel. Due to the changes the TLDR
now also matches the step by step guide better.

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/762ccd7735315d2fdaa79612fccc1f474881118b.1617113469.git.linux@leemhuis.info
[ jc: fixed transposed _` as noted by Thorsten ]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-31 13:46:33 -06:00
Thorsten Leemhuis
d2ce285378 docs: make reporting-issues.rst official and delete reporting-bugs.rst
Remove the WIP and two FIXME notes in the text to make it official, as
it's now considered fully ready for consumption. To make sure this
step is okay for people the intent of this change and the latest version
of the text were posted to ksummit-discuss; nobody complained, thus
lets move ahead.

Add a footer to point out people can contact Thorsten directly in case
they find something to improve in the text.

Dear reporting-bugs.rst, I'm sorry to tell you, but that makes you fully
obsolete and we thus have to let you go now. Thank you very much for
your service, you in one form or another have been around for a long
time. I'm sure over the years you got read a lot and helped quite a few
people. But it's time to retire now. Rest in peace.

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: Harry Wei <harryxiyou@gmail.com>
CC: Alex Shi <alex.shi@linux.alibaba.com>
CC: Federico Vaga <federico.vaga@vaga.pv.it>
CC: Greg KH <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/49c674c2d304d87e6259063580fda05267e8c348.1617113469.git.linux@leemhuis.info
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-31 13:33:17 -06:00
Mukesh Ojha
9d843e8faf pstore: Add mem_type property DT parsing support
There could be a scenario where we define some region
in normal memory and use them store to logs which is later
retrieved by bootloader during warm reset.

In this scenario, we wanted to treat this memory as normal
cacheable memory instead of default behaviour which
is an overhead. Making it cacheable could improve
performance.

This commit gives control to change mem_type from Device
tree, and also documents the value for normal memory.

Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/1616438537-13719-1-git-send-email-mojha@codeaurora.org
2021-03-31 10:06:23 -07:00
Qi Liu
b88f5e9792 docs: perf: Address some html build warnings
Fix following html build warnings:
Documentation/admin-guide/perf/hisi-pmu.rst:61: WARNING: Unexpected indentation.
Documentation/admin-guide/perf/hisi-pmu.rst:62: WARNING: Block quote ends without a blank line; unexpected unindent.
Documentation/admin-guide/perf/hisi-pmu.rst:69: WARNING: Unexpected indentation.
Documentation/admin-guide/perf/hisi-pmu.rst:70: WARNING: Block quote ends without a blank line; unexpected unindent.
Documentation/admin-guide/perf/hisi-pmu.rst:83: WARNING: Unexpected indentation.

Fixes: 9b86b1b41e ("docs: perf: Add new description on HiSilicon uncore PMU v2")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1617021121-31450-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-30 11:39:09 +01:00
Fenghua Yu
ebca17707e Documentation/admin-guide: Change doc for split_lock_detect parameter
Since #DB for bus lock detect changes the split_lock_detect parameter,
update the documentation for the changes.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20210322135325.682257-4-fenghua.yu@intel.com
2021-03-28 22:52:16 +02:00
Andy Shevchenko
81dd500b1c gpio: mockup: Adjust documentation to the code
First of all one of the parameter missed 'mockup' in its name,
Second, the semantics of the integer pairs depends on the sign
of the base (the first value in the pair).

Update documentation to reflect the real code behaviour.

Fixes: 2fd1abe99e ("Documentation: gpio: add documentation for gpio-mockup")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-03-26 14:56:19 +01:00
Dmitry Vyukov
6c996e1994 net: change netdev_unregister_timeout_secs min value to 1
netdev_unregister_timeout_secs=0 can lead to printing the
"waiting for dev to become free" message every jiffy.
This is too frequent and unnecessary.
Set the min value to 1 second.

Also fix the merge issue introduced by
"net: make unregister netdev warning timeout configurable":
it changed "refcnt != 1" to "refcnt".

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Fixes: 5aa3afe107 ("net: make unregister netdev warning timeout configurable")
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25 17:24:06 -07:00
Darrick J. Wong
3fef46fc43 xfs: rename the blockgc workqueue
Since we're about to start using the blockgc workqueue to dispose of
inactivated inodes, strip the "block" prefix from the name; now it's
merely the general garbage collection (gc) workqueue.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2021-03-25 16:47:50 -07:00
Thorsten Leemhuis
4b9d49d1ec docs: reporting-issues.rst: improved process esp. for stable regressions
Provide a shorter and easier process for users that deal with
regressions in stable and longterm kernels, as those should be reported
quickly.

To realize this in the least-confusing way and without having steps
multiple times in different places, split the 'search for existing
reports' into two. That has the additinal benefit that users will search
for them quickly when going through the step by step guide and thus will
save them trouble if the find reports.

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
CC: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/d934c15e536bceeff5c40a126930ddf803548e08.1616181657.git.linux@leemhuis.info
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-25 12:16:46 -06:00
Thorsten Leemhuis
9bc4430db5 docs: reporting-issues.rst: duplicate sections for reviewing purposes
This duplicates two section to make the diff in the next patch a bit
easier to gasp for humans.

Straight copy, no content changes.

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/ef85edc8466f035eb243dd6629429ad4fd0565d8.1616181657.git.linux@leemhuis.info
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-25 12:16:46 -06:00
Thorsten Leemhuis
4f08d7ab90 docs: reporting-issues.rst: reorder some steps
Reorder some steps where the order in which the readers perform them is
not crucial. This is a preparation for a later change that would make
the text much more complex otherwise.

Content just moved, not changed at all in the process.

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/8dfc58efde25a05ccf9bf85929826c4b1b9e09c5.1616181657.git.linux@leemhuis.info
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-25 12:16:46 -06:00
Thorsten Leemhuis
2dfa9eb0ff docs: reporting-issues.rst: tone down 'test vanilla mainline' a little
Tell users that reporting bugs with vendor kernels which are only
slightly patched can be okay in some situations, but point out there's a
risk in doing so.

Adjust some related sections to make them compatible and a bit clearer.
At the same time make them less daunting: we want users to report bugs,
even if they can't test vanilla mainline kernel.

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
CC: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/652ee20eb36228f5d7ca842299faa4cb472feedb.1616181657.git.linux@leemhuis.info
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-25 12:16:46 -06:00
Thorsten Leemhuis
613f969117 docs: reporting-issues.rst: fix small typos and style issues
Fix a typo and change "head over" to "scroll down", as suggested by Jon
when reviewing another patch that used the phrase the same way.

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/fb845d2f1db6138337203bbfac419c04b5f28053.1616181657.git.linux@leemhuis.info
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-25 12:16:46 -06:00
Shaokun Zhang
9b86b1b41e docs: perf: Add new description on HiSilicon uncore PMU v2
Some news functions are added on HiSilicon uncore PMUs. Document them
to provide guidance on how to use them.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-10-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 13:03:46 +00:00
Paul E. McKenney
ab6ad3dbdd Merge branches 'bitmaprange.2021.03.08a', 'fixes.2021.03.15a', 'kvfree_rcu.2021.03.08a', 'mmdumpobj.2021.03.08a', 'nocb.2021.03.15a', 'poll.2021.03.24a', 'rt.2021.03.08a', 'tasks.2021.03.08a', 'torture.2021.03.08a' and 'torturescript.2021.03.22a' into HEAD
bitmaprange.2021.03.08a:  Allow 3-N for bitmap ranges.
fixes.2021.03.15a:  Miscellaneous fixes.
kvfree_rcu.2021.03.08a:  kvfree_rcu() updates.
mmdumpobj.2021.03.08a:  mem_dump_obj() updates.
nocb.2021.03.15a:  RCU NOCB CPU updates, including limited deoffloading.
poll.2021.03.24a:  Polling grace-period interfaces for RCU.
rt.2021.03.08a:  Realtime-related RCU changes.
tasks.2021.03.08a:  Tasks-RCU updates.
torture.2021.03.08a:  Torture-test updates.
torturescript.2021.03.22a:  Torture-test scripting updates.
2021-03-24 17:20:18 -07:00
Dmitry Vyukov
5aa3afe107 net: make unregister netdev warning timeout configurable
netdev_wait_allrefs() issues a warning if refcount does not drop to 0
after 10 seconds. While 10 second wait generally should not happen
under normal workload in normal environment, it seems to fire falsely
very often during fuzzing and/or in qemu emulation (~10x slower).
At least it's not possible to understand if it's really a false
positive or not. Automated testing generally bumps all timeouts
to very high values to avoid flake failures.
Add net.core.netdev_unregister_timeout_secs sysctl to make
the timeout configurable for automated testing systems.
Lowering the timeout may also be useful for e.g. manual bisection.
The default value matches the current behavior.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=211877
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-23 17:22:50 -07:00
Nitin Joshi
3feb52a2b8 platform/x86: thinkpad_acpi: sysfs interface to get wwan antenna type
On some newer Thinkpads we need to set SAR value based on antenna type.
This patch provides a sysfs interface that userspace can use to get
antenna type and set corresponding SAR value, as is required for FCC
certification.

Reviewed-by: Mark Pearson <markpearson@lenovo.com>
Signed-off-by: Nitin Joshi <njoshi1@lenovo.com>
Link: https://lore.kernel.org/r/20210317024636.356175-1-njoshi1@lenovo.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-03-21 18:04:19 +01:00
Robin Murphy
3542dcb15c iommu/dma: Resurrect the "forcedac" option
In converting intel-iommu over to the common IOMMU DMA ops, it quietly
lost the functionality of its "forcedac" option. Since this is a handy
thing both for testing and for performance optimisation on certain
platforms, reimplement it under the common IOMMU parameter namespace.

For the sake of fixing the inadvertent breakage of the Intel-specific
parameter, remove the dmar_forcedac remnants and hook it up as an alias
while documenting the transition to the new common parameter.

Fixes: c588072bba ("iommu/vt-d: Convert intel iommu driver to the iommu ops")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/7eece8e0ea7bfbe2cd0e30789e0d46df573af9b0.1614961776.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-18 10:55:23 +01:00
Gao Xiang
a8f2a68e42 Documentation: sysrq: update description about sysrq crash
After commit 8341f2f222 ("sysrq: Use panic() to force a crash"),
a crash was not generated by dereferencing a NULL pointer anymore.

Let's update documentation as well to make it less misleading.

Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Zefan Li <lizefan.x@bytedance.com>
Link: https://lore.kernel.org/r/20210309191550.3955601-1-hsiangkao@redhat.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-15 14:23:21 -06:00
Jiele zhao
0860b72d53 security/loadpin: Update the changing interface in the source code.
Loadpin cmdline interface "enabled" has been renamed to "enforce"
for a long time, but the User Description Document was not updated.
(Meaning unchanged)

And kernel_read_file* were moved from linux/fs.h to its own
linux/kernel_read_file.h include file. So update that change here.

Signed-off-by: Jiele zhao <unclexiaole@gmail.com>
Link: https://lore.kernel.org/r/20210308020358.102836-1-unclexiaole@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-15 13:32:32 -06:00
Bhaskar Chowdhury
fdebeae0d7 docs: admin-guide: cgroup-v1: Fix typos in the file memory.rst
s/overcommited/overcommitted/
s/Overcommiting/Overcommitting/

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20210313061029.28024-1-unixbhaskar@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-15 13:03:25 -06:00
Jiri Slaby
67b1544a55 tty: isicom, remove this orphan
The Isicom driver was orphaned by commit d86b3001a1 (MAINTAINERS:
orphan isicom) 10 years ago. Noone stepped up to take care of them and
to fix all the issues the driver has.

So it's time to drop the driver with all its traces.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Link: https://lore.kernel.org/r/20210302062214.29627-6-jslaby@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10 09:34:06 +01:00
Jiri Slaby
f76edd8f7c tty: cyclades, remove this orphan
The Cyclades driver was orphaned by commit d459883e6c (MAINTAINERS:
remove two dead e-mail) 13 years ago. Noone stepped up to take care of
them and to fix all the issues the driver has.

On the top of that, there is no way to obtain the firmware for Z cards
from the vendor as cyclades.com ceased to exist.

So it's time to drop the driver with all its traces.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Link: https://lore.kernel.org/r/20210302062214.29627-5-jslaby@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10 09:34:06 +01:00
Kefeng Wang
f6e5aedf47 riscv: Add support for memtest
The riscv [rv32_]defconfig enabled CONFIG_MEMTEST,
but memtest feature is not supported in RISCV.

Add early_memtest() to support for memtest.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-03-09 18:09:46 -08:00
Barry Song
00b072c011 Documentation/admin-guide: kernel-parameters: correct the architectures for numa_balancing
X86 isn't the only architecture supporting NUMA_BALANCING. ARM64, PPC,
S390 and RISCV also support it:

arch$ git grep NUMA_BALANCING
arm64/Kconfig:  select ARCH_SUPPORTS_NUMA_BALANCING
arm64/configs/defconfig:CONFIG_NUMA_BALANCING=y
arm64/include/asm/pgtable.h:#ifdef CONFIG_NUMA_BALANCING
powerpc/configs/powernv_defconfig:CONFIG_NUMA_BALANCING=y
powerpc/configs/ppc64_defconfig:CONFIG_NUMA_BALANCING=y
powerpc/configs/pseries_defconfig:CONFIG_NUMA_BALANCING=y
powerpc/include/asm/book3s/64/pgtable.h:#ifdef CONFIG_NUMA_BALANCING
powerpc/include/asm/book3s/64/pgtable.h:#ifdef CONFIG_NUMA_BALANCING
powerpc/include/asm/book3s/64/pgtable.h:#endif /* CONFIG_NUMA_BALANCING */
powerpc/include/asm/book3s/64/pgtable.h:#ifdef CONFIG_NUMA_BALANCING
powerpc/include/asm/book3s/64/pgtable.h:#endif /* CONFIG_NUMA_BALANCING */
powerpc/include/asm/nohash/pgtable.h:#ifdef CONFIG_NUMA_BALANCING
powerpc/include/asm/nohash/pgtable.h:#endif /* CONFIG_NUMA_BALANCING */
powerpc/platforms/Kconfig.cputype:      select ARCH_SUPPORTS_NUMA_BALANCING
riscv/Kconfig:  select ARCH_SUPPORTS_NUMA_BALANCING
riscv/include/asm/pgtable.h:#ifdef CONFIG_NUMA_BALANCING
s390/Kconfig:   select ARCH_SUPPORTS_NUMA_BALANCING
s390/configs/debug_defconfig:CONFIG_NUMA_BALANCING=y
s390/configs/defconfig:CONFIG_NUMA_BALANCING=y
s390/include/asm/pgtable.h:#ifdef CONFIG_NUMA_BALANCING
x86/Kconfig:    select ARCH_SUPPORTS_NUMA_BALANCING     if X86_64
x86/include/asm/pgtable.h:#ifdef CONFIG_NUMA_BALANCING
x86/include/asm/pgtable.h:#endif /* CONFIG_NUMA_BALANCING */

On the other hand, setup_numabalancing() is implemented in mm/mempolicy.c
which doesn't depend on architectures.

Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210302084159.33688-1-song.bao.hua@hisilicon.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-08 17:19:05 -07:00
Martin Kepplinger
e85d92b3bc Documentation: dynamic-debug-howto: fix example
dynamic debug is "expecting pairs of match-spec <value>" so the example
for all files of which the paths include "usb" there is "file" missing.

Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Link: https://lore.kernel.org/r/20210303091646.773111-1-martin.kepplinger@puri.sm
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-08 17:09:26 -07:00
Uladzislau Rezki (Sony)
686fe1bf6b rcuscale: Add kfree_rcu() single-argument scale test
The single-argument variant of kfree_rcu() is currently not
tested by any member of the rcutoture test suite.  This
commit therefore adds rcuscale code to test it.  This
testing is controlled by two new boolean module parameters,
kfree_rcu_test_single and kfree_rcu_test_double.  If one
is set and the other not, only the corresponding variant
is tested, otherwise both are tested, with the variant to
be tested determined randomly on each invocation.

Both of these module parameters are initialized to false,
so setting either to true will test only that variant.

Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-08 14:18:07 -08:00
Paul Gortmaker
3e70df91f9 rcu: deprecate "all" option to rcu_nocbs=
With the core bitmap support now accepting "N" as a placeholder for
the end of the bitmap, "all" can be represented as "0-N" and has the
advantage of not being specific to RCU (or any other subsystem).

So deprecate the use of "all" by removing documentation references
to it.  The support itself needs to remain for now, since we don't
know how many people out there are using it currently, but since it
is in an __init area anyway, it isn't worth losing sleep over.

Cc: Yury Norov <yury.norov@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-08 14:16:58 -08:00
Paul Gortmaker
2c4885d24e lib: bitmap: support "N" as an alias for size of bitmap
While this is done for all bitmaps, the original use case in mind was
for CPU masks and cpulist_parse() as described below.

It seems that a common configuration is to use the 1st couple cores for
housekeeping tasks.  This tends to leave the remaining ones to form a
pool of similarly configured cores to take on the real workload of
interest to the user.

So on machine A - with 32 cores, it could be 0-3 for "system" and then
4-31 being used in boot args like nohz_full=, or rcu_nocbs= as part of
setting up the worker pool of CPUs.

But then newer machine B is added, and it has 48 cores, and so while
the 0-3 part remains unchanged, the pool setup cpu list becomes 4-47.

Multiple deployment becomes easier when we can just simply replace 31
and 47 with "N" and let the system substitute in the actual number at
boot; a number that it knows better than we do.

Cc: Yury Norov <yury.norov@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Suggested-by: Yury Norov <yury.norov@gmail.com> # move it from CPU code
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-08 14:16:58 -08:00
Rafael J. Wysocki
866d6cdf35 ACPI: PCI: Drop ACPI_PCI_COMPONENT that is not used any more
After dropping all of the code using ACPI_PCI_COMPONENT drop the
definition of it too and update the documentation to remove all
ACPI_PCI_COMPONENT references from it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
2021-03-08 16:51:09 +01:00
Thorsten Leemhuis
315c4e45f1 docs: reporting-issues.rst: explain how to decode stack traces
Replace placeholder text about decoding stack traces with a section that
properly describes what a typical user should do these days. To make
it works for them, add a paragraph in an earlier section to ensure
people build their kernels with everything that's needed to decode stack
traces later.

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20210215172857.382285-1-linux@leemhuis.info
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-06 17:36:52 -07:00
Yang Shi
1eff491fc4 doc: memcontrol: add description for oom_kill
When debugging an oom issue, I found the oom_kill counter of memcg is
confusing.  At the first glance without checking document, I thought it
just counts for memcg oom, but it turns out it counts both global and
memcg oom.

The cgroup v2 documents it, but the description is missed for cgroup v1.

Signed-off-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Chris Down <chris@chrisdown.name>
Link: https://lore.kernel.org/r/20210226021254.3980-1-shy828301@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-03-06 17:36:50 -07:00
Juergen Gross
a5aabace5f locking/csd_lock: Add more data to CSD lock debugging
In order to help identifying problems with IPI handling and remote
function execution add some more data to IPI debugging code.

There have been multiple reports of CPUs looping long times (many
seconds) in smp_call_function_many() waiting for another CPU executing
a function like tlb flushing. Most of these reports have been for
cases where the kernel was running as a guest on top of KVM or Xen
(there are rumours of that happening under VMWare, too, and even on
bare metal).

Finding the root cause hasn't been successful yet, even after more than
2 years of chasing this bug by different developers.

Commit:

  35feb60474 ("kernel/smp: Provide CSD lock timeout diagnostics")

tried to address this by adding some debug code and by issuing another
IPI when a hang was detected. This helped mitigating the problem
(the repeated IPI unlocks the hang), but the root cause is still unknown.

Current available data suggests that either an IPI wasn't sent when it
should have been, or that the IPI didn't result in the target CPU
executing the queued function (due to the IPI not reaching the CPU,
the IPI handler not being called, or the handler not seeing the queued
request).

Try to add more diagnostic data by introducing a global atomic counter
which is being incremented when doing critical operations (before and
after queueing a new request, when sending an IPI, and when dequeueing
a request). The counter value is stored in percpu variables which can
be printed out when a hang is detected.

The data of the last event (consisting of sequence counter, source
CPU, target CPU, and event type) is stored in a global variable. When
a new event is to be traced, the data of the last event is stored in
the event related percpu location and the global data is updated with
the new event's data. This allows to track two events in one data
location: one by the value of the event data (the event before the
current one), and one by the location itself (the current event).

A typical printout with a detected hang will look like this:

csd: Detected non-responsive CSD lock (#1) on CPU#1, waiting 5000000003 ns for CPU#06 scf_handler_1+0x0/0x50(0xffffa2a881bb1410).
	csd: CSD lock (#1) handling prior scf_handler_1+0x0/0x50(0xffffa2a8813823c0) request.
        csd: cnt(00008cc): ffff->0000 dequeue (src cpu 0 == empty)
        csd: cnt(00008cd): ffff->0006 idle
        csd: cnt(0003668): 0001->0006 queue
        csd: cnt(0003669): 0001->0006 ipi
        csd: cnt(0003e0f): 0007->000a queue
        csd: cnt(0003e10): 0001->ffff ping
        csd: cnt(0003e71): 0003->0000 ping
        csd: cnt(0003e72): ffff->0006 gotipi
        csd: cnt(0003e73): ffff->0006 handle
        csd: cnt(0003e74): ffff->0006 dequeue (src cpu 0 == empty)
        csd: cnt(0003e7f): 0004->0006 ping
        csd: cnt(0003e80): 0001->ffff pinged
        csd: cnt(0003eb2): 0005->0001 noipi
        csd: cnt(0003eb3): 0001->0006 queue
        csd: cnt(0003eb4): 0001->0006 noipi
        csd: cnt now: 0003f00

The idea is to print only relevant entries. Those are all events which
are associated with the hang (so sender side events for the source CPU
of the hanging request, and receiver side events for the target CPU),
and the related events just before those (for adding data needed to
identify a possible race). Printing all available data would be
possible, but this would add large amounts of data printed on larger
configurations.

Signed-off-by: Juergen Gross <jgross@suse.com>
[ Minor readability edits. Breaks col80 but is far more readable. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20210301101336.7797-4-jgross@suse.com
2021-03-06 12:49:48 +01:00
Juergen Gross
8d0968cc6b locking/csd_lock: Add boot parameter for controlling CSD lock debugging
Currently CSD lock debugging can be switched on and off via a kernel
config option only. Unfortunately there is at least one problem with
CSD lock handling pending for about 2 years now, which has been seen
in different environments (mostly when running virtualized under KVM
or Xen, at least once on bare metal). Multiple attempts to catch this
issue have finally led to introduction of CSD lock debug code, but
this code is not in use in most distros as it has some impact on
performance.

In order to be able to ship kernels with CONFIG_CSD_LOCK_WAIT_DEBUG
enabled even for production use, add a boot parameter for switching
the debug functionality on. This will reduce any performance impact
of the debug coding to a bare minimum when not being used.

Signed-off-by: Juergen Gross <jgross@suse.com>
[ Minor edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210301101336.7797-2-jgross@suse.com
2021-03-06 12:49:48 +01:00
Linus Torvalds
03dc748bf1 Merge tag 'xfs-5.12-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull more xfs updates from Darrick Wong:
 "The most notable fix here prevents premature reuse of freed metadata
  blocks, and adding the ability to detect accidental nested
  transactions, which are not allowed here.

   - Restore a disused sysctl control knob that was inadvertently
     dropped during the merge window to avoid fstests regressions.

   - Don't speculatively release freed blocks from the busy list until
     we're actually allocating them, which fixes a rare log recovery
     regression.

   - Don't nest transactions when scanning for free space.

   - Add an idiot^Wmaintainer light to detect nested transactions. ;)"

* tag 'xfs-5.12-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: use current->journal_info for detecting transaction recursion
  xfs: don't nest transactions when scanning for eofblocks
  xfs: don't reuse busy extents on extent trim
  xfs: restore speculative_cow_prealloc_lifetime sysctl
2021-02-28 11:45:25 -08:00