linux

Author	SHA1	Message	Date
Mike Galbraith	9f5d1c336a	futex: Handle transient "ownerless" rtmutex state correctly Gratian managed to trigger the BUG_ON(!newowner) in fixup_pi_state_owner(). This is one possible chain of events leading to this: Task Prio Operation T1 120 lock(F) T2 120 lock(F) -> blocks (top waiter) T3 50 (RT) lock(F) -> boosts T1 and blocks (new top waiter) XX timeout/ -> wakes T2 signal T1 50 unlock(F) -> wakes T3 (rtmutex->owner == NULL, waiter bit is set) T2 120 cleanup -> try_to_take_mutex() fails because T3 is the top waiter and the lower priority T2 cannot steal the lock. -> fixup_pi_state_owner() sees newowner == NULL -> BUG_ON() The comment states that this is invalid and rt_mutex_real_owner() must return a non NULL owner when the trylock failed, but in case of a queued and woken up waiter rt_mutex_real_owner() == NULL is a valid transient state. The higher priority waiter has simply not yet managed to take over the rtmutex. The BUG_ON() is therefore wrong and this is just another retry condition in fixup_pi_state_owner(). Drop the locks, so that T3 can make progress, and then try the fixup again. Gratian provided a great analysis, traces and a reproducer. The analysis is to the point, but it confused the hell out of that tglx dude who had to page in all the futex horrors again. Condensed version is above. [ tglx: Wrote comment and changelog ] Fixes: `c1e2f0eaf0` ("futex: Avoid violating the 10th rule of futex") Reported-by: Gratian Crisan <gratian.crisan@ni.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87a6w6x7bb.fsf@ni.com Link: https://lore.kernel.org/r/87sg9pkvf7.fsf@nanos.tec.linutronix.de	2020-11-07 22:07:04 +01:00
Ingo Molnar	666fab4a3e	Merge branch 'linus' into perf/kprobes Conflicts: include/asm-generic/atomic-instrumented.h kernel/kprobes.c Use the upstream atomic-instrumented.h checksum, and pick the kprobes version of kernel/kprobes.c, which effectively reverts this upstream workaround: `645f224e7b`: ("kprobes: Tell lockdep about kprobe nesting") Since the new code should be fine without nesting. Knock on wood ... Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-11-07 13:20:17 +01:00
Ingo Molnar	0a986ea81e	Merge branch 'linus' into perf/kprobes Merge recent kprobes updates into perf/kprobes that came from -mm. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-11-07 13:18:49 +01:00
kiyin(尹亮)	7bdb157cde	perf/core: Fix a memory leak in perf_event_parse_addr_filter() As shown through runtime testing, the "filename" allocation is not always freed in perf_event_parse_addr_filter(). There are three possible ways that this could happen: - It could be allocated twice on subsequent iterations through the loop, - or leaked on the success path, - or on the failure path. Clean up the code flow to make it obvious that 'filename' is always freed in the reallocation path and in the two return paths as well. We rely on the fact that kfree(NULL) is NOP and filename is initialized with NULL. This fixes the leak. No other side effects expected. [ Dan Carpenter: cleaned up the code flow & added a changelog. ] [ Ingo Molnar: updated the changelog some more. ] Fixes: `375637bc52` ("perf/core: Introduce address range filtering") Signed-off-by: "kiyin(尹亮)" <kiyin@tencent.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "Srivatsa S. Bhat" <srivatsa@csail.mit.edu> Cc: Anthony Liguori <aliguori@amazon.com> -- kernel/events/core.c \| 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)	2020-11-07 13:07:26 +01:00
Andy Shevchenko	b6e95788fd	irqdomain: Introduce irq_domain_create_legacy() API Introduce irq_domain_create_legacy() API which is functional equivalent to the existing irq_domain_add_legacy(), but takes a pointer to the struct fwnode_handle as a parameter. This is useful for non OF systems. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20201030165919.86234-5-andriy.shevchenko@linux.intel.com	2020-11-07 11:33:46 +01:00
Andy Shevchenko	c3a877fea9	irqdomain: Replace open coded of_node_to_fwnode() of_node_to_fwnode() should be used for conversion. Replace the open coded variant of it in of_phandle_args_to_fwspec(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20201030165919.86234-4-andriy.shevchenko@linux.intel.com	2020-11-07 11:33:45 +01:00
Jakub Kicinski	86bbf01977	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Alexei Starovoitov says: ==================== pull-request: bpf 2020-11-06 1) Pre-allocated per-cpu hashmap needs to zero-fill reused element, from David. 2) Tighten bpf_lsm function check, from KP. 3) Fix bpftool attaching to flow dissector, from Lorenz. 4) Use -fno-gcse for the whole kernel/bpf/core.c instead of function attribute, from Ard. * git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Update verification logic for LSM programs bpf: Zero-fill re-used per-cpu map element bpf: BPF_PRELOAD depends on BPF_SYSCALL tools/bpftool: Fix attaching flow dissector libbpf: Fix possible use after free in xsk_socket__delete libbpf: Fix null dereference in xsk_socket__delete libbpf, hashmap: Fix undefined behavior in hash_bits bpf: Don't rely on GCC __attribute__((optimize)) to disable GCSE tools, bpftool: Remove two unused variables. tools, bpftool: Avoid array index warnings. xsk: Fix possible memory leak at socket close bpf: Add struct bpf_redir_neigh forward declaration to BPF helper defs samples/bpf: Set rlimit for memlock to infinity in all samples bpf: Fix -Wshadow warnings selftest/bpf: Fix profiler test using CO-RE relocation for enums ==================== Link: https://lore.kernel.org/r/20201106221759.24143-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-06 17:49:34 -08:00
Jakub Kicinski	ae0d0bb29b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-06 17:33:38 -08:00
Marco Elver	1d094cefc3	kcsan: Fix encoding masks and regain address bit The watchpoint encoding masks for size and address were off-by-one bit each, with the size mask using 1 unnecessary bit and the address mask missing 1 bit. However, due to the way the size is shifted into the encoded watchpoint, we were effectively wasting and never using the extra bit. For example, on x86 with PAGE_SIZE==4K, we have 1 bit for the is-write bit, 14 bits for the size bits, and then 49 bits left for the address. Prior to this fix we would end up with this usage: [ write<1> \| size<14> \| wasted<1> \| address<48> ] Fix it by subtracting 1 bit from the GENMASK() end and start ranges of size and address respectively. The added static_assert()s verify that the masks are as expected. With the fixed version, we get the expected usage: [ write<1> \| size<14> \| address<49> ] Functionally no change is expected, since that extra address bit is insignificant for enabled architectures. Acked-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:19:26 -08:00
Paul E. McKenney	75dc2da5ec	rcu-tasks: Make the units of ->init_fract be jiffies Currently, the units of ->init_fract are milliseconds while those of ->gp_sleep are jiffies. For consistency with each other and with the argument of schedule_timeout_idle(), this commit changes the units of ->init_fract to jiffies. This change does affect the backoff algorithm, but only on systems where HZ is not 1000, and even there the change makes more sense, given that the current setup would "back off" to the same number of jiffies repeatedly. In contrast, with this change, the number of jiffies waited increases on each pass through the loop in the rcu_tasks_wait_gp() function. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:17:59 -08:00
Paul E. McKenney	a7eb937b67	rcutorture: Don't do need_resched() testing if ->sync is NULL If cur_ops->sync is NULL, rcu_torture_fwd_prog_nr() will nevertheless attempt to call through it. This commit therefore flags cases where neither need_resched() nor call_rcu() forward-progress testing can be performed due to NULL function pointers, and also causes rcu_torture_fwd_prog_nr() to take an early exit if cur_ops->sync() is NULL. Reported-by: Tom Rix <trix@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:57 -08:00
Hou Tao	0d7202876b	locktorture: Invoke percpu_free_rwsem() to do percpu-rwsem cleanup When executing the LOCK06 locktorture scenario featuring percpu-rwsem, the RCU callback rcu_sync_func() may still be pending after locktorture module is removed. This can in turn lead to the following Oops: BUG: unable to handle page fault for address: ffffffffc00eb920 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 6500a067 P4D 6500a067 PUD 6500c067 PMD 13a36c067 PTE 800000013691c163 Oops: 0000 [#1] PREEMPT SMP CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0-rc5+ #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:rcu_cblist_dequeue+0x12/0x30 Call Trace: <IRQ> rcu_core+0x1b1/0x860 __do_softirq+0xfe/0x326 asm_call_on_stack+0x12/0x20 </IRQ> do_softirq_own_stack+0x5f/0x80 irq_exit_rcu+0xaf/0xc0 sysvec_apic_timer_interrupt+0x2e/0xb0 asm_sysvec_apic_timer_interrupt+0x12/0x20 This commit avoids tis problem by adding an exit hook in lock_torture_ops and using it to call percpu_free_rwsem() for percpu rwsem torture during the module-cleanup function, thus ensuring that rcu_sync_func() completes before module exits. It is also necessary to call the exit hook if lock_torture_init() fails half-way, so this commit also adds an ->init_called field in lock_torture_cxt to indicate that exit hook, if present, must be called. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:56 -08:00
Paul E. McKenney	85558182d5	scftorture: Add full-test stutter capability In virtual environments on systems with hardware assist, inter-processor interrupts must do very different things based on whether the target vCPU is running or not. This commit therefore enables torture-test stuttering to better test these running/not-running transitions. Suggested-by: Chris Mason <clm@fb.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:56 -08:00
Paul E. McKenney	293b93d66f	rcutorture: Small code cleanups The rcu_torture_cleanup() function fails to NULL out the reader_tasks pointer after freeing it and its fakewriter_tasks loop has redundant braces. This commit therefore cleans these up. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:55 -08:00
Paul E. McKenney	ab1b7880de	rcutorture: Make stutter_wait() caller restore priority Currently, stutter_wait() will happily spin waiting for the stutter interval to end even if the caller is running at a real-time priority level. This could starve normal-priority tasks for no good reason. This commit therefore drops the calling task's priority to SCHED_OTHER MAX_NICE if stutter_wait() needs to wait. But when it waits, stutter_wait() returns true, which allows the caller to restore the priority if needed. Callers that were already running at SCHED_OTHER MAX_NICE obviously do not need any changes, but this commit also restores priority for higher-priority callers. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:54 -08:00
Paul E. McKenney	4994684ce1	rcutorture: Prevent hangs for invalid arguments If an rcutorture torture-test run is given a bad kvm.sh argument, the test will complain to the console, which is good. What is bad is that from the user's perspective, it will just hang for the time specified by the --duration argument. This commit therefore forces an immediate kernel shutdown if a rcu_torture_init()-time error occurs, thus avoiding the appearance of a hang. It also forces a console splat in this case to clearly indicate the presence of an error. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:53 -08:00
Paul E. McKenney	6b74fa0a77	locktorture: Prevent hangs for invalid arguments If an locktorture torture-test run is given a bad kvm.sh argument, the test will complain to the console, which is good. What is bad is that from the user's perspective, it will just hang for the time specified by the --duration argument. This commit therefore forces an immediate kernel shutdown if a lock_torture_init()-time error occurs, thus avoiding the appearance of a hang. It also forces a console splat in this case to clearly indicate the presence of an error. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:53 -08:00
Hou Tao	e5ace37d83	locktorture: Ignore nreaders_stress if no readlock support Exclusive locks do not have readlock support, which means that a locktorture run with the following module parameters will do nothing: torture_type=mutex_lock nwriters_stress=0 nreaders_stress=1 This commit therefore rejects this combination for exclusive locks by returning -EINVAL during module init. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:52 -08:00
Paul E. McKenney	bc80d353b3	refscale: Prevent hangs for invalid arguments If an refscale torture-test run is given a bad kvm.sh argument, the test will complain to the console, which is good. What is bad is that from the user's perspective, it will just hang for the time specified by the --duration argument. This commit therefore forces an immediate kernel shutdown if a ref_scale_init()-time error occurs, thus avoiding the appearance of a hang. It also forces a console splat in this case to clearly indicate the presence of an error. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:51 -08:00
Paul E. McKenney	2f2214d43c	rcuscale: Prevent hangs for invalid arguments If an rcuscale torture-test run is given a bad kvm.sh argument, the test will complain to the console, which is good. What is bad is that from the user's perspective, it will just hang for the time specified by the --duration argument. This commit therefore forces an immediate kernel shutdown if a rcu_scale_init()-time error occurs, thus avoiding the appearance of a hang. It also forces a console splat in this case to clearly indicate the presence of an error. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:51 -08:00
Paul E. McKenney	899f317e48	rcuscale: Add RCU Tasks Trace This commit adds the ability to test performance and scalability of RCU Tasks Trace updaters. Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:50 -08:00
Paul E. McKenney	1ac78b49d6	scftorture: Add an alternative IPI vector The scftorture tests currently use only smp_call_function() and friends, which means that these tests cannot locate bugs caused by interactions between different IPI vectors. This commit therefore adds the rescheduling IPI to the mix. Note that this commit permits resched_cpus() only when scftorture is built in. This is a workaround. Longer term, this will use real wakeups rather than resched_cpu(). Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:49 -08:00
Paul E. McKenney	fda5ba9ed2	torture: Make torture_stutter() use hrtimer The torture_stutter() function uses schedule_timeout_interruptible() to time the stutter duration, but this can miss race conditions due to its being time-synchronized with everything else that is based on the timer wheels. This commit therefore converts torture_stutter() to use the high-resolution timers via schedule_hrtimeout(), and also to fuzz the stutter interval. While in the area, this commit also limits the spin-loop portion of the stutter_wait() function's wait loop to two jiffies, down from about one second. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:49 -08:00
Paul E. McKenney	19012b786e	torture: Periodically pause in stutter_wait() Running locktorture scenario LOCK05 results in hangs: tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --torture lock --duration 3 --configs LOCK05 The lock_torture_writer() kthreads set themselves to MAX_NICE while running SCHED_OTHER. Other locktorture kthreads run at default niceness, also SCHED_OTHER. This results in these other locktorture kthreads indefinitely preempting the lock_torture_writer() kthreads. Note that the cond_resched() in the stutter_wait() function's loop is ineffective because this scenario is built with CONFIG_PREEMPT=y. It is not clear that such indefinite preemption is supposed to happen, but in the meantime this commit prevents kthreads running in stutter_wait() from being completely CPU-bound, thus allowing the other threads to get some CPU in a timely fashion. This commit also uses hrtimers to provide very short sleeps to avoid degrading the sudden-on testing that stutter is supposed to provide. Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:48 -08:00
Paul E. McKenney	3480d6774f	locktorture: Track time of last ->writeunlock() This commit adds a last_lock_release variable that tracks the time of the last ->writeunlock() call, which allows easier diagnosing of lock hangs when using a kernel debugger. Acked-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-11-06 17:13:29 -08:00
Paul E. McKenney	3fcd6a230f	x86/cpu: Avoid cpuinfo-induced IPIing of idle CPUs Currently, accessing /proc/cpuinfo sends IPIs to idle CPUs in order to learn their clock frequency. Which is a bit strange, given that waking them from idle likely significantly changes their clock frequency. This commit therefore avoids sending /proc/cpuinfo-induced IPIs to idle CPUs. [ paulmck: Also check for idle in arch_freq_prepare_all(). ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <x86@kernel.org>	2020-11-06 16:59:11 -08:00
KP Singh	6f64e47783	bpf: Update verification logic for LSM programs The current logic checks if the name of the BTF type passed in attach_btf_id starts with "bpf_lsm_", this is not sufficient as it also allows attachment to non-LSM hooks like the very function that performs this check, i.e. bpf_lsm_verify_prog. In order to ensure that this verification logic allows attachment to only LSM hooks, the LSM_HOOK definitions in lsm_hook_defs.h are used to generate a BTF_ID set. Upon verification, the attach_btf_id of the program being attached is checked for presence in this set. Fixes: `9e4e01dfd3` ("bpf: lsm: Implement attach, detach and execution") Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201105230651.2621917-1-kpsingh@chromium.org	2020-11-06 13:15:21 -08:00
KP Singh	3ca1032ab7	bpf: Implement get_current_task_btf and RET_PTR_TO_BTF_ID The currently available bpf_get_current_task returns an unsigned integer which can be used along with BPF_CORE_READ to read data from the task_struct but still cannot be used as an input argument to a helper that accepts an ARG_PTR_TO_BTF_ID of type task_struct. In order to implement this helper a new return type, RET_PTR_TO_BTF_ID, is added. This is similar to RET_PTR_TO_BTF_ID_OR_NULL but does not require checking the nullness of returned pointer. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201106103747.2780972-6-kpsingh@chromium.org	2020-11-06 08:08:37 -08:00
KP Singh	4cf1bc1f10	bpf: Implement task local storage Similar to bpf_local_storage for sockets and inodes add local storage for task_struct. The life-cycle of storage is managed with the life-cycle of the task_struct. i.e. the storage is destroyed along with the owning task with a callback to the bpf_task_storage_free from the task_free LSM hook. The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the security blob which are now stackable and can co-exist with other LSMs. The userspace map operations can be done by using a pid fd as a key passed to the lookup, update and delete operations. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201106103747.2780972-3-kpsingh@chromium.org	2020-11-06 08:08:37 -08:00
KP Singh	9e7a4d9831	bpf: Allow LSM programs to use bpf spin locks Usage of spin locks was not allowed for tracing programs due to insufficient preemption checks. The verifier does not currently prevent LSM programs from using spin locks, but the helpers are not exposed via bpf_lsm_func_proto. Based on the discussion in [1], non-sleepable LSM programs should be able to use bpf_spin_{lock, unlock}. Sleepable LSM programs can be preempted which means that allowng spin locks will need more work (disabling preemption and the verifier ensuring that no sleepable helpers are called when a spin lock is held). [1]: https://lore.kernel.org/bpf/20201103153132.2717326-1-kpsingh@chromium.org/T/#md601a053229287659071600d3483523f752cd2fb Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201106103747.2780972-2-kpsingh@chromium.org	2020-11-06 08:08:37 -08:00
Steven Rostedt (VMware)	773c167050	ftrace: Add recording of functions that caused recursion This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file "recursed_functions" all the functions that caused recursion while a callback to the function tracer was running. Link: https://lkml.kernel.org/r/20201106023548.102375687@goodmis.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Guo Ren <guoren@kernel.org> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Petr Mladek <pmladek@suse.com> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-csky@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: live-patching@vger.kernel.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-11-06 08:42:26 -05:00
Steven Rostedt (VMware)	a25d036d93	ftrace: Reverse what the RECURSION flag means in the ftrace_ops Now that all callbacks are recursion safe, reverse the meaning of the RECURSION flag and rename it from RECURSION_SAFE to simply RECURSION. Now only callbacks that request to have recursion protecting it will have the added trampoline to do so. Also remove the outdated comment about "PER_CPU" when determining to use the ftrace_ops_assist_func. Link: https://lkml.kernel.org/r/20201028115613.742454631@goodmis.org Link: https://lkml.kernel.org/r/20201106023547.904270143@goodmis.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Cc: Petr Mladek <pmladek@suse.com> Cc: linux-doc@vger.kernel.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-11-06 08:42:12 -05:00
Steven Rostedt (VMware)	5d029b035b	perf/ftrace: Check for rcu_is_watching() in callback function If a ftrace callback requires "rcu_is_watching", then it adds the FTRACE_OPS_FL_RCU flag and it will not be called if RCU is not "watching". But this means that it will use a trampoline when called, and this slows down the function tracing a tad. By checking rcu_is_watching() from within the callback, it no longer needs the RCU flag set in the ftrace_ops and it can be safely called directly. Link: https://lkml.kernel.org/r/20201028115613.591878956@goodmis.org Link: https://lkml.kernel.org/r/20201106023547.711035826@goodmis.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Petr Mladek <pmladek@suse.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-11-06 08:42:05 -05:00
Steven Rostedt (VMware)	5d15a624c3	perf/ftrace: Add recursion protection to the ftrace callback If a ftrace callback does not supply its own recursion protection and does not set the RECURSION_SAFE flag in its ftrace_ops, then ftrace will make a helper trampoline to do so before calling the callback instead of just calling the callback directly. The default for ftrace_ops is going to change. It will expect that handlers provide their own recursion protection, unless its ftrace_ops states otherwise. Link: https://lkml.kernel.org/r/20201028115613.444477858@goodmis.org Link: https://lkml.kernel.org/r/20201106023547.466892083@goodmis.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Petr Mladek <pmladek@suse.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-11-06 08:41:56 -05:00
Steven Rostedt (VMware)	4b750b573c	livepatch: Trigger WARNING if livepatch function fails due to recursion If for some reason a function is called that triggers the recursion detection of live patching, trigger a warning. By not executing the live patch code, it is possible that the old unpatched function will be called placing the system into an unknown state. Link: https://lore.kernel.org/r/20201029145709.GD16774@alley Link: https://lkml.kernel.org/r/20201106023547.312639435@goodmis.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: live-patching@vger.kernel.org Suggested-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-11-06 08:41:47 -05:00
Steven Rostedt (VMware)	13f3ea9a2c	livepatch/ftrace: Add recursion protection to the ftrace callback If a ftrace callback does not supply its own recursion protection and does not set the RECURSION_SAFE flag in its ftrace_ops, then ftrace will make a helper trampoline to do so before calling the callback instead of just calling the callback directly. The default for ftrace_ops is going to change. It will expect that handlers provide their own recursion protection, unless its ftrace_ops states otherwise. Link: https://lkml.kernel.org/r/20201028115613.291169246@goodmis.org Link: https://lkml.kernel.org/r/20201106023547.122802424@goodmis.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: live-patching@vger.kernel.org Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-11-06 08:36:23 -05:00
Steven Rostedt (VMware)	6e4eb9cb22	ftrace: Add ftrace_test_recursion_trylock() helper function To make it easier for ftrace callbacks to have recursion protection, provide a ftrace_test_recursion_trylock() and ftrace_test_recursion_unlock() helper that tests for recursion. Link: https://lkml.kernel.org/r/20201028115612.634927593@goodmis.org Link: https://lkml.kernel.org/r/20201106023546.378584067@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-11-06 08:33:23 -05:00
Steven Rostedt (VMware)	0264c8c9e1	ftrace: Move the recursion testing into global headers Currently, if a callback is registered to a ftrace function and its ftrace_ops does not have the RECURSION flag set, it is encapsulated in a helper function that does the recursion for it. Really, all the callbacks should have their own recursion protection for performance reasons. But they should not all implement their own. Move the recursion helpers to global headers, so that all callbacks can use them. Link: https://lkml.kernel.org/r/20201028115612.460535535@goodmis.org Link: https://lkml.kernel.org/r/20201106023546.166456258@goodmis.org Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-11-06 08:33:23 -05:00
Lukas Bulwahn	90574a9c02	printk: remove unneeded dead-store assignment make clang-analyzer on x86_64 defconfig caught my attention with: kernel/printk/printk_ringbuffer.c:885:3: warning: Value stored to 'desc' is never read [clang-analyzer-deadcode.DeadStores] desc = to_desc(desc_ring, head_id); ^ Commit `b6cf8b3f33` ("printk: add lockless ringbuffer") introduced desc_reserve() with this unneeded dead-store assignment. As discussed with John Ogness privately, this is probably just some minor left-over from previous iterations of the ringbuffer implementation. So, simply remove this unneeded dead assignment to make clang-analyzer happy. As compilers will detect this unneeded assignment and optimize this anyway, the resulting object code is identical before and after this change. No functional change. No change to object code. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20201106034005.18822-1-lukas.bulwahn@gmail.com	2020-11-06 11:05:44 +01:00
Florian Lehner	c6bde958a6	bpf: Lift hashtab key_size limit Currently key_size of hashtab is limited to MAX_BPF_STACK. As the key of hashtab can also be a value from a per cpu map it can be larger than MAX_BPF_STACK. The use-case for this patch originates to implement allow/disallow lists for files and file paths. The maximum length of file paths is defined by PATH_MAX with 4096 chars including nul. This limit exceeds MAX_BPF_STACK. Changelog: v5: - Fix cast overflow v4: - Utilize BPF skeleton in tests - Rebase v3: - Rebase v2: - Add a test for bpf side Signed-off-by: Florian Lehner <dev@der-flo.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201029201442.596690-1-dev@der-flo.net	2020-11-05 20:04:46 -08:00
David Verbeiren	d3bec0138b	bpf: Zero-fill re-used per-cpu map element Zero-fill element values for all other cpus than current, just as when not using prealloc. This is the only way the bpf program can ensure known initial values for all cpus ('onallcpus' cannot be set when coming from the bpf program). The scenario is: bpf program inserts some elements in a per-cpu map, then deletes some (or userspace does). When later adding new elements using bpf_map_update_elem(), the bpf program can only set the value of the new elements for the current cpu. When prealloc is enabled, previously deleted elements are re-used. Without the fix, values for other cpus remain whatever they were when the re-used entry was previously freed. A selftest is added to validate correct operation in above scenario as well as in case of LRU per-cpu map element re-use. Fixes: `6c90598174` ("bpf: pre-allocate hash map elements") Signed-off-by: David Verbeiren <david.verbeiren@tessares.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201104112332.15191-1-david.verbeiren@tessares.net	2020-11-05 19:55:57 -08:00
Randy Dunlap	7c0afcad75	bpf: BPF_PRELOAD depends on BPF_SYSCALL Fix build error when BPF_SYSCALL is not set/enabled but BPF_PRELOAD is by making BPF_PRELOAD depend on BPF_SYSCALL. ERROR: modpost: "bpf_preload_ops" [kernel/bpf/preload/bpf_preload.ko] undefined! Reported-by: kernel test robot <lkp@intel.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201105195109.26232-1-rdunlap@infradead.org	2020-11-05 18:49:29 -08:00
Linus Torvalds	3249fe4563	Merge tag 'trace-v5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Fix off-by-one error in retrieving the context buffer for trace_printk() - Fix off-by-one error in stack nesting limit - Fix recursion to not make all NMI code false positive as recursing - Stop losing events in function tracing when transitioning between irq context - Stop losing events in ring buffer when transitioning between irq context - Fix return code of error pointer in parse_synth_field() to prevent NULL pointer dereference. - Fix false positive of NMI recursion in kprobe event handling * tag 'trace-v5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: kprobes: Tell lockdep about kprobe nesting tracing: Make -ENOMEM the default error for parse_synth_field() ring-buffer: Fix recursion protection transitions between interrupt context tracing: Fix the checking of stackidx in __ftrace_trace_stack ftrace: Handle tracing when switching between context ftrace: Fix recursion check for NMI test tracing: Fix out of bounds write in get_trace_buf	2020-11-05 11:41:38 -08:00
Linus Torvalds	f786dfa374	Merge tag 'pm-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix the device links support in runtime PM, correct mistakes in the cpuidle documentation, fix the handling of policy limits changes in the schedutil cpufreq governor, fix assorted issues in the OPP (operating performance points) framework and make one janitorial change. Specifics: - Unify the handling of managed and stateless device links in the runtime PM framework and prevent runtime PM references to devices from being leaked after device link removal (Rafael Wysocki). - Fix two mistakes in the cpuidle documentation (Julia Lawall). - Prevent the schedutil cpufreq governor from missing policy limits updates in some cases (Viresh Kumar). - Prevent static OPPs from being dropped by mistake (Viresh Kumar). - Prevent helper function in the OPP framework from returning prematurely (Viresh Kumar). - Prevent opp_table_lock from being held too long during removal of OPP tables with no more active references (Viresh Kumar). - Drop redundant semicolon from the Intel RAPL power capping driver (Tom Rix)" * tag 'pm-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: runtime: Resume the device earlier in __device_release_driver() PM: runtime: Drop pm_runtime_clean_up_links() PM: runtime: Drop runtime PM references to supplier on link removal powercap/intel_rapl: remove unneeded semicolon Documentation: PM: cpuidle: correct path name Documentation: PM: cpuidle: correct typo cpufreq: schedutil: Don't skip freq update if need_freq_update is set opp: Reduce the size of critical section in _opp_table_kref_release() opp: Fix early exit from dev_pm_opp_register_set_opp_helper() opp: Don't always remove static OPPs in _of_add_opp_table_v1()	2020-11-05 11:04:29 -08:00
Thomas Gleixner	b6be002bcd	x86/entry: Move nmi entry/exit into common code Lockdep state handling on NMI enter and exit is nothing specific to X86. It's not any different on other architectures. Also the extra state type is not necessary, irqentry_state_t can carry the necessary information as well. Move it to common code and extend irqentry_state_t to carry lockdep state. [ Ira: Make exit_rcu and lockdep a union as they are mutually exclusive between the IRQ and NMI exceptions, and add kernel documentation for struct irqentry_state_t ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201102205320.1458656-7-ira.weiny@intel.com	2020-11-04 22:55:36 +01:00
Thomas Gleixner	01be83eea0	Merge branch 'core/urgent' into core/entry Pick up the entry fix before further modifications.	2020-11-04 18:14:52 +01:00
Thomas Gleixner	9d820f68b2	entry: Fix the incorrect ordering of lockdep and RCU check When an exception/interrupt hits kernel space and the kernel is not currently in the idle task then RCU must be watching. irqentry_enter() validates this via rcu_irq_enter_check_tick(), which in turn invokes lockdep when taking a lock. But at that point lockdep does not yet know about the fact that interrupts have been disabled by the CPU, which triggers a lockdep splat complaining about inconsistent state. Invoking trace_hardirqs_off() before rcu_irq_enter_check_tick() defeats the point of rcu_irq_enter_check_tick() because trace_hardirqs_off() uses RCU. So use the same sequence as for the idle case and tell lockdep about the irq state change first, invoke the RCU check and then do the lockdep and tracer update. Fixes: `a5497bab5f` ("entry: Provide generic interrupt entry/exit code") Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Mark Rutland <mark.rutland@arm.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87y2jhl19s.fsf@nanos.tec.linutronix.de	2020-11-04 18:06:14 +01:00
Steven Rostedt (VMware)	645f224e7b	kprobes: Tell lockdep about kprobe nesting Since the kprobe handlers have protection that prohibits other handlers from executing in other contexts (like if an NMI comes in while processing a kprobe, and executes the same kprobe, it will get fail with a "busy" return). Lockdep is unaware of this protection. Use lockdep's nesting api to differentiate between locks taken in INT3 context and other context to suppress the false warnings. Link: https://lore.kernel.org/r/20201102160234.fa0ae70915ad9e2b21c08b85@kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2020-11-04 09:46:06 -05:00
Sergey Shtylyov	076aa52e40	module: only handle errors with the switch statement in module_sig_check() Let's handle the successful call of mod_verify_sig() right after that call, making the switch statement only handle the real errors, and then move the comment from the first case before switch itself and the comment before default after it. Fix the comment style, add article/comma/dash, spell out "nomem" as "lack of memory" in these comments, while at it... Suggested-by: Joe Perches <joe@perches.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2020-11-04 15:31:29 +01:00
Sergey Shtylyov	10ccd1abb8	module: avoid gotos in module_sig_check() Let's move the common handling of the non-fatal errors after the switch statement -- this avoids gotos inside that switch... Suggested-by: Joe Perches <joe@perches.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2020-11-04 15:31:28 +01:00

... 8 9 10 11 12 ...

35244 Commits