linux

Author	SHA1	Message	Date
Paul E. McKenney	4a8cc433b8	rcu-tasks: Drive synchronous grace periods from calling task This commit causes synchronous grace periods to be driven from the task invoking synchronize_rcu_*(), allowing these functions to be invoked from the mid-boot dead zone extending from when the scheduler was initialized to to point that the various RCU tasks grace-period kthreads are spawned. This change will allow the self-tests to run in a consistent manner. Reported-by: Matthew Wilcox <willy@infradead.org> Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-06-20 09:22:27 -07:00
Paul E. McKenney	68cb47204d	rcu-tasks: Move synchronize_rcu_tasks_generic() down This is strictly a code-motion commit that moves the synchronize_rcu_tasks_generic() down to where it can invoke rcu_tasks_one_gp() without the need for a forward declaration. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-06-20 09:22:26 -07:00
Paul E. McKenney	d96225fd09	rcu-tasks: Split rcu_tasks_one_gp() from rcu_tasks_kthread() This commit abstracts most of the rcu_tasks_kthread() function's loop body into a new rcu_tasks_one_gp() function. It also introduces a new ->tasks_gp_mutex to synchronize concurrent calls to this new rcu_tasks_one_gp() function. This commit is preparation for allowing RCU tasks grace periods to be driven by the calling task during the mid-boot dead zone. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-06-20 09:22:26 -07:00
Paul E. McKenney	4cf0585c4d	rcu-tasks: Check for abandoned callbacks This commit adds a debugging scan for callbacks that got lost during a callback-queueing transition. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-06-20 09:22:26 -07:00
Paul E. McKenney	ab2756ea6b	rcu-tasks: Handle sparse cpu_possible_mask in rcu_tasks_invoke_cbs() If the cpu_possible_mask is sparse (for example, if bits are set only for CPUs 0, 4, 8, ...), then rcu_tasks_invoke_cbs() will access per-CPU data for a CPU not in cpu_possible_mask. It makes these accesses while doing a workqueue-based binary search for non-empty callback lists. Although this search must pass through CPUs not represented in cpu_possible_mask, it has no need to check the callback list for such CPUs. This commit therefore changes the rcu_tasks_invoke_cbs() function's binary search so as to only check callback lists for CPUs present in cpu_possible_mask. Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-04-11 17:06:43 -07:00
Eric Dumazet	07d95c34e8	rcu-tasks: Handle sparse cpu_possible_mask If the rcupdate.rcu_task_enqueue_lim kernel boot parameter is set to something greater than 1 and less than nr_cpu_ids, the code attempts to use a subset of the CPU's RCU Tasks callback lists. This works, but only if the cpu_possible_mask is contiguous. If there are "holes" in this mask, the callback-enqueue code might attempt to access a non-existent per-CPU ->rtcpu variable for a non-existent CPU. For example, if only CPUs 0, 4, 8, 12, 16 and so on are in cpu_possible_mask, specifying rcupdate.rcu_task_enqueue_lim=4 would cause the code to attempt to use callback queues for non-existent CPUs 1, 2, and 3. Because such systems have existed in the past and might still exist, the code needs to gracefully handle this situation. This commit therefore checks to see whether the desired CPU is present in cpu_possible_mask, and, if not, searches for the next CPU. This means that the systems administrator of a system with a sparse cpu_possible_mask will need to account for this sparsity when specifying the value of the rcupdate.rcu_task_enqueue_lim kernel boot parameter. For example, setting this parameter to the value 4 will use only CPUs 0 and 4, which CPU 4 getting three times the callback load of CPU 0. This commit assumes that bit (nr_cpu_ids - 1) is always set in cpu_possible_mask. Link: https://lore.kernel.org/lkml/CANn89iKaNEwyNZ=L_PQnkH0LP_XjLYrr_dpyRKNNoDJaWKdrmg@mail.gmail.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-04-11 17:06:43 -07:00
Paul E. McKenney	10b3742f93	rcu-tasks: Make show_rcu_tasks_generic_gp_kthread() check all CPUs Currently, the show_rcu_tasks_generic_gp_kthread() function only looks at CPU 0's callback lists. Although this is not fatal, it can confuse debugging efforts in cases where any of the Tasks RCU flavors are in per-CPU queueing mode. This commit therefore causes this function to scan all CPUs' callback queues. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-04-11 17:06:43 -07:00
Paul E. McKenney	bddf7122f7	rcu-tasks: Restore use of timers for non-RT kernels The use of hrtimers for RCU-tasks grace-period delays works well in general, but can result in excessive grace-period delays for some corner-case workloads. This commit therefore reverts to the use of timers for non-RT kernels to mitigate those grace-period delays. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-04-11 17:06:43 -07:00
Sebastian Andrzej Siewior	777570d9ef	rcu-tasks: Use schedule_hrtimeout_range() to wait for grace periods The synchronous RCU-tasks grace-period-wait primitives invoke schedule_timeout_idle() to give readers a chance to exit their read-side critical sections. Unfortunately, this fails during early boot on PREEMPT_RT because PREEMPT_RT relies solely on ksoftirqd to run timer handlers. Because ksoftirqd cannot operate until its kthreads are spawned, there is a brief period of time following scheduler initialization where PREEMPT_RT cannot run the timer handlers that schedule_timeout_idle() relies on, resulting in a hang. To avoid this boot-time hang, this commit replaces schedule_timeout_idle() with schedule_hrtimeout(), so that the timer expires in hardirq context. This is ensures that the timer fires even on PREEMPT_RT throughout the irqs-enabled portions of boot as well as during runtime. The timer is set to expire between fract and fract + HZ / 2 jiffies in order to align with any other timers that might expire during that time, thus reducing the number of wakeups. Note that RCU-tasks grace periods are infrequent, so the use of hrtimer should be fine. In contrast, in common-case code, user of hrtimer could result in performance issues. Cc: Martin KaFai Lau <kafai@fb.com> Cc: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-04-11 17:06:42 -07:00
Sebastian Andrzej Siewior	88db792bbe	rcu-tasks: Use rcuwait for the rcu_tasks_kthread() The waitqueue used by rcu_tasks_kthread() has always only one waiter. With a guaranteed only one waiter, this can be replaced with rcuwait which is smaller and simpler. With rcuwait based wake counterpart, the irqwork function (call_rcu_tasks_iw_wakeup()) can be invoked hardirq context because it is only a wake up and no sleeping locks are involved (unlike the wait_queue_head). As a side effect, this is also one piece of the puzzle to pass the RCU selftest at early boot on PREEMPT_RT. Replace wait_queue_head with rcuwait and let the irqwork run in hardirq context on PREEMPT_RT. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-04-11 17:06:42 -07:00
Paul E. McKenney	f25390033f	rcu-tasks: Print pre-stall-warning informational messages RCU-tasks stall-warning messages are printed after the grace period is ten minutes old. Unfortunately, most of us will have rebooted the system in response to an apparently-hung command long before the ten minutes is up, and will thus see what looks to be a silent hang. This commit therefore adds pr_info() messages that are printed earlier. These should avoid being classified as errors, but should give impatient users a hint. These are controlled by new rcupdate.rcu_task_stall_info and rcupdate.rcu_task_stall_info_mult kernel-boot parameters. The former defines the initial delay in jiffies (defaulting to 10 seconds) and the latter defines the multiplier (defaulting to 3). Thus, by default, the first message will appear 10 seconds into the RCU-tasks grace period, the second 40 seconds in, and the third 160 seconds in. There would be a fourth at 640 seconds in, but the stall warning message appears 600 seconds in, and once a stall warning is printed for a given grace period, no further informational messages are printed. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-04-11 17:06:42 -07:00
Padmanabha Srinivasaiah	f75fd4b922	rcu-tasks: Fix race in schedule and flush work While booting secondary CPUs, cpus_read_[lock/unlock] is not keeping online cpumask stable. The transient online mask results in below calltrace. [ 0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083] [ 0.346652] Detected PIPT I-cache on CPU2 [ 0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083] [ 0.377255] Detected PIPT I-cache on CPU3 [ 0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083] [ 0.379040] ------------[ cut here ]------------ [ 0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138 [ 0.384850] Modules linked in: [ 0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13 [ 0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT) [ 0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 0.388308] pc : __flush_work+0x12c/0x138 [ 0.388970] lr : __flush_work+0x80/0x138 [ 0.389620] sp : ffffffc00aaf3c60 [ 0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48 [ 0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100 [ 0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28 [ 0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9 [ 0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550 [ 0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000 [ 0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000 [ 0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000 [ 0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c [ 0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000 [ 0.401886] Call trace: [ 0.402309] __flush_work+0x12c/0x138 [ 0.402941] schedule_on_each_cpu+0x228/0x278 [ 0.403693] rcu_tasks_rude_wait_gp+0x130/0x144 [ 0.404502] rcu_tasks_kthread+0x220/0x254 [ 0.405264] kthread+0x174/0x1ac [ 0.405837] ret_from_fork+0x10/0x20 [ 0.406456] irq event stamp: 102 [ 0.406966] hardirqs last enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4 [ 0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c [ 0.409410] softirqs last enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c [ 0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c [ 0.411890] ---[ end trace 0000000000000000 ]--- [ 0.413000] smp: Brought up 1 node, 4 CPUs [ 0.413762] SMP: Total of 4 processors activated. [ 0.414566] CPU features: detected: 32-bit EL0 Support [ 0.415414] CPU features: detected: 32-bit EL1 Support [ 0.416278] CPU features: detected: CRC32 instructions [ 0.447021] Callback from call_rcu_tasks_rude() invoked. [ 0.506693] Callback from call_rcu_tasks() invoked. This commit therefore fixes this issue by applying a single-CPU optimization to the RCU Tasks Rude grace-period process. The key point here is that the purpose of this RCU flavor is to force a schedule on each online CPU since some past event. But the rcu_tasks_rude_wait_gp() function runs in the context of the RCU Tasks Rude's grace-period kthread, so there must already have been a context switch on the current CPU since the call to either synchronize_rcu_tasks_rude() or call_rcu_tasks_rude(). So if there is only a single CPU online, RCU Tasks Rude's grace-period kthread does not need to anything at all. It turns out that the rcu_tasks_rude_wait_gp() function's call to schedule_on_each_cpu() causes problems during early boot. During that time, there is only one online CPU, namely the boot CPU. Therefore, applying this single-CPU optimization fixes early-boot instances of this problem. Link: https://lore.kernel.org/lkml/20220210184319.25009-1-treasure4paddy@gmail.com/T/ Suggested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-04-11 17:06:42 -07:00
Linus Torvalds	3fe2f7446f	Changes in this cycle were: - Cleanups for SCHED_DEADLINE - Tracing updates/fixes - CPU Accounting fixes - First wave of changes to optimize the overhead of the scheduler build, from the fast-headers tree - including placeholder _api.h headers for later header split-ups. - Preempt-dynamic using static_branch() for ARM64 - Isolation housekeeping mask rework; preperatory for further changes - NUMA-balancing: deal with CPU-less nodes - NUMA-balancing: tune systems that have multiple LLC cache domains per node (eg. AMD) - Updates to RSEQ UAPI in preparation for glibc usage - Lots of RSEQ/selftests, for same - Add Suren as PSI co-maintainer Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmI5rg8RHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1hGrw/+M3QOk6fH7G48wjlNnBvcOife6ls+Ni4k ixOAcF4JKoixO8HieU5vv0A7yf/83tAa6fpeXeMf1hkCGc0NSlmLtuIux+WOmoAL LzCyDEYfiP8KnVh0A1Tui/lK0+AkGo21O6ADhQE2gh8o2LpslOHQMzvtyekSzeeb mVxMYQN+QH0m518xdO2D8IQv9ctOYK0eGjmkqdNfntOlytypPZHeNel/tCzwklP/ dElJUjNiSKDlUgTBPtL3DfpoLOI/0mHF2p6NEXvNyULxSOqJTu8pv9Z2ADb2kKo1 0D56iXBDngMi9MHIJLgvzsA8gKzHLFSuPbpODDqkTZCa28vaMB9NYGhJ643NtEie IXTJEvF1rmNkcLcZlZxo0yjL0fjvPkczjw4Vj27gbrUQeEBfb4mfuI4BRmij63Ep qEkgQTJhduCqqrQP1rVyhwWZRk1JNcVug+F6N42qWW3fg1xhj0YSrLai2c9nPez6 3Zt98H8YGS1Z/JQomSw48iGXVqfTp/ETI7uU7jqHK8QcjzQ4lFK5H4GZpwuqGBZi NJJ1l97XMEas+rPHiwMEN7Z1DVhzJLCp8omEj12QU+tGLofxxwAuuOVat3CQWLRk f80Oya3TLEgd22hGIKDRmHa22vdWnNQyS0S15wJotawBzQf+n3auS9Q3/rh979+t ES/qvlGxTIs= =Z8uT -----END PGP SIGNATURE----- Merge tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Cleanups for SCHED_DEADLINE - Tracing updates/fixes - CPU Accounting fixes - First wave of changes to optimize the overhead of the scheduler build, from the fast-headers tree - including placeholder _api.h headers for later header split-ups. - Preempt-dynamic using static_branch() for ARM64 - Isolation housekeeping mask rework; preperatory for further changes - NUMA-balancing: deal with CPU-less nodes - NUMA-balancing: tune systems that have multiple LLC cache domains per node (eg. AMD) - Updates to RSEQ UAPI in preparation for glibc usage - Lots of RSEQ/selftests, for same - Add Suren as PSI co-maintainer * tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (81 commits) sched/headers: ARM needs asm/paravirt_api_clock.h too sched/numa: Fix boot crash on arm64 systems headers/prep: Fix header to build standalone: <linux/psi.h> sched/headers: Only include <linux/entry-common.h> when CONFIG_GENERIC_ENTRY=y cgroup: Fix suspicious rcu_dereference_check() usage warning sched/preempt: Tell about PREEMPT_DYNAMIC on kernel headers sched/topology: Remove redundant variable and fix incorrect type in build_sched_domains sched/deadline,rt: Remove unused parameter from pick_next_[rt\|dl]_entity() sched/deadline,rt: Remove unused functions for !CONFIG_SMP sched/deadline: Use __node_2_[pdl\|dle]() and rb_first_cached() consistently sched/deadline: Merge dl_task_can_attach() and dl_cpu_busy() sched/deadline: Move bandwidth mgmt and reclaim functions into sched class source file sched/deadline: Remove unused def_dl_bandwidth sched/tracing: Report TASK_RTLOCK_WAIT tasks as TASK_UNINTERRUPTIBLE sched/tracing: Don't re-read p->state when emitting sched_switch event sched/rt: Plug rt_mutex_setprio() vs push_rt_task() race sched/cpuacct: Remove redundant RCU read lock sched/cpuacct: Optimize away RCU read lock sched/cpuacct: Fix charge percpu cpuusage sched/headers: Reorganize, clean up and optimize kernel/sched/sched.h dependencies ...	2022-03-22 14:39:12 -07:00
Ingo Molnar	6255b48aeb	Linux 5.17-rc5 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmISrYgeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGg20IAKDZr7rfSHBopjQV Cocw744tom0XuxpvSZpp2GGOOXF+tkswcNNaRIrbGOl1mkyxA7eBZCTMpDeDS9aQ wB0D0Gxx8QBAJp4KgB1W7TB+hIGes/rs8Ve+6iO4ulLLdCVWX/q2boI0aZ7QX9O9 qNi8OsoZQtk6falRvciZFHwV5Av1p2Sy1AW57udQ7DvJ4H98AfKf1u8/z208WWW8 1ixC+qJxQcUcM9vI+7P9Tt7NbFSKv8SvAmqjFY7P+DxQAsVw6KXoqVXykDzeOv0t fUNOE/t0oFZafwtn8h7KBQnwS9lH03+3KkslVZs+iMFyUj/Bar+NVVyKoDhWXtVg /PuMhEg= =eU1o -----END PGP SIGNATURE----- Merge tag 'v5.17-rc5' into sched/core, to resolve conflicts New conflicts in sched/core due to the following upstream fixes: `44585f7bc0` ("psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n") `a06247c680` ("psi: Fix uaf issue when psi trigger is destroyed while being polled") Conflicts: include/linux/psi_types.h kernel/sched/psi.c Signed-off-by: Ingo Molnar <mingo@kernel.org>	2022-02-21 11:53:51 +01:00
Frederic Weisbecker	04d4e665a6	sched/isolation: Use single feature type while referring to housekeeping cpumask Refer to housekeeping APIs using single feature types instead of flags. This prevents from passing multiple isolation features at once to housekeeping interfaces, which soon won't be possible anymore as each isolation features will have their own cpumask. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Juri Lelli <juri.lelli@redhat.com> Reviewed-by: Phil Auld <pauld@redhat.com> Link: https://lore.kernel.org/r/20220207155910.527133-5-frederic@kernel.org	2022-02-16 15:57:55 +01:00
Paul E. McKenney	00a8b4b54c	rcu-tasks: Set ->percpu_enqueue_shift to zero upon contention Currently, call_rcu_tasks_generic() sets ->percpu_enqueue_shift to order_base_2(nr_cpu_ids) upon encountering sufficient contention. This does not shift to use of non-CPU-0 callback queues as intended, but rather continues using only CPU 0's queue. Although this does provide some decrease in contention due to spreading work over multiple locks, it is not the dramatic decrease that was intended. This commit therefore makes call_rcu_tasks_generic() set ->percpu_enqueue_shift to 0. Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-08 10:13:22 -08:00
Paul E. McKenney	2bcd18e041	rcu-tasks: Use order_base_2() instead of ilog2() The ilog2() function can be used to generate a shift count, but it will generate the same count for a power of two as for one greater than a power of two. This results in shift counts that are larger than necessary for systems with a power-of-two number of CPUs because the CPUs are numbered from zero, so that the maximum CPU number is one less than that power of two. This commit therefore substitutes order_base_2(), which appears to have been designed for exactly this use case. Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-02-08 10:13:12 -08:00
Paul E. McKenney	da123016ca	rcu-tasks: Fix computation of CPU-to-list shift counts The ->percpu_enqueue_shift field is used to map from the running CPU number to the index of the corresponding callback list. This mapping can change at runtime in response to varying callback load, resulting in varying levels of contention on the callback-list locks. Unfortunately, the initial value of this field is correct only if the system happens to have a power-of-two number of CPUs, otherwise the callbacks from the high-numbered CPUs can be placed into the callback list indexed by 1 (rather than 0), and those index-1 callbacks will be ignored. This can result in soft lockups and hangs. This commit therefore corrects this mapping, adding one to this shift count as needed for systems having odd numbers of CPUs. Fixes: `7a30871b6a` ("rcu-tasks: Introduce ->percpu_enqueue_shift for dynamic queue selection") Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2022-01-26 13:04:05 -08:00
Paul E. McKenney	fd796e4139	rcu-tasks: Use fewer callbacks queues if callback flood ends By default, when lock contention is encountered, the RCU Tasks flavors of RCU switch to using per-CPU queueing. However, if the callback flood ends, per-CPU queueing continues to be used, which introduces significant additional overhead, especially for callback invocation, which fans out a series of workqueue handlers. This commit therefore switches back to single-queue operation if at the beginning of a grace period there are very few callbacks. The definition of "very few" is set by the rcupdate.rcu_task_collapse_lim module parameter, which defaults to 10. This switch happens in two phases, with the first phase causing future callbacks to be enqueued on CPU 0's queue, but with all queues continuing to be checked for grace periods and callback invocation. The second phase checks to see if an RCU grace period has elapsed and if all remaining RCU-Tasks callbacks are queued on CPU 0. If so, only CPU 0 is checked for future grace periods and callback operation. Of course, the return of contention anywhere during this process will result in returning to per-CPU callback queueing. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	2cee0789b4	rcu-tasks: Use separate ->percpu_dequeue_lim for callback dequeueing Decreasing the number of callback queues is a bit tricky because it is necessary to handle callbacks that were queued before the number of queues decreased, but which were not ready to invoke until afterwards. This commit takes a first step in this direction by maintaining a separate ->percpu_dequeue_lim to control callback dequeueing, in addition to the existing ->percpu_enqueue_lim which now controls only enqueueing. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	ab97152f88	rcu-tasks: Use more callback queues if contention encountered The rcupdate.rcu_task_enqueue_lim module parameter allows system administrators to tune the number of callback queues used by the RCU Tasks flavors. However if callback storms are infrequent, it would be better to operate with a single queue on a given system unless and until that system actually needed more queues. Systems not needing more queues can then avoid the overhead of checking the extra queues and especially avoid the overhead of fanning workqueue handlers out to all CPUs to invoke callbacks. This commit therefore switches to using all the CPUs' callback queues if call_rcu_tasks_generic() encounters too much lock contention. The amount of lock contention to tolerate defaults to 100 contended lock acquisitions per jiffy, and can be adjusted using the new rcupdate.rcu_task_contend_lim module parameter. Such switching is undertaken only if the rcupdate.rcu_task_enqueue_lim module parameter is negative, which is its default value (-1). This allows savvy systems administrators to set the number of queues to some known good value and to not have to worry about the kernel doing any second guessing. [ paulmck: Apply feedback from Guillaume Tucker and kernelci. ] Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	3063b33a34	rcu-tasks: Avoid raw-spinlocked wakeups from call_rcu_tasks_generic() If the caller of of call_rcu_tasks(), call_rcu_tasks_rude(), or call_rcu_tasks_trace() holds a raw spinlock, and then if call_rcu_tasks_generic() determines that the grace-period kthread must be awakened, then the wakeup might acquire a normal spinlock while a raw spinlock is held. This results in lockdep splats when the kernel is built with CONFIG_PROVE_RAW_LOCK_NESTING=y. This commit therefore defers the wakeup using irq_work_queue(). It would be nice to directly invoke wakeup when a raw spinlock is not held, but there is currently no way to check for this in all kernels. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	7d13d30bb6	rcu-tasks: Count trylocks to estimate call_rcu_tasks() contention This commit converts the unconditional raw_spin_lock_rcu_node() lock acquisition in call_rcu_tasks_generic() to a trylock followed by an unconditional acquisition if the trylock fails. If the trylock fails, the failure is counted, but the count is reset to zero on each new jiffy. This statistic will be used to determine when to move from a single callback queue to per-CPU callback queues. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	8610b65680	rcu-tasks: Add rcupdate.rcu_task_enqueue_lim to set initial queueing This commit adds a rcupdate.rcu_task_enqueue_lim module parameter that sets the initial number of callback queues to use for the RCU Tasks family of RCU implementations. This parameter allows testing of various fanout values. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	ce9b1c667f	rcu-tasks: Make rcu_barrier_tasks() handle multiple callback queues Currently, rcu_barrier_tasks(), rcu_barrier_tasks_rude(), and rcu_barrier_tasks_trace() simply invoke the corresponding synchronize_rcu_tasks() function. This works because there is only one callback queue. However, there will soon be multiple callback queues. This commit therefore scans the queues currently in use, entraining a callback on each non-empty queue. Sequence numbers and reference counts are used to synchronize this process in a manner similar to the approach taken by rcu_barrier(). Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:11 -08:00
Paul E. McKenney	d363f833c6	rcu-tasks: Use workqueues for multiple rcu_tasks_invoke_cbs() invocations If there is a flood of callbacks, it is necessary to put multiple CPUs to work invoking those callbacks. This commit therefore uses a workqueue-flooding approach to parallelize RCU Tasks callback execution. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:52:00 -08:00
Paul E. McKenney	57881863ad	rcu-tasks: Abstract invocations of callbacks This commit adds a rcu_tasks_invoke_cbs() function that invokes all ready callbacks on all of the per-CPU lists that are currently in use. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:51:44 -08:00
Paul E. McKenney	4d1114c054	rcu-tasks: Abstract checking of callback lists This commit adds a rcu_tasks_need_gpcb() function that returns an indication of whether another grace period is required, and if no grace period is required, whether there are callbacks that need to be invoked. The function scans all per-CPU lists currently in use. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:51:27 -08:00
Paul E. McKenney	8dd593fddd	rcu-tasks: Add a ->percpu_enqueue_lim to the rcu_tasks structure This commit adds a ->percpu_enqueue_lim field to the rcu_tasks structure. This field contains two to the power of the ->percpu_enqueue_shift field, easing construction of iterators over the per-CPU queues that might contain RCU Tasks callbacks. Such iterators will be introduced in later commits. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:13:55 -08:00
Neeraj Upadhyay	65b629e704	rcu-tasks: Inspect stalled task's trc state in locked state On RCU tasks trace stall, inspect the RCU-tasks-trace specific states of stalled task in locked down state, using try_invoke_ on_locked_down_task(), to get reliable trc state of a non-running stalled task. This was tested using the following command: tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 8 --configs TRACE01 \ --bootargs "rcutorture.torture_type=tasks-tracing rcutorture.stall_cpu=10 \ rcutorture.stall_cpu_block=1 rcupdate.rcu_task_stall_timeout=100" --trust-make As expected, this produced the following console output for running and sleeping tasks. [ 21.520291] INFO: rcu_tasks_trace detected stalls on tasks: [ 21.521292] P85: ... nesting: 1N cpu: 2 [ 21.521966] task:rcu_torture_sta state:D stack:15080 pid: 85 ppid: 2 flags:0x00004000 [ 21.523384] Call Trace: [ 21.523808] __schedule+0x273/0x6e0 [ 21.524428] schedule+0x35/0xa0 [ 21.524971] schedule_timeout+0x1ed/0x270 [ 21.525690] ? del_timer_sync+0x30/0x30 [ 21.526371] ? rcu_torture_writer+0x720/0x720 [ 21.527106] rcu_torture_stall+0x24a/0x270 [ 21.527816] kthread+0x115/0x140 [ 21.528401] ? set_kthread_struct+0x40/0x40 [ 21.529136] ret_from_fork+0x22/0x30 [ 21.529766] 1 holdouts [ 21.632300] INFO: rcu_tasks_trace detected stalls on tasks: [ 21.632345] rcu_torture_stall end. [ 21.633293] P85: . [ 21.633294] task:rcu_torture_sta state:R running task stack:15080 pid: 85 ppid: 2 flags:0x00004000 [ 21.633299] Call Trace: [ 21.633301] ? vprintk_emit+0xab/0x180 [ 21.633306] ? vprintk_emit+0x11a/0x180 [ 21.633308] ? _printk+0x4d/0x69 [ 21.633311] ? __default_send_IPI_shortcut+0x1f/0x40 [ paulmck: Update to new v5.16 task_call_func() name. ] Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:13:55 -08:00
Paul E. McKenney	381a4f3b38	rcu-tasks: Use spin_lock_rcu_node() and friends This commit renames the rcu_tasks_percpu structure's ->cbs_pcpu_lock to ->lock and then uses spin_lock_rcu_node() and friends to acquire and release this lock, preparing for upcoming commits that will spread the grace-period process across multiple CPUs and kthreads. [ paulmck: Apply feedback from kernel test robot. ] Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-09 10:12:45 -08:00
Paul E. McKenney	9b073de1c7	rcu_tasks: Convert bespoke callback list to rcu_segcblist structure This commit moves from a bespoke head and tail pointer in the rcu_tasks_percpu structure to an rcu_segcblist structure, thus allowing associating the grace-period sequence number with groups of callbacks. This in turn will allow callbacks to be invoked independently on different CPUs. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:26:57 -08:00
Paul E. McKenney	b14fb4fbbc	rcu-tasks: Convert grace-period counter to grace-period sequence number This commit moves the rcu_tasks structure's ->n_gps grace-period-counter field to a ->task_gp_seq grce-period sequence number in order to enable use of the rcu_segcblist structure for the callback lists. This in turn permits CPUs to lag behind the RCU Tasks grace-period sequence number without suffering long-term slowdowns in callback invocation. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:26:57 -08:00
Paul E. McKenney	7a30871b6a	rcu-tasks: Introduce ->percpu_enqueue_shift for dynamic queue selection This commit introduces a ->percpu_enqueue_shift field to the rcu_tasks structure, and uses it to shift down the CPU number in order to select a rcu_tasks_percpu structure. This field is currently set to a sufficiently large shift count to always select the CPU-0 instance of the rcu_tasks_percpu structure, and later commits will adjust this. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:26:57 -08:00
Paul E. McKenney	cafafd6776	rcu-tasks: Create per-CPU callback lists Currently, RCU Tasks Trace (as well as the other two flavors of RCU Tasks) use a single global callback list. This works well and is simple, but expected changes in workload will cause this list to become a bottleneck. This commit therefore creates per-CPU callback lists for the various flavors of RCU Tasks, but continues queueing on a single list, namely that of CPU 0. Later commits will dynamically vary the number of lists in use to accommodate dynamic changes in workload. Reported-by: Martin Lau <kafai@fb.com> Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com> Tested-by: kernel test robot <beibei.si@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-12-07 16:26:09 -08:00
Paul E. McKenney	f5dbc594b5	rcu-tasks: Don't remove tasks with pending IPIs from holdout list Currently, the check_all_holdout_tasks_trace() function removes all tasks marked with ->trc_reader_checked from the holdout list, including those with IPIs pending. This means that the IPI handler might arrive at a task that has already been removed from the list, which is at best an accident waiting to happen. This commit therefore avoids removing tasks with IPIs pending from the holdout list. This in turn means that the "if" condition in the for_each_online_cpu() loop in rcu_tasks_trace_postgp() should always evaluate to false, so a WARN_ON_ONCE() is added to check that. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-11-30 17:29:06 -08:00
Linus Torvalds	6fedc28076	RCU pull request for v5.16 This pull request contains the following branches: fixes.2021.10.07a: Miscellaneous fixes. scftorture.2021.09.16a: smp_call_function torture-test updates, most notably better checking of module parameters. tasks.2021.09.15a: Tasks-trace RCU updates that fix a number of rare but important race-condition bugs. torture.2021.09.13b: Other torture-test updates, most notably better checking of module parameters. In addition, rcutorture may now be run on CONFIG_PREEMPT_RT kernels. torturescript.2021.09.16a: Torture-test scripting updates, most notably specifying the new CONFIG_KCSAN_STRICT kconfig option rather than maintaining an ever-changing list of individual KCSAN kconfig options. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmGAVMMTHHBhdWxtY2tA a2VybmVsLm9yZwAKCRCevxLzctn7jGJBD/9ld6USOpedBLAbTYVMQYvIKoSqqDIG 74ZFhKvZ5I6Y8OZAGxXjb5U06rh4V2brlTN7IJ7XLEA1t401ENffsGeQSCxEmpEf PqQN04dbmVvaWjD4jiLZCcl3oDp+w1gIKwmX6wh0Weogr3KZWu5aNvD5tl9qIz4a uPC1JqTBxf7WDrLhqNxG5N4MXs27+KvukCd9wftk3NTzRJ9tyLM/YNGOVArM8rW2 QpEh8n6veB5dEoXBxmRHzuxYHN1k0Fhkbm3irMjcI0T5wj8TDod89zbg9mdFXMIj AjZ9CGpIBa4frThdu654ZNuEQHDCsPWtMi925xNOWxh5lkPGjeWnwYpcRrwfI2pj op0xVlur+Nam5CT/AJNT9+KogpZthAWXvwqCs5GbYNSU30Rlw99bw1vyAsJUD+af Mv08/z4o7Kuhr4cw2vkd2UfF9zuIQsJ1jWCIjMxfj4ctBnIpedrEnEISp8Y61fWk w9vXgCRhZCSkxoURoNss+nAUsiePUafptsvqKLu6Z53ufPA5yL0rVS778xq8vurP Xyd34TVlQ94ydZDC5pkSNpri1HGV1U7pztFwey5GloE66iV+7TSQCfMhzLd4CM0K wW96wimHrDtIxD6LedCZOHLHkS9AJd7F9uSoNodKspTH0tJowQztrzPW1eZifDE3 iJP8xcJ+vL67Og== =nmaP -----END PGP SIGNATURE----- Merge tag 'rcu.2021.11.01a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU updates from Paul McKenney: - Miscellaneous fixes - Torture-test updates for smp_call_function(), most notably improved checking of module parameters. - Tasks-trace RCU updates that fix a number of rare but important race-condition bugs. - Other torture-test updates, most notably better checking of module parameters. In addition, rcutorture may once again be run on CONFIG_PREEMPT_RT kernels. - Torture-test scripting updates, most notably specifying the new CONFIG_KCSAN_STRICT kconfig option rather than maintaining an ever-changing list of individual KCSAN kconfig options. * tag 'rcu.2021.11.01a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (46 commits) rcu: Fix rcu_dynticks_curr_cpu_in_eqs() vs noinstr rcu: Always inline rcu_dynticks_task_{enter,exit}() torture: Make kvm-remote.sh print size of downloaded tarball torture: Allot 1G of memory for scftorture runs tools/rcu: Add an extract-stall script scftorture: Warn on individual scf_torture_init() error conditions scftorture: Count reschedule IPIs scftorture: Account for weight_resched when checking for all zeroes scftorture: Shut down if nonsensical arguments given scftorture: Allow zero weight to exclude an smp_call_function() category rcu: Avoid unneeded function call in rcu_read_unlock() rcu-tasks: Update comments to cond_resched_tasks_rcu_qs() rcu-tasks: Fix IPI failure handling in trc_wait_for_one_reader rcu-tasks: Fix read-side primitives comment for call_rcu_tasks_trace rcu-tasks: Clarify read side section info for rcu_tasks_rude GP primitives rcu-tasks: Correct comparisons for CPU numbers in show_stalled_task_trace rcu-tasks: Correct firstreport usage in check_all_holdout_tasks_trace rcu-tasks: Fix s/rcu_add_holdout/trc_add_holdout/ typo in comment rcu-tasks: Move RTGS_WAIT_CBS to beginning of rcu_tasks_kthread() loop rcu-tasks: Fix s/instruction/instructions/ typo in comment ...	2021-11-01 20:25:38 -07:00
Peter Zijlstra	9b3c4ab304	sched,rcu: Rework try_invoke_on_locked_down_task() Give try_invoke_on_locked_down_task() a saner name and have it return an int so that the caller might distinguish between different reasons of failure. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Tested-by: Vasily Gorbik <gor@linux.ibm.com> # on s390 Link: https://lkml.kernel.org/r/20210929152428.649944917@infradead.org	2021-10-07 13:51:15 +02:00
Paul E. McKenney	8af9e2c782	rcu-tasks: Update comments to cond_resched_tasks_rcu_qs() The cond_resched_rcu_qs() function no longer exists, despite being mentioned several times in kernel/rcu/tasks.h. This commit therefore updates it to the current cond_resched_tasks_rcu_qs(). Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-09-15 11:41:19 -07:00
Neeraj Upadhyay	46aa886c48	rcu-tasks: Fix IPI failure handling in trc_wait_for_one_reader The trc_wait_for_one_reader() function is called at multiple stages of trace rcu-tasks GP function, rcu_tasks_wait_gp(): - First, it is called as part of per task function - rcu_tasks_trace_pertask(), for all non-idle tasks. As part of per task processing, this function add the task in the holdout list and if the task is currently running on a CPU, it sends IPI to the task's CPU. The IPI handler takes action depending on whether task is in trace rcu-tasks read side critical section or not: - a. If the task is in trace rcu-tasks read side critical section (t->trc_reader_nesting != 0), the IPI handler sets the task's ->trc_reader_special.b.need_qs, so that this task notifies exit from its outermost read side critical section (by decrementing trc_n_readers_need_end) to the GP handling function. trc_wait_for_one_reader() also increments trc_n_readers_need_end, so that the trace rcu-tasks GP handler function waits for this task's read side exit notification. The IPI handler also sets t->trc_reader_checked to true, and no further IPIs are sent for this task, for this trace rcu-tasks grace period and this task can be removed from holdout list. - b. If the task is in the process of exiting its trace rcu-tasks read side critical section, (t->trc_reader_nesting < 0), defer this task's processing to future calls to trc_wait_for_one_reader(). - c. If task is not in rcu-task read side critical section, t->trc_reader_nesting == 0, ->trc_reader_checked is set for this task, so that this task is removed from holdout list. - Second, trc_wait_for_one_reader() is called as part of post scan, in function rcu_tasks_trace_postscan(), for all idle tasks. - Third, in function check_all_holdout_tasks_trace(), this function is called for each task in the holdout list, but only if there isn't a pending IPI for the task (->trc_ipi_to_cpu == -1). This function removed the task from holdout list, if IPI handler has completed the required work, to ensure that the current trace rcu-tasks grace period either waits for this task, or this task is not in a trace rcu-tasks read side critical section. Now, considering the scenario where smp_call_function_single() fails in first case, inside rcu_tasks_trace_pertask(). In this case, ->trc_ipi_to_cpu is set to the current CPU for that task. This will result in trc_wait_for_one_reader() getting skipped in third case, inside check_all_holdout_tasks_trace(), for this task. This further results in ->trc_reader_checked never getting set for this task, and the task not getting removed from holdout list. This can cause the current trace rcu-tasks grace period to stall. Fix the above problem, by resetting ->trc_ipi_to_cpu to -1, on smp_call_function_single() failure, so that future IPI calls can be send for this task. Note that all three of the trc_wait_for_one_reader() function's callers (rcu_tasks_trace_pertask(), rcu_tasks_trace_postscan(), check_all_holdout_tasks_trace()) hold cpu_read_lock(). This means that smp_call_function_single() cannot race with CPU hotplug, and thus should never fail. Therefore, also add a warning in order to report any such failure in case smp_call_function_single() grows some other reason for failure. Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-09-15 11:41:19 -07:00
Neeraj Upadhyay	ed42c38067	rcu-tasks: Fix read-side primitives comment for call_rcu_tasks_trace call_rcu_tasks_trace() does have read-side primitives - rcu_read_lock_trace() and rcu_read_unlock_trace(). Fix this information in the comments. Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-09-15 11:41:19 -07:00
Neeraj Upadhyay	a6517e9ce0	rcu-tasks: Clarify read side section info for rcu_tasks_rude GP primitives RCU tasks rude variant does not check whether the current running context on a CPU is usermode. Read side critical section ends on transition to usermode execution, by the virtue of usermode execution being schedulable. Clarify this in comments for call_rcu_tasks_rude() and synchronize_rcu_tasks_rude(). Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-09-15 11:41:19 -07:00
Neeraj Upadhyay	d39ec8f3c1	rcu-tasks: Correct comparisons for CPU numbers in show_stalled_task_trace Valid CPU numbers can be zero or greater, but the checks for ->trc_ipi_to_cpu and tick_nohz_full_cpu()'s argument are for strictly greater than. This commit therefore corrects the check for no_hz_full cpu in show_stalled_task_trace() so as to include cpu 0. Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-09-15 11:41:14 -07:00
Neeraj Upadhyay	89401176da	rcu-tasks: Correct firstreport usage in check_all_holdout_tasks_trace In check_all_holdout_tasks_trace(), firstreport is a pointer argument; so, check the dereferenced value, instead of checking the pointer. Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-09-15 11:37:51 -07:00
Neeraj Upadhyay	d0a8585856	rcu-tasks: Fix s/rcu_add_holdout/trc_add_holdout/ typo in comment Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-09-15 11:37:51 -07:00
Paul E. McKenney	0db7c32ad3	rcu-tasks: Move RTGS_WAIT_CBS to beginning of rcu_tasks_kthread() loop Early in debugging, it made some sense to differentiate the first iteration from subsequent iterations, but now this just causes confusion. This commit therefore moves the "set_tasks_gp_state(rtp, RTGS_WAIT_CBS)" statement to the beginning of the "for" loop in rcu_tasks_kthread(). Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-09-15 11:37:51 -07:00
Paul E. McKenney	c4f113ac45	rcu-tasks: Fix s/instruction/instructions/ typo in comment Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-09-15 11:37:50 -07:00
Paul E. McKenney	a5c071ccfa	rcu-tasks: Remove second argument of rcu_read_unlock_trace_special() The second argument of rcu_read_unlock_trace_special() is always zero. When called from exit_tasks_rcu_finish_trace(), it is the constant zero, and rcu_read_unlock_trace_special() doesn't get called from rcu_read_unlock_trace() unless the value of local variable "nesting" is zero because in that case the early return is taken instead. This commit therefore removes the "nesting" argument from the rcu_read_unlock_trace_special() function, substituting the constant zero within that function. This commit also adds a WARN_ON_ONCE() to rcu_read_lock_trace_held() in case non-zeroness some day appears. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-09-15 11:37:50 -07:00
Paul E. McKenney	18f08e758f	rcu-tasks: Add trc_inspect_reader() checks for exiting critical section Currently, trc_inspect_reader() treats a task exiting its RCU Tasks Trace read-side critical section the same as being within that critical section. However, this can fail because that task might have already checked its .need_qs field, which means that it might never decrement the all-important trc_n_readers_need_end counter. Of course, for that to happen, the task would need to never again execute an RCU Tasks Trace read-side critical section, but this really could happen if the system's last trampoline was removed. Note that exit from such a critical section cannot be treated as a quiescent state due to the possibility of nested critical sections. This means that if trc_inspect_reader() sees a negative nesting value, it must set up to try again later. This commit therefore ignores tasks that are exiting their RCU Tasks Trace read-side critical sections so that they will be rechecked later. [ paulmck: Apply feedback from Neeraj Upadhyay and Boqun Feng. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-09-15 11:37:50 -07:00
Paul E. McKenney	96017bf903	rcu-tasks: Simplify trc_read_check_handler() atomic operations Currently, trc_wait_for_one_reader() atomically increments the trc_n_readers_need_end counter before sending the IPI invoking trc_read_check_handler(). All failure paths out of trc_read_check_handler() and also from the smp_call_function_single() within trc_wait_for_one_reader() must carefully atomically decrement this counter. This is more complex than it needs to be. This commit therefore simplifies things and saves a few lines of code by dispensing with the atomic decrements in favor of having trc_read_check_handler() do the atomic increment only in the success case. In theory, this represents no change in functionality. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-09-15 11:37:50 -07:00

1 2 3

113 Commits