linux/kernel
Paul E. McKenney 962aff03c3 rcu: Clean up handling of tasks blocked across full-rcu_node offline
Commit 0aa04b055e ("rcu: Process offlining and onlining only at
grace-period start") deferred handling of CPU-hotplug events until the
start of the next grace period, but consider the following sequence
of events:

1.	A task is preempted within an RCU-preempt read-side critical
	section.

2.	The CPU that this task was running on goes offline, along with all
	other CPUs sharing the corresponding leaf rcu_node structure.

3.	The task resumes execution.

4.	One of those CPUs comes back online before a new grace period starts.

In step 2, the code in the next rcu_gp_init() invocation will (correctly)
defer removing the leaf rcu_node structure from the upper-level bitmasks,
and will (correctly) set that structure's ->wait_blkd_tasks field.  During
the ensuing interval, RCU will (correctly) track the tasks preempted on
that structure because they must block any subsequent grace period.

In step 3, the code in rcu_read_unlock_special() will (correctly) remove
the task from the leaf rcu_node structure.  From this point forward, RCU
need not pay attention to this structure, at least not until one of the
corresponding CPUs comes back online.

In step 4, the code in the next rcu_gp_init() invocation will
(incorrectly) invoke rcu_init_new_rnp().  This is incorrect because
the corresponding rcu_cleanup_dead_rnp() was never invoked.  This is
nevertheless harmless because the upper-level bits are still set.
So, no harm, no foul, right?

At least, all is well until a little further into rcu_gp_init()
invocation, which will notice that there are no longer any tasks blocked
on the leaf rcu_node structure, conclude that there is no longer anything
left over from step 2's offline operation, and will therefore invoke
rcu_cleanup_dead_rnp().  But this invocation of rcu_cleanup_dead_rnp()
is for the beginning of the earlier offline interval, and the previous
invocation of rcu_init_new_rnp() is for the end of that same interval.
That is right, they are invoked out of order.

That cannot be good, can it?

It turns out that this is not a (correctness!) problem because
rcu_cleanup_dead_rnp() checks to see if any of the corresponding CPUs
are online, and refuses to do anything if so.  In other words, in the
case where rcu_init_new_rnp() and rcu_cleanup_dead_rnp() execute out of
order, they both have no effect.

But this is at best an accident waiting to happen.

This commit therefore adds logic to rcu_gp_init() so that
rcu_init_new_rnp() and rcu_cleanup_dead_rnp() are always invoked in
order, and so that neither are invoked at all in cases where RCU had to
pay attention to the leaf rcu_node structure during the entire time that
all corresponding CPUs were offline.

And, while in the area, this commit reduces confusion by using formal
parameters rather than local variables that just happen to have the same
value at that particular point in the code.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:58 -07:00
..
bpf Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-06-16 07:39:34 +09:00
cgroup docs: Fix some broken references 2018-06-15 18:10:01 -03:00
configs kconfig: tinyconfig: remove stale stack protector fixups 2018-06-15 07:15:28 +09:00
debug treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
events treewide: kzalloc_node() -> kcalloc_node() 2018-06-12 16:19:22 -07:00
gcov gcov: remove CONFIG_GCOV_FORMAT_AUTODETECT 2018-06-08 18:56:02 +09:00
irq Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-06-10 09:44:53 -07:00
livepatch livepatch: Allow to call a custom callback when freeing shadow variables 2018-04-17 13:42:48 +02:00
locking torture: Keep old-school dmesg format 2018-06-25 11:30:10 -07:00
power fix a series of Documentation/ broken file name references 2018-06-15 18:10:01 -03:00
printk Printk changes for 4.18 2018-06-06 16:04:55 -07:00
rcu rcu: Clean up handling of tasks blocked across full-rcu_node offline 2018-07-12 15:38:58 -07:00
sched sched/core / kcov: avoid kcov_area during task switch 2018-06-15 07:55:24 +09:00
time Power management updates for 4.18-rc1 2018-06-05 09:38:39 -07:00
trace docs: Fix some broken references 2018-06-15 18:10:01 -03:00
.gitignore
acct.c
async.c kernel/async.c: revert "async: simplify lowest_in_progress()" 2018-02-06 18:32:44 -08:00
audit_fsnotify.c fsnotify: add fsnotify_add_inode_mark() wrappers 2018-05-18 14:58:22 +02:00
audit_tree.c fsnotify: add fsnotify_add_inode_mark() wrappers 2018-05-18 14:58:22 +02:00
audit_watch.c \n 2018-06-17 05:06:18 +09:00
audit.c audit: use inline function to get audit context 2018-05-14 17:24:18 -04:00
audit.h audit: track the owner of the command mutex ourselves 2018-02-23 11:22:22 -05:00
auditfilter.c audit: use existing session info function 2018-05-18 15:47:54 -04:00
auditsc.c audit: Fix wrong task in comparison of session ID 2018-05-21 14:27:43 -04:00
backtracetest.c
bounds.c
capability.c
compat.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-06-04 20:27:54 -07:00
configs.c
context_tracking.c
cpu_pm.c
cpu.c cpu/hotplug: Fix unused function warning 2018-03-15 20:34:40 +01:00
crash_core.c mm: split page_type out from _mapcount 2018-06-07 17:34:37 -07:00
crash_dump.c
cred.c
delayacct.c delayacct: Use raw_spinlocks 2018-04-27 14:34:51 +02:00
dma.c proc: introduce proc_create_single{,_data} 2018-05-16 07:23:35 +02:00
elfcore.c
exec_domain.c proc: introduce proc_create_single{,_data} 2018-05-16 07:23:35 +02:00
exit.c kernel: use kernel_wait4() instead of sys_wait4() 2018-04-02 20:14:51 +02:00
extable.c extable: Make init_kernel_text() global 2018-02-21 16:54:06 +01:00
fail_function.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
fork.c mm: check for SIGKILL inside dup_mmap() loop 2018-06-15 07:55:24 +09:00
freezer.c
futex_compat.c
futex.c pids: introduce find_get_task_by_vpid() helper 2018-02-06 18:32:46 -08:00
groups.c
hung_task.c kernel/hung_task.c: show all hung tasks before panic 2018-06-07 17:34:39 -07:00
iomem.c memremap: split devm_memremap_pages() and memremap() infrastructure 2018-05-15 23:08:33 -07:00
irq_work.c
jump_label.c jump_label: Disable jump labels in __exit code 2018-03-20 08:57:17 +01:00
kallsyms.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk 2018-02-01 13:36:15 -08:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c sched/core / kcov: avoid kcov_area during task switch 2018-06-15 07:55:24 +09:00
kexec_core.c kexec: yield to scheduler when loading kimage segments 2018-06-15 07:55:24 +09:00
kexec_file.c treewide: Use array_size() in vzalloc() 2018-06-12 16:19:22 -07:00
kexec_internal.h
kexec.c kexec: call do_kexec_load() in compat syscall directly 2018-04-02 20:15:01 +02:00
kmod.c
kprobes.c kprobes: Fix random address output of blacklist file 2018-04-25 10:27:56 -04:00
ksysfs.c
kthread.c kthread: Allow kthread_park() on a parked kthread 2018-05-25 08:03:51 +02:00
latencytop.c
Makefile Merge branch 'core-rseq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-06-10 10:17:09 -07:00
memremap.c mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS 2018-05-22 06:59:39 -07:00
module_signing.c
module-internal.h
module.c Modules updates for v4.18 2018-06-16 07:36:39 +09:00
notifier.c
nsproxy.c
padata.c
panic.c Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables 2018-06-14 12:21:18 +09:00
params.c kernel/params.c: downgrade warning for unsafe parameters 2018-04-11 10:28:37 -07:00
pid_namespace.c Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2018-04-03 19:15:32 -07:00
pid.c xarray: add the xa_lock to the radix_tree_root 2018-04-11 10:28:39 -07:00
profile.c
ptrace.c pids: introduce find_get_task_by_vpid() helper 2018-02-06 18:32:46 -08:00
range.c
reboot.c
relay.c kernel/relay.c: change return type to vm_fault_t 2018-06-15 07:55:24 +09:00
resource.c libnvdimm for 4.18 2018-06-08 17:21:52 -07:00
rseq.c rseq: Introduce restartable sequences system call 2018-06-06 11:58:31 +02:00
seccomp.c audit/stable-4.18 PR 20180605 2018-06-06 16:34:00 -07:00
signal.c signal: Remove no longer required irqsave/restore 2018-06-10 06:14:01 +02:00
smp.c
smpboot.c
smpboot.h
softirq.c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-06-04 19:59:22 -07:00
stacktrace.c
stop_machine.c Linux 4.17-rc5 2018-05-15 08:10:50 +02:00
sys_ni.c Merge branch 'core-rseq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-06-10 10:17:09 -07:00
sys.c mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct 2018-06-07 17:34:34 -07:00
sysctl_binary.c staging: irda: remove remaining remants of irda code removal 2018-04-16 11:26:49 +02:00
sysctl.c treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
task_work.c
taskstats.c pids: introduce find_get_task_by_vpid() helper 2018-02-06 18:32:46 -08:00
test_kprobes.c
torture.c torture: Keep old-school dmesg format 2018-06-25 11:30:10 -07:00
tracepoint.c tracepoints: Fix the descriptions of tracepoint_probe_register{_prio} 2018-05-28 12:49:51 -04:00
tsacct.c
ucount.c headers: untangle kmemleak.h from mm.h 2018-04-05 21:36:27 -07:00
uid16.c fs: add do_fchownat(), ksys_fchown() helpers and ksys_{,l}chown() wrappers 2018-04-02 20:15:59 +02:00
uid16.h kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c 2018-04-02 20:15:30 +02:00
umh.c umh: fix race condition 2018-06-07 16:56:28 -04:00
up.c
user_namespace.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
user-return-notifier.c
user.c efivarfs: Limit the rate for non-root to read files 2018-02-22 10:21:02 -08:00
utsname_sysctl.c
utsname.c uts: create "struct uts_namespace" from kmem_cache 2018-04-11 10:28:35 -07:00
watchdog_hld.c
watchdog.c
workqueue_internal.h workqueue: Set worker->desc to workqueue name by default 2018-05-18 08:47:13 -07:00
workqueue.c treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00