linux/kernel/rcu
Frederic Weisbecker 28319d6dc5 rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes()
RCU Tasks and PID-namespace unshare can interact in do_exit() in a
complicated circular dependency:

1) TASK A calls unshare(CLONE_NEWPID), which creates a new PID namespace
   that every subsequent child of TASK A will belong to. TASK A itself
   does not belong to that new PID namespace.

2) TASK A calls fork() and creates TASK B. TASK A stays attached to its
   PID namespace (let's say PID_NS1) and TASK B is the first task belonging
   to the new PID namespace created by unshare() (let's call it PID_NS2).

3) Since TASK B is the first task attached to PID_NS2, it becomes the
   PID_NS2 child reaper.

4) TASK A calls fork() again and creates TASK C, which gets attached to
   PID_NS2. Note how TASK C has TASK A as a parent (belonging to PID_NS1)
   but has TASK B (belonging to PID_NS2) as a pid_namespace child_reaper.

5) TASK B exits and since it is the child reaper for PID_NS2, it has to
   kill all other tasks attached to PID_NS2, and wait for all of them to
   die before getting reaped itself (zap_pid_ns_processes()).

6) TASK A calls synchronize_rcu_tasks() which leads to
   synchronize_srcu(&tasks_rcu_exit_srcu).

7) TASK B is waiting for TASK C to get reaped. But TASK B is under a
   tasks_rcu_exit_srcu SRCU critical section (exit_notify() is between
   exit_tasks_rcu_start() and exit_tasks_rcu_finish()), blocking TASK A.

8) TASK C exits and, since TASK A is its parent, waits for TASK A to
   reap it. But TASK A can't, because it is waiting for TASK B, which is
   itself waiting for TASK C.
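
The circular wait in steps 5 to 8 can be sketched as a small wait-for
graph. This is only an illustrative userspace model, not kernel code: the
struct wait_graph type and the deadlock_scenario() and has_cycle_from()
helpers are hypothetical names introduced here, and each task is assumed
to block on at most one other task.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the three tasks from the scenario above. */
enum task { TASK_A, TASK_B, TASK_C, NR_TASKS };

/* waits_for[x] is the task x is blocked on, or -1 if x can make progress. */
struct wait_graph {
	int waits_for[NR_TASKS];
};

/* Build the wait-for edges described in steps 5 to 8. */
static struct wait_graph deadlock_scenario(void)
{
	struct wait_graph g;

	/* A: synchronize_srcu() waits for B to leave its SRCU section. */
	g.waits_for[TASK_A] = TASK_B;
	/* B: the child reaper waits for C to be reaped. */
	g.waits_for[TASK_B] = TASK_C;
	/* C: stays a zombie until its parent A reaps it. */
	g.waits_for[TASK_C] = TASK_A;
	return g;
}

/* Follow wait edges from start; return true if they lead back to start. */
static bool has_cycle_from(const struct wait_graph *g, enum task start)
{
	int cur;
	int steps;

	assert(start < NR_TASKS);
	cur = g->waits_for[start];
	for (steps = 0; steps < NR_TASKS && cur >= 0; steps++) {
		if (cur == (int)start)
			return true;
		cur = g->waits_for[cur];
	}
	return false;
}
```

In this model, has_cycle_from() returns true for any of the three tasks,
and clearing any single edge (setting it to -1) breaks the cycle.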

Pid_namespace semantics can hardly be changed at this point. But the
coverage of tasks_rcu_exit_srcu can be reduced instead.

The current task is assumed not to be concurrently reapable at this
stage of exit_notify() and therefore tasks_rcu_exit_srcu can be
temporarily relaxed without breaking its constraints, providing a way
out of the deadlock scenario.
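
The idea of temporarily relaxing the read-side section can be sketched
with a toy reader counter. This is a simplified userspace model under
stated assumptions, not the kernel implementation: struct model_srcu and
all model_*() functions are hypothetical, and real SRCU tracks readers
per index and only waits for pre-existing readers, which this sketch
ignores.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an SRCU domain: it only counts active readers. */
struct model_srcu {
	int readers;
};

/* Enter the read-side section (plays the role of exit_tasks_rcu_start()). */
static void model_exit_start(struct model_srcu *s)
{
	s->readers++;
}

/* Leave the read-side section (plays the role of exit_tasks_rcu_finish()). */
static void model_exit_finish(struct model_srcu *s)
{
	assert(s->readers > 0);
	s->readers--;
}

/* In this model a grace period can complete only when no reader is inside. */
static bool model_gp_can_complete(const struct model_srcu *s)
{
	return s->readers == 0;
}

/*
 * Model the child reaper's wait with the section relaxed around it.
 * Returns whether a grace period could complete during the wait.
 */
static bool model_reaper_wait_relaxed(struct model_srcu *s)
{
	bool could_complete;

	model_exit_finish(s);                      /* drop the section before blocking */
	could_complete = model_gp_can_complete(s); /* the waiter may progress here */
	model_exit_start(s);                       /* re-enter before resuming exit */
	return could_complete;
}
```

While the exiting task holds the section, model_gp_can_complete() is
false; during the relaxed wait it becomes true, which is the window that
lets the waiting updater make progress in this model.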

[ paulmck: Fix build failure by adding additional declaration. ]

Fixes: 3f95aa81d2 ("rcu: Make TASKS_RCU handle tasks that are almost done exiting")
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Eric W . Biederman <ebiederm@xmission.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-01-03 17:52:16 -08:00
Kconfig printk changes for 6.2 2022-12-12 09:01:36 -08:00
Kconfig.debug rcu: Make SRCU mandatory 2022-11-29 15:00:06 -08:00
Makefile rcuperf: Change rcuperf to rcuscale 2020-08-24 18:39:24 -07:00
rcu_segcblist.c rcu: Clarify fill-the-gap comment in rcu_segcblist_advance() 2022-04-11 17:28:48 -07:00
rcu_segcblist.h rcu: Mark writes to the rcu_segcblist structure's ->flags field 2022-02-14 10:36:58 -08:00
rcu.h printk changes for 6.2 2022-12-12 09:01:36 -08:00
rcuscale.c rcu/rcuscale: Use call_rcu_hurry() for async reader test 2022-11-29 14:04:33 -08:00
rcutorture.c Merge branches 'doc.2022.10.20a', 'fixes.2022.10.21a', 'lazy.2022.11.30a', 'srcunmisafe.2022.11.09a', 'torture.2022.10.18c' and 'torturescript.2022.10.20a' into HEAD 2022-11-30 13:20:05 -08:00
refscale.c refscale: Convert test_lock spinlock to raw_spinlock 2022-06-21 15:57:04 -07:00
srcutiny.c srcu: Make Tiny synchronize_srcu() check for readers 2022-12-01 15:49:12 -08:00
srcutree.c srcu: Debug NMI safety even on archs that don't require it 2022-10-21 10:44:11 -07:00
sync.c rcu/sync: Use call_rcu_hurry() instead of call_rcu 2022-11-29 14:04:33 -08:00
tasks.h rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes() 2023-01-03 17:52:16 -08:00
tiny.c rcu: Make call_rcu() lazy to save power 2022-11-29 14:02:23 -08:00
tree_exp.h rcu: Make call_rcu() lazy to save power 2022-11-29 14:02:23 -08:00
tree_nocb.h rcu: Shrinker for lazy rcu 2022-11-29 14:02:52 -08:00
tree_plugin.h rcu: Synchronize ->qsmaskinitnext in rcu_boost_kthread_setaffinity() 2022-10-18 14:59:57 -07:00
tree_stall.h sched/debug: Try trigger_single_cpu_backtrace(cpu) in dump_cpu_task() 2022-08-31 05:03:14 -07:00
tree.c Urgent RCU pull request for v6.2 2022-12-21 07:59:57 -08:00
tree.h rcu: Make call_rcu() lazy to save power 2022-11-29 14:02:23 -08:00
update.c rcu: Make SRCU mandatory 2022-11-29 15:00:06 -08:00