linux/kernel/rcu
Sergey Senozhatsky ccfc9dd691 rcu/tree: Handle VM stoppage in stall detection
The soft watchdog timer function checks if a virtual machine
was suspended and hence what looks like a lockup in fact
is a false positive.

This is what kvm_check_and_clear_guest_paused() does: it
tests guest PVCLOCK_GUEST_STOPPED (which is set by the host)
and if it's set then we need to touch all watchdogs and bail
out.

Watchdog timer function runs from IRQ, so PVCLOCK_GUEST_STOPPED
check works fine.

There is, however, one more watchdog that runs from IRQ, so
watchdog timer fn races with it, and that watchdog is not aware
of PVCLOCK_GUEST_STOPPED - RCU stall detector.

apic_timer_interrupt()
 smp_apic_timer_interrupt()
  hrtimer_interrupt()
   __hrtimer_run_queues()
    tick_sched_timer()
     tick_sched_handle()
      update_process_times()
       rcu_sched_clock_irq()

This triggers RCU stalls on our devices during VM resume.

If tick_sched_handle()->rcu_sched_clock_irq() runs on a VCPU
before watchdog_timer_fn()->kvm_check_and_clear_guest_paused()
then there is nothing on this VCPU that touches watchdogs and
RCU reads stale gp stall timestamp and new jiffies value, which
makes it think that RCU has stalled.

Make RCU stall watchdog aware of PVCLOCK_GUEST_STOPPED and
don't report RCU stalls when we resume the VM.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-08-06 13:41:48 -07:00
..
Kconfig Merge branches 'doc.2021.01.06a', 'fixes.2021.01.04b', 'kfree_rcu.2021.01.04a', 'mmdumpobj.2021.01.22a', 'nocb.2021.01.06a', 'rt.2021.01.04a', 'stall.2021.01.06a', 'torture.2021.01.12a' and 'tortureall.2021.01.06a' into HEAD 2021-01-22 15:26:44 -08:00
Kconfig.debug rcu: Restrict RCU_STRICT_GRACE_PERIOD to at most four CPUs 2021-05-10 16:22:54 -07:00
Makefile rcuperf: Change rcuperf to rcuscale 2020-08-24 18:39:24 -07:00
rcu_segcblist.c rcu/nocb: Remove stale comment above rcu_segcblist_offload() 2021-03-15 13:54:54 -07:00
rcu_segcblist.h rcu/nocb: Code-style nits in callback-offloading toggling 2021-01-06 16:47:55 -08:00
rcu.h Merge branches 'bitmaprange.2021.05.10c', 'doc.2021.05.10c', 'fixes.2021.05.13a', 'kvfree_rcu.2021.05.10c', 'mmdumpobj.2021.05.10c', 'nocb.2021.05.12a', 'srcu.2021.05.12a', 'tasks.2021.05.18a' and 'torture.2021.05.10c' into HEAD 2021-05-18 10:56:19 -07:00
rcuscale.c rcuscale: Add kfree_rcu() single-argument scale test 2021-03-08 14:18:07 -08:00
rcutorture.c Merge branch 'core-rcu-2021.07.04' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu 2021-07-04 12:58:33 -07:00
refscale.c refscale: Avoid false-positive warnings in ref_scale_reader() 2021-07-06 12:38:08 -07:00
srcutiny.c srcu: Provide polling interfaces for Tiny SRCU grace periods 2021-01-04 13:53:38 -08:00
srcutree.c Merge branches 'bitmaprange.2021.05.10c', 'doc.2021.05.10c', 'fixes.2021.05.13a', 'kvfree_rcu.2021.05.10c', 'mmdumpobj.2021.05.10c', 'nocb.2021.05.12a', 'srcu.2021.05.12a', 'tasks.2021.05.18a' and 'torture.2021.05.10c' into HEAD 2021-05-18 10:56:19 -07:00
sync.c rcu: Fix various typos in comments 2021-05-12 12:11:05 -07:00
tasks.h rcu-tasks: Don't delete holdouts within trc_wait_for_one_reader() 2021-07-06 15:52:49 -07:00
tiny.c srcu: Initialize SRCU after timers 2021-05-10 16:03:35 -07:00
tree_exp.h rcu/tree: Add a trace event for RCU CPU stall warnings 2021-03-15 13:53:24 -07:00
tree_plugin.h rcu: Mark accesses to ->rcu_read_lock_nesting 2021-08-06 13:41:48 -07:00
tree_stall.h rcu/tree: Handle VM stoppage in stall detection 2021-08-06 13:41:48 -07:00
tree.c rcu: Weaken ->dynticks accesses and updates 2021-08-06 13:41:48 -07:00
tree.h Merge branches 'bitmaprange.2021.05.10c', 'doc.2021.05.10c', 'fixes.2021.05.13a', 'kvfree_rcu.2021.05.10c', 'mmdumpobj.2021.05.10c', 'nocb.2021.05.12a', 'srcu.2021.05.12a', 'tasks.2021.05.18a' and 'torture.2021.05.10c' into HEAD 2021-05-18 10:56:19 -07:00
update.c Merge branches 'bitmaprange.2021.05.10c', 'doc.2021.05.10c', 'fixes.2021.05.13a', 'kvfree_rcu.2021.05.10c', 'mmdumpobj.2021.05.10c', 'nocb.2021.05.12a', 'srcu.2021.05.12a', 'tasks.2021.05.18a' and 'torture.2021.05.10c' into HEAD 2021-05-18 10:56:19 -07:00