linux/include
Dongli Zhang 5e25f5db6a xen/time: do not decrease steal time after live migration on xen
After guest live migration on xen, steal time in /proc/stat
(cpustat[CPUTIME_STEAL]) might decrease because steal returned by
xen_steal_lock() might be less than this_rq()->prev_steal_time which is
derived from previous return value of xen_steal_clock().

For instance, steal time of each vcpu is 335 before live migration.

cpu  198 0 368 200064 1962 0 0 1340 0 0
cpu0 38 0 81 50063 492 0 0 335 0 0
cpu1 65 0 97 49763 634 0 0 335 0 0
cpu2 38 0 81 50098 462 0 0 335 0 0
cpu3 56 0 107 50138 374 0 0 335 0 0

After live migration, steal time is reduced to 312.

cpu  200 0 370 200330 1971 0 0 1248 0 0
cpu0 38 0 82 50123 500 0 0 312 0 0
cpu1 65 0 97 49832 634 0 0 312 0 0
cpu2 39 0 82 50167 462 0 0 312 0 0
cpu3 56 0 107 50207 374 0 0 312 0 0

Since runstate times are cumulative and cleared during xen live migration
by xen hypervisor, the idea of this patch is to accumulate runstate times
to global percpu variables before live migration suspend. Once guest VM is
resumed, xen_get_runstate_snapshot_cpu() would always return the sum of new
runstate times and previously accumulated times stored in global percpu
variables.

Comment above HYPERVISOR_suspend() has been removed as it is inaccurate:
the call can return an error code (e.g., possibly -EPERM in the future).

Similar and more severe issue would impact prior linux 4.8-4.10 as
discussed by Michael Las at
https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest,
which would overflow steal time and lead to 100% st usage in top command
for linux 4.8-4.10. A backport of this patch would fix that issue.

[boris: added linux/slab.h to driver/xen/time.c, slightly reformatted
        commit message]

References: https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-02 16:49:41 -04:00
..
acpi ACPI / bus: Make ACPI_HANDLE() work for non-GPL code again 2017-09-19 22:42:31 +02:00
asm-generic percpu: make this_cpu_generic_read() atomic w.r.t. interrupts 2017-09-26 07:37:33 -07:00
clocksource
crypto
drm lib/interval_tree: fast overlap detection 2017-09-08 18:26:49 -07:00
dt-bindings ARC: reset: remove the misleading v1 suffix all over 2017-09-18 13:02:03 +02:00
keys
kvm
linux Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-10-29 08:11:49 -07:00
math-emu
media media updates for v4.14-rc1 2017-09-07 12:53:14 -07:00
memory
misc
net net_sched: introduce a workqueue for RCU callbacks of tc filter 2017-10-29 22:49:30 +09:00
pcmcia
ras
rdma IB: Correct MR length field to be 64-bit 2017-09-25 11:47:23 -04:00
scsi scsi: libiscsi: Remove iscsi_destroy_session 2017-10-02 22:23:21 -04:00
soc ARM: SoC driver updates for v4.14 2017-09-10 20:40:00 -07:00
sound ALSA: hda - Fix incorrect TLV callback check introduced during set_fs() removal 2017-10-18 12:27:00 +02:00
target
trace sched/debug: Add explicit TASK_PARKED printing 2017-09-29 11:02:57 +02:00
uapi Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-10-29 08:11:49 -07:00
video
xen xen/time: do not decrease steal time after live migration on xen 2017-11-02 16:49:41 -04:00