linux

Thomas Gleixner 01f0a02701 watchdog/core: Remove the park_in_progress obfuscation		2017-09-14 11:41:05 +02:00

Commit:

  `b94f51183b` ("kernel/watchdog: prevent false hardlockup on overloaded system")

tries to fix the following issue:

    proc_write()
       set_sample_period()    <--- New sample period becoms visible
                              <----- Broken starts
       proc_watchdog_update()
         watchdog_enable_all_cpus()         watchdog_hrtimer_fn()
         update_watchdog_all_cpus()            restart_timer(sample_period)
            watchdog_park_threads()

                 thread->park()
                   disable_nmi()
                              <----- Broken ends

The reason why this is broken is that the update of the watchdog threshold becomes immediately effective and visible for the hrtimer function which uses that value to rearm the timer. But the NMI/perf side still uses the old value up to the point where it is disabled. If the rate has been lowered then the NMI can run fast enough to 'detect' a hard lockup because the timer has not fired due to the longer period.

The patch 'fixed' this by adding a variable:

    proc_write()
       set_sample_period()
                              <----- Broken starts
       proc_watchdog_update()
         watchdog_enable_all_cpus()         watchdog_hrtimer_fn()
         update_watchdog_all_cpus()            restart_timer(sample_period)
            watchdog_park_threads()
              park_in_progress = 1
                              <----- Broken ends
                                            nmi_watchdog()
                                              if (park_in_progress)
                                                 return;

The only effect of this variable was to make the window where the breakage can hit small enough that it was not longer observable in testing. From a correctness point of view it is a pointless bandaid which merily papers over the root cause: the unsychronized update of the variable.

Looking deeper into the related code pathes unearthed similar problems in the watchdog_start()/stop() functions.

    watchdog_start()
         perf_nmi_event_start()
         hrtimer_start()

    watchdog_stop()
         hrtimer_cancel()
         perf_nmi_event_stop()

In both cases the call order is wrong because if the tasks gets preempted or the VM gets scheduled out long enough after the first call, then there is a chance that the next NMI will see a stale hrtimer interrupt count and trigger a false positive hard lockup splat.

Get rid of park_in_progress so the code can be gradually deobfuscated and pruned from several layers of duct tape papering over the root cause, which has been either ignored or not understood at all.

Once this is removed the underlying problem will be fixed by rewriting the proc interface to do a proper synchronized update.

Address the start/stop() ordering problem as well by reverting the call order, so this part is at least correct now.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Don Zickus <dzickus@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1709052038270.2393@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>
..
bpf	bpf: devmap, use cond_resched instead of cpu_relax	2017-09-08 21:11:00 -07:00
cgroup	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2017-09-12 11:30:56 -07:00
configs	ANDROID: binder: add hwbinder,vndbinder to BINDER_DEVICES.	2017-08-22 18:43:23 -07:00
debug	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h>	2017-03-02 08:42:34 +01:00
events	Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup	2017-09-06 22:25:25 -07:00
gcov	gcov: support GCC 7.1	2017-05-12 15:57:15 -07:00
irq	Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2017-09-12 11:25:56 -07:00
livepatch	livepatch: Fix stacking of patches with respect to RCU	2017-06-20 10:42:19 +02:00
locking	locking/rtmutex: replace top-waiter and pi_waiters leftmost caching	2017-09-08 18:26:49 -07:00
power	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2017-09-12 11:30:56 -07:00
printk	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk	2017-09-07 21:00:52 -07:00
rcu	treewide: make "nr_cpu_ids" unsigned	2017-09-08 18:26:48 -07:00
sched	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2017-09-13 12:22:32 -07:00
time	drivers/pps: aesthetic tweaks to PPS-related content	2017-09-08 18:26:51 -07:00
trace	Merge branch 'akpm' (patches from Andrew)	2017-09-09 10:30:07 -07:00
.gitignore
acct.c	sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h>	2017-03-02 08:42:39 +01:00
async.c	async: Adjust system_state checks	2017-05-23 10:01:37 +02:00
audit_fsnotify.c	Merge branch 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs	2017-05-03 11:05:15 -07:00
audit_tree.c	Merge branch 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs	2017-05-03 11:05:15 -07:00
audit_watch.c	audit/stable-4.13 PR 20170816	2017-08-16 16:48:34 -07:00
audit.c	audit: update the function comments	2017-09-05 09:46:59 -04:00
audit.h	audit: style fix	2017-06-12 18:07:43 -04:00
auditfilter.c	audit: kernel generated netlink traffic should have a portid of 0	2017-05-02 10:16:05 -04:00
auditsc.c	audit: update the function comments	2017-09-05 09:46:59 -04:00
backtracetest.c
bounds.c
capability.c
compat.c	Merge branch 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2017-07-06 20:57:13 -07:00
configs.c
context_tracking.c
cpu_pm.c	PM / CPU: replace raw_notifier with atomic_notifier	2017-07-31 13:09:49 +02:00
cpu.c	watchdog/hardlockup/perf: Prevent CPU hotplug deadlock	2017-09-14 11:41:05 +02:00
crash_core.c	kdump: protect vmcoreinfo data under the crash memory	2017-07-12 16:26:00 -07:00
crash_dump.c
cred.c	doc: ReSTify credentials.txt	2017-05-18 10:30:19 -06:00
delayacct.c	sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h>	2017-03-02 08:42:39 +01:00
dma.c
elfcore.c
exec_domain.c
exit.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace	2017-09-11 18:34:47 -07:00
extable.c	lib/extable.c: use bsearch() library function in search_extable()	2017-07-10 16:32:35 -07:00
fork.c	selinux/stable-4.14 PR 20170831	2017-09-12 13:21:00 -07:00
freezer.c
futex_compat.c
futex.c	futex: Remove duplicated code and fix undefined behaviour	2017-08-25 22:49:59 +02:00
groups.c	kernel/groups.c: use sort library function	2017-07-10 16:32:34 -07:00
hung_task.c	kernel/hung_task.c: defer showing held locks	2017-05-08 17:15:10 -07:00
irq_work.c
jump_label.c	jump_label: Provide hotplug context variants	2017-08-10 12:28:59 +02:00
kallsyms.c	kernel/kallsyms.c: replace all_var with IS_ENABLED(CONFIG_KALLSYMS_ALL)	2017-07-10 16:32:34 -07:00
kcmp.c	kcmp: add KCMP_EPOLL_TFD mode to compare epoll target files	2017-07-12 16:26:01 -07:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c	kcov: support compat processes	2017-09-08 18:26:51 -07:00
kexec_core.c	x86/mm, kexec: Allow kexec to be used with SME	2017-07-18 11:38:04 +02:00
kexec_file.c	kexec_file: adjust declaration of kexec_purgatory	2017-07-12 16:26:02 -07:00
kexec_internal.h	kexec_file: adjust declaration of kexec_purgatory	2017-07-12 16:26:02 -07:00
kexec.c	kdump: protect vmcoreinfo data under the crash memory	2017-07-12 16:26:00 -07:00
kmod.c	kmod: move #ifdef CONFIG_MODULES wrapper to Makefile	2017-09-08 18:26:51 -07:00
kprobes.c	kprobes: Ensure that jprobe probepoints are at function entry	2017-07-08 11:05:35 +02:00
ksysfs.c	kexec: move vmcoreinfo out of the kernel's .bss section	2017-07-12 16:25:59 -07:00
kthread.c	kernel/kthread.c: kthread_worker: don't hog the cpu	2017-08-31 16:33:15 -07:00
latencytop.c	sched/headers: Prepare to move sched_info_on() and force_schedstat_enabled() from <linux/sched.h> to <linux/sched/stat.h>	2017-03-02 08:42:39 +01:00
Makefile	kmod: move #ifdef CONFIG_MODULES wrapper to Makefile	2017-09-08 18:26:51 -07:00
memremap.c	mm/device-public-memory: device memory cache coherent with CPU	2017-09-08 18:26:46 -07:00
module_signing.c
module-internal.h
module.c	module: fix ddebug_remove_module()	2017-07-25 15:08:32 +02:00
notifier.c	kernel/notifier.c: simplify expression	2017-02-24 17:46:56 -08:00
nsproxy.c	perf: Add PERF_RECORD_NAMESPACES to include namespaces related info	2017-03-13 15:57:41 -03:00
padata.c	padata: Avoid nested calls to cpus_read_lock() in pcrypt_init_padata()	2017-05-26 10:10:37 +02:00
panic.c	locking/refcounts, x86/asm: Implement fast refcount overflow protection	2017-08-17 10:40:26 +02:00
params.c	boot/param: Move next_arg() function to lib/cmdline.c for later reuse	2017-04-18 10:37:13 +02:00
pid_namespace.c	userns,pidns: Verify the userns for new pid namespaces	2017-07-20 07:43:58 -05:00
pid.c	pids: make task_tgid_nr_ns() safe	2017-08-21 12:47:31 -07:00
profile.c	sched/headers: Prepare to move sched_info_on() and force_schedstat_enabled() from <linux/sched.h> to <linux/sched/stat.h>	2017-03-02 08:42:39 +01:00
ptrace.c	signal: Remove kernel interal si_code magic	2017-07-24 14:30:28 -05:00
range.c
reboot.c
relay.c	Merge branch 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2017-05-02 11:38:06 -07:00
resource.c
seccomp.c	seccomp: Switch from atomic_t to recount_t	2017-06-26 09:24:00 -07:00
signal.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace	2017-09-11 18:34:47 -07:00
smp.c	treewide: make "nr_cpu_ids" unsigned	2017-09-08 18:26:48 -07:00
smpboot.c	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h>	2017-03-02 08:42:35 +01:00
smpboot.h
softirq.c	sched/core: Remove 'task' parameter and rename tsk_restore_flags() to current_restore_flags()	2017-04-11 09:06:32 +02:00
stacktrace.c	stacktrace/x86: add function for detecting reliable stack traces	2017-03-08 09:18:02 +01:00
stop_machine.c	stop_machine: Provide stop_machine_cpuslocked()	2017-05-26 10:10:36 +02:00
sys_ni.c
sys.c	prctl: Allow local CAP_SYS_ADMIN changing exe_file	2017-07-20 07:46:07 -05:00
sysctl_binary.c	kernel/sysctl_binary.c: check name array length in deprecated_sysctl_warning()	2017-07-12 16:26:00 -07:00
sysctl.c	kernel/watchdog: split up config options	2017-07-12 16:26:02 -07:00
task_work.c	task_work: Replace spin_unlock_wait() with lock/unlock pair	2017-07-25 10:08:58 -07:00
taskstats.c	taskstats: add e/u/stime for TGID command	2017-05-08 17:15:12 -07:00
test_kprobes.c
torture.c	torture: Fix typo suppressing CPU-hotplug statistics	2017-07-25 13:04:45 -07:00
tracepoint.c	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h>	2017-03-02 08:42:35 +01:00
tsacct.c	sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h>	2017-03-02 08:42:39 +01:00
ucount.c	ucount: Remove the atomicity from ucount->count	2017-03-06 15:26:37 -06:00
uid16.c	sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h>	2017-03-02 08:42:31 +01:00
umh.c	kmod: split out umh code into its own file	2017-09-08 18:26:50 -07:00
up.c	smp: Avoid using two cache lines for struct call_single_data	2017-08-29 15:14:38 +02:00
user_namespace.c	userns,pidns: Verify the userns for new pid namespaces	2017-07-20 07:43:58 -05:00
user-return-notifier.c
user.c	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/user.h>	2017-03-02 08:42:29 +01:00
utsname_sysctl.c	sched/headers: Remove <linux/rwsem.h> from <linux/sched.h>	2017-03-03 01:45:36 +01:00
utsname.c	sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h>	2017-03-02 08:42:38 +01:00
watchdog_hld.c	watchdog/core: Remove the park_in_progress obfuscation	2017-09-14 11:41:05 +02:00
watchdog.c	watchdog/core: Remove the park_in_progress obfuscation	2017-09-14 11:41:05 +02:00
workqueue_internal.h
workqueue.c	Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq	2017-09-06 21:59:31 -07:00