linux/kernel
Viresh Kumar b5e995e671 nohz: Fix spurious periodic tick behaviour in low-res dynticks mode
When we reach the end of the tick handler, we unconditionally reschedule
the next tick to the next jiffy. Then on irq exit, the nohz code
overrides that setting if needed and defers the next tick as far away in
the future as possible.

Now in the best dynticks case, when we actually don't need any tick in
the future (ie: expires == KTIME_MAX), low-res and high-res behave
differently. What we want in this case is to cancel the next tick
programmed by the previous one. That's what we do in high-res mode. OTOH
we lack a low-res mode equivalent of hrtimer_cancel() so we simply don't
do anything in this case and the next tick remains scheduled to jiffies + 1.

As a result, in low-res mode, when the dynticks code determines that no
tick is needed in the future, we can recursively get a spurious tick
every jiffy because then the next tick is always reprogrammed from the
tick handler and is never cancelled. And this can happen indefinetly
until some subsystem actually needs a precise tick in the future and only
then we eventually overwrite the previous tick handler setting to defer
the next tick.

We are fixing this by introducing the ONESHOT_STOPPED mode which will
let us pause a clockevent when no further interrupt is needed. Meanwhile
we can't expect all drivers to support this new mode.

So lets reduce much of the symptoms by skipping the nohz-blind tick
rescheduling from the tick-handler when the CPU is in dynticks mode.
That tick rescheduling wrongly assumed periodicity and the low-res
dynticks code can't cancel such decision. This breaks the recursive (and
thus the worst) part of the problem. In the worst case now, we'll get
only one extra tick due to uncancelled tick scheduled before we entered
dynticks mode.

This also removes a needless clockevent write on idle ticks. Since those
clock write are usually considered to be slow, it's a general win.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2014-08-22 18:46:49 +02:00
..
bpf net: filter: split 'struct sk_filter' into socket and bpf parts 2014-08-02 15:03:58 -07:00
debug kdb: Use ktime_get_ts() 2014-06-12 16:18:45 +02:00
events mm: memcontrol: rewrite charge API 2014-08-08 15:57:17 -07:00
gcov kernel/gcov/fs.c: remove unnecessary null test before debugfs_remove 2014-08-08 15:57:24 -07:00
irq Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-08-05 17:38:45 -07:00
locking arch, locking: Ciao arch_mutex_cpu_relax() 2014-07-17 12:32:47 +02:00
power Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle' 2014-08-11 23:19:48 +02:00
printk printk: Add function to return log buffer address and size 2014-08-13 15:13:44 +10:00
rcu Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-08-04 15:55:08 -07:00
sched Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle' 2014-08-11 23:19:48 +02:00
time nohz: Fix spurious periodic tick behaviour in low-res dynticks mode 2014-08-22 18:46:49 +02:00
trace This contains a fix for two long standing bugs. Both of which are 2014-08-09 17:29:36 -07:00
.gitignore
acct.c kernel/acct.c: fix coding style warnings and errors 2014-08-08 15:57:27 -07:00
async.c
audit_tree.c
audit_watch.c
audit.c CAPABILITIES: remove undefined caps from all processes 2014-07-24 21:53:47 +10:00
audit.h
auditfilter.c kernel/auditfilter.c: replace count*size kmalloc by kcalloc 2014-08-06 18:01:12 -07:00
auditsc.c auditsc: audit_krule mask accesses need bounds checking 2014-06-10 08:44:40 -07:00
backtracetest.c kernel/backtracetest.c: replace no level printk by pr_info() 2014-06-04 16:54:14 -07:00
bounds.c page-cgroup: get rid of NR_PCG_FLAGS 2014-08-08 15:57:18 -07:00
capability.c CAPABILITIES: remove undefined caps from all processes 2014-07-24 21:53:47 +10:00
cgroup_freezer.c cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes 2014-07-15 11:05:09 -04:00
cgroup.c Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-08-04 10:11:28 -07:00
compat.c kernel/compat.c: use sizeof() instead of sizeof 2014-06-04 16:54:19 -07:00
configs.c
context_tracking.c x86/kprobes: Fix build errors and blacklist context_track_user 2014-06-14 09:07:44 +02:00
cpu_pm.c
cpu.c sched: Rework check_for_tasks() 2014-07-05 11:17:45 +02:00
cpuset.c Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-08-04 10:11:28 -07:00
crash_dump.c
cred.c
delayacct.c delayacct: Remove braindamaged type conversions 2014-07-23 10:18:06 -07:00
dma.c
elfcore.c
exec_domain.c kernel/exec_domain.c: code clean-up 2014-06-04 16:54:15 -07:00
exit.c kernel/exit.c: fix coding style warnings and errors 2014-08-08 15:57:22 -07:00
extable.c
fork.c seccomp: Replace BUG(!spin_is_locked()) with assert_spin_lock 2014-08-11 13:29:12 -07:00
freezer.c
futex_compat.c
futex.c futex: Simplify futex_lock_pi_atomic() and make it more robust 2014-06-21 22:26:24 +02:00
groups.c kernel/groups.c: remove return value of set_groups 2014-04-03 16:21:05 -07:00
hung_task.c kernel/hung_task.c: convert simple_strtoul to kstrtouint 2014-06-04 16:54:15 -07:00
irq_work.c irq_work: Remove BUG_ON in irq_work_run() 2014-07-05 11:17:26 +02:00
jump_label.c
kallsyms.c kernel/kallsyms.c: fix %pB when there's no symbol at the address 2014-08-08 15:57:18 -07:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNER 2014-07-16 14:57:13 +02:00
Kconfig.preempt
kexec.c kexec: verify the signature of signed PE bzImage 2014-08-08 15:57:33 -07:00
kmod.c signals: change wait_for_helper() to use kernel_sigaction() 2014-06-06 16:08:12 -07:00
kprobes.c kprobes: Fix "Failed to find blacklist" probing errors on ia64 and ppc64 2014-07-18 06:23:40 +02:00
ksysfs.c kobject: Make support for uevent_helper optional. 2014-04-25 12:00:49 -07:00
kthread.c kthread_work: wake up worker only when the worker is idle 2014-07-28 14:07:52 -04:00
latencytop.c kernel/latencytop.c: convert seq_printf to seq_puts 2014-06-04 16:54:15 -07:00
Makefile bin2c: move bin2c in scripts/basic 2014-08-08 15:57:32 -07:00
module_signing.c
module-internal.h
module.c module: Clean up ro/nx after early module load failures 2014-08-16 04:47:00 +09:30
notifier.c kprobes, notifier: Use NOKPROBE_SYMBOL macro in notifier 2014-04-24 10:26:39 +02:00
nsproxy.c namespaces: Use task_lock and not rcu to protect nsproxy 2014-07-29 18:08:50 -07:00
padata.c
panic.c panic: add TAINT_SOFTLOCKUP 2014-08-08 15:57:24 -07:00
params.c Add module param type 'ullong' 2014-07-17 22:07:37 +02:00
pid_namespace.c
pid.c
profile.c kernel/profile.c: use static const char instead of static char 2014-06-06 16:08:13 -07:00
ptrace.c sched: Remove proliferation of wait_on_bit() action functions 2014-07-16 15:10:39 +02:00
range.c
reboot.c kernel/reboot.c: convert simple_strtoul to kstrtoint 2014-06-04 16:54:15 -07:00
relay.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-04-12 14:49:50 -07:00
res_counter.c kernel/res_counter.c: replace simple_strtoull by kstrtoull 2014-06-04 16:54:15 -07:00
resource.c resource: provide new functions to walk through resources 2014-08-08 15:57:32 -07:00
seccomp.c seccomp: Replace BUG(!spin_is_locked()) with assert_spin_lock 2014-08-11 13:29:12 -07:00
signal.c Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
smp.c kernel/smp.c:on_each_cpu_cond(): fix warning in fallback path 2014-08-06 18:01:22 -07:00
smpboot.c
smpboot.h
softirq.c Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu 2014-05-22 11:36:10 +02:00
stacktrace.c
stop_machine.c kernel/stop_machine.c: kernel-doc warning fix 2014-06-04 16:54:15 -07:00
sys_ni.c kexec: new syscall kexec_file_load() declaration 2014-08-08 15:57:32 -07:00
sys.c sched: move no_new_privs into new atomic flags 2014-07-18 12:13:38 -07:00
sysctl_binary.c ipv6: Allow accepting RA from local IP addresses. 2014-07-01 12:16:24 -07:00
sysctl.c mm, hugetlb: remove hugetlb_zero and hugetlb_infinity 2014-08-06 18:01:19 -07:00
system_certificates.S
system_keyring.c KEYS: validate certificate trust only with builtin keys 2014-07-17 09:35:17 -04:00
task_work.c
taskstats.c
test_kprobes.c kernel/test_kprobes.c: use current logging functions 2014-08-08 15:57:18 -07:00
torture.c torture: Avoid format string leak to thead name 2014-07-07 10:12:56 -07:00
tracepoint.c tracing: syscall_regfunc() should not skip kernel threads 2014-06-21 00:15:26 -04:00
tsacct.c sched: Make task->start_time nanoseconds based 2014-07-23 10:18:05 -07:00
uid16.c
up.c
user_namespace.c proc: constify seq_operations 2014-08-08 15:57:22 -07:00
user-return-notifier.c
user.c kernel/user.c: drop unused field 'files' from user_struct 2014-06-04 16:54:16 -07:00
utsname_sysctl.c sysctl: convert use of typedef ctl_table to struct ctl_table 2014-06-06 16:08:16 -07:00
utsname.c namespaces: Use task_lock and not rcu to protect nsproxy 2014-07-29 18:08:50 -07:00
watchdog.c panic: add TAINT_SOFTLOCKUP 2014-08-08 15:57:24 -07:00
workqueue_internal.h workqueue: rename manager_mutex to attach_mutex 2014-05-20 10:59:32 -04:00
workqueue.c Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2014-08-04 10:09:27 -07:00