linux/kernel
john stultz 6c9bacb41c time: catch xtime_nsec underflows and fix them
Impact: fix time warp bug

Alex Shi, along with Yanmin Zhang have been noticing occasional time
inconsistencies recently. Through their great diagnosis, they found that
the xtime_nsec value used in update_wall_time was occasionally going
negative. After looking through the code for awhile, I realized we have
the possibility for an underflow when three conditions are met in
update_wall_time():

  1) We have accumulated a second's worth of nanoseconds, so we
     incremented xtime.tv_sec and appropriately decrement xtime_nsec.
     (This doesn't cause xtime_nsec to go negative, but it can cause it
      to be small).

  2) The remaining offset value is large, but just slightly less then
     cycle_interval.

  3) clocksource_adjust() is speeding up the clock, causing a
     corrective amount (compensating for the increase in the multiplier
     being multiplied against the unaccumulated offset value) to be
     subtracted from xtime_nsec.

This can cause xtime_nsec to underflow.

Unfortunately, since we notify the NTP subsystem via second_overflow()
whenever we accumulate a full second, and this effects the error
accumulation that has already occured, we cannot simply revert the
accumulated second from xtime nor move the second accumulation to after
the clocksource_adjust call without a change in behavior.

This leaves us with (at least) two options:

1) Simply return from clocksource_adjust() without making a change if we
   notice the adjustment would cause xtime_nsec to go negative.

This would work, but I'm concerned that if a large adjustment was needed
(due to the error being large), it may be possible to get stuck with an
ever increasing error that becomes too large to correct (since it may
always force xtime_nsec negative). This may just be paranoia on my part.

2) Catch xtime_nsec if it is negative, then add back the amount its
   negative to both xtime_nsec and the error.

This second method is consistent with how we've handled earlier rounding
issues, and also has the benefit that the error being added is always in
the oposite direction also always equal or smaller then the correction
being applied. So the risk of a corner case where things get out of
control is lessened.

This patch fixes bug 11970, as tested by Yanmin Zhang
http://bugzilla.kernel.org/show_bug.cgi?id=11970

Reported-by: alex.shi@intel.com
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Acked-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-04 08:43:02 +01:00
..
irq irq: make variable static 2008-10-22 07:37:17 +02:00
power suspend: use WARN not WARN_ON to print the message 2008-11-18 08:07:36 -08:00
time time: catch xtime_nsec underflows and fix them 2008-12-04 08:43:02 +01:00
trace Merge branch 'tip/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent 2008-11-19 09:00:50 +01:00
.gitignore
acct.c tty: Fix abusers of current->sighand->tty 2008-10-13 09:51:42 -07:00
audit_tree.c Fix inotify watch removal/umount races 2008-11-15 12:26:44 -08:00
audit.c [PATCH] Fix the bug of using AUDIT_STATUS_RATE_LIMIT when set fail, no error output. 2008-08-01 12:15:16 -04:00
audit.h
auditfilter.c Fix inotify watch removal/umount races 2008-11-15 12:26:44 -08:00
auditsc.c tty: Fix abusers of current->sighand->tty 2008-10-13 09:51:42 -07:00
backtracetest.c backtrace: replace timer with tasklet + completions 2008-06-27 18:09:16 +02:00
bounds.c
capability.c security: Fix setting of PF_SUPERPRIV by __capable() 2008-08-14 22:59:43 +10:00
cgroup_debug.c cgroups: fix probable race with put_css_set[_taskexit] and find_css_set 2008-10-20 08:52:38 -07:00
cgroup_freezer.c freezer_cg: disable writing freezer.state of root cgroup 2008-11-12 17:17:16 -08:00
cgroup.c cgroups: fix a serious bug in cgroupstats 2008-11-19 18:50:00 -08:00
compat.c Merge branches 'timers/clocksource', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/debug' into v28-timers-for-linus 2008-10-20 13:14:06 +02:00
configs.c kernel/configs.c: remove useless comments 2008-10-20 08:52:34 -07:00
cpu.c cpumask: introduce new API, without changing anything 2008-11-06 09:05:33 +01:00
cpuset.c cpuset: update top cpuset's mems after adding a node 2008-11-19 18:49:58 -08:00
delayacct.c per-task-delay-accounting: update taskstats for memory reclaim delay 2008-07-25 10:53:47 -07:00
dma-coherent.c dma-coherent: export dma_[alloc|release]_from_coherent methods 2008-08-22 08:34:53 +02:00
dma.c kernel/dma.c: remove a CVS keyword 2008-10-16 11:21:30 -07:00
exec_domain.c proc: move /proc/execdomains to kernel/exec_domain.c 2008-10-23 14:30:41 +04:00
exit.c Move "exit_robust_list" into mm_release() 2008-11-15 10:20:36 -08:00
extable.c
fork.c Move "exit_robust_list" into mm_release() 2008-11-15 10:20:36 -08:00
freezer.c freezer_cg: use thaw_process() in unfreeze_cgroup() 2008-10-30 11:38:45 -07:00
futex_compat.c
futex.c hrtimer: make the futex() system call use the per process slack value 2008-09-11 07:17:00 -07:00
hrtimer.c hrtimer: clean up unused callback modes 2008-11-12 09:54:40 +01:00
itimer.c timers: fix itimer/many thread hang 2008-09-14 16:25:35 +02:00
kallsyms.c sprint_symbol(): use less stack 2008-11-19 18:49:58 -08:00
Kconfig.freezer container freezer: implement freezer cgroup subsystem 2008-10-20 08:52:34 -07:00
Kconfig.hz sched: fix SCHED_HRTICK dependency 2008-07-28 14:37:38 +02:00
Kconfig.preempt
kexec.c kexec: fix crash_save_vmcoreinfo_init build problem 2008-10-20 15:28:50 -07:00
kfifo.c
kgdb.c kgdb: call touch_softlockup_watchdog on resume 2008-10-06 13:50:59 -05:00
kmod.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus 2008-10-16 12:38:34 -07:00
kprobes.c kernel/kprobes.c: don't pad kretprobe_table_locks[] on uniprocessor builds 2008-11-12 17:17:17 -08:00
ksysfs.c profiling: dynamically enable readprofile at runtime 2008-10-16 11:21:31 -07:00
kthread.c Merge branch 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-10-20 13:35:07 -07:00
latencytop.c
lockdep_internals.h lockdep: build fix 2008-08-13 12:55:10 +02:00
lockdep_proc.c lockstat: fix numerical output rounding error 2008-08-26 10:37:46 +02:00
lockdep.c lockdep: fix irqs on/off ip tracing 2008-10-28 11:19:07 +01:00
Makefile Remove -mno-spe flags as they dont belong 2008-11-17 13:24:35 -08:00
marker.c markers: bit-field is not thread-safe nor smp-safe 2008-10-14 10:38:45 +02:00
module.c Merge branch 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc 2008-10-23 12:04:37 -07:00
mutex-debug.c mutex-debug: check mutex magic before owner 2008-05-16 16:53:35 +02:00
mutex-debug.h
mutex.c locking: fix mutex @key parameter kernel-doc notation 2008-07-28 18:12:36 +02:00
mutex.h
notifier.c ftrace: ignore functions that cannot be kprobe-ed 2008-10-14 10:34:22 +02:00
ns_cgroup.c cgroup_clone: use pid of newly created task for new cgroup 2008-07-25 10:53:37 -07:00
nsproxy.c removed unused #include <linux/version.h>'s 2008-08-23 12:14:12 -07:00
panic.c Make panic= and panic_on_oops into core_params 2008-10-22 10:00:25 +11:00
params.c Fix compile warning in kernel/params.c 2008-10-23 12:09:00 -07:00
pid_namespace.c pid_ns: (BUG 11391) change ->child_reaper when init->group_leader exits 2008-09-02 19:21:38 -07:00
pid.c pidns: remove now unused find_pid function. 2008-07-25 10:53:45 -07:00
pm_qos_params.c pm_qos_requirement might sleep 2008-09-02 19:21:40 -07:00
posix-cpu-timers.c posix-cpu-timers: fix clock_gettime with CLOCK_PROCESS_CPUTIME_ID 2008-11-24 16:41:40 +01:00
posix-timers.c Merge branch 'timers/range-hrtimers' into v28-range-hrtimers-for-linus-v2 2008-10-22 09:48:06 +02:00
printk.c printk: remove unused code from kernel/printk.c 2008-10-23 21:54:29 +02:00
profile.c kernel/profile.c: fix section mismatch warning 2008-11-18 08:49:24 +01:00
ptrace.c make ptrace_untrace() static 2008-10-20 08:52:39 -07:00
rcuclassic.c rcu: RCU-based detection of stalled CPUs for Classic RCU, fix 2008-10-03 10:41:00 +02:00
rcupdate.c rcupdate: fix bug of rcu_barrier*() 2008-10-21 15:59:53 +02:00
rcupreempt_trace.c rcu: trace fix possible mem-leak 2008-08-15 17:54:40 +02:00
rcupreempt.c byteorder: remove direct includes of linux/byteorder/swab[b].h 2008-10-20 08:52:40 -07:00
rcutorture.c byteorder: remove direct includes of linux/byteorder/swab[b].h 2008-10-20 12:51:53 -07:00
relay.c relay: fix cpu offline problem 2008-11-18 15:08:56 +01:00
res_counter.c cgroup files: convert res_counter_write() to be a cgroups write_string() handler 2008-07-25 10:53:36 -07:00
resource.c reserve_region_with_split: Fix GFP_KERNEL usage under spinlock 2008-11-01 09:53:58 -07:00
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c sysdev: Pass the attribute to the low level sysdev show/store function 2008-07-21 21:55:02 -07:00
rtmutex.c hrtimer: convert kernel/* to the new hrtimer apis 2008-09-05 21:35:13 -07:00
rtmutex.h
rwsem.c
sched_clock.c sched_clock: prevent scd->clock from moving backwards 2008-10-10 11:17:04 +02:00
sched_cpupri.c sched: use a 2-d bitmap for searching lowest-pri CPU 2008-06-06 15:19:28 +02:00
sched_cpupri.h sched: fix the cpuprio count really 2008-06-06 15:19:44 +02:00
sched_debug.c sched: fix kernel warning on /proc/sched_debug access 2008-11-16 08:07:15 +01:00
sched_fair.c sched: release buddies on yield 2008-11-11 11:57:22 +01:00
sched_features.h sched: backward looking buddy 2008-11-05 10:30:14 +01:00
sched_idletask.c sched: add CONFIG_SMP consistency 2008-10-22 10:01:52 +02:00
sched_rt.c Merge commit 'v2.6.28-rc1' into sched/urgent 2008-10-24 12:48:46 +02:00
sched_stats.h sched, signals: fix the racy usage of ->signal in account_group_xxx/run_posix_cpu_timers 2008-11-17 16:49:35 +01:00
sched.c cpuset: fix regression when failed to generate sched domains 2008-11-18 08:44:51 +01:00
seccomp.c
semaphore.c semaphore: __down_common: use signal_pending_state() 2008-08-05 14:33:47 -07:00
signal.c 'kill sig -1' must only apply to caller's namespace 2008-10-30 11:38:46 -07:00
smp.c generic-ipi: fix the smp_mb() placement 2008-11-06 08:41:56 +01:00
softirq.c irq: call __irq_enter() before calling the tick_idle_check 2008-11-10 22:36:39 +01:00
softlockup.c Make the taint flags reliable 2008-10-16 11:21:31 -07:00
spinlock.c lockdep: spin_lock_nest_lock(), checkpatch fixes 2008-08-13 13:56:51 +02:00
srcu.c
stacktrace.c stacktrace: fix modular build, export print_stack_trace and save_stack_trace 2008-06-30 09:20:55 +02:00
stop_machine.c stop_machine: fix race with return value (fixes Bug #11989) 2008-11-16 15:09:52 -08:00
sys_ni.c reintroduce accept4 2008-11-19 18:49:57 -08:00
sys.c Merge branch 'timers/range-hrtimers' into v28-range-hrtimers-for-linus-v2 2008-10-22 09:48:06 +02:00
sysctl_check.c sysctl: check for bogus modes 2008-07-25 10:53:45 -07:00
sysctl.c Merge commit 'v2.6.28-rc2' into tracing/urgent 2008-10-27 10:50:54 +01:00
taskstats.c taskstats: remove initialization of static per-cpu variable 2008-07-25 10:53:47 -07:00
test_kprobes.c
time.c select: add a timespec_add_safe() function 2008-09-05 21:34:57 -07:00
timeconst.pl Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timer.c Add round_jiffies_up and related routines 2008-11-06 08:42:48 +01:00
tracepoint.c tracepoint: check if the probe has been registered 2008-10-27 16:45:46 +01:00
tsacct.c task IO accounting: move all IO statistics in struct task_io_accounting 2008-07-27 16:12:28 -07:00
uid16.c
user_namespace.c removed unused #include <linux/version.h>'s 2008-08-23 12:14:12 -07:00
user.c sched: rt-bandwidth for user grouping interface 2008-08-19 13:10:09 +02:00
utsname_sysctl.c sysctl: simplify ->strategy 2008-10-16 11:21:47 -07:00
utsname.c removed unused #include <linux/version.h>'s 2008-08-23 12:14:12 -07:00
wait.c wait: kill is_sync_wait() 2008-10-16 11:21:31 -07:00
workqueue.c cpumask: introduce new API, without changing anything 2008-11-06 09:05:33 +01:00