linux/arch
Nicolai Stange 1a9e4c564a x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents frequency roundoff error
I noticed the following misbehavior on certain Intel systems: with a
single task running on a NOHZ CPU on an Intel Haswell, I got not just the
one expected local_timer APIC interrupt per second, but at least two. (!)

Further tracing showed that the first one preceded the programmed deadline
by up to ~50us and hence did nothing except reprogram the TSC deadline
clockevent device to trigger again shortly thereafter.

The reason for this is imprecise calibration: the timeout we program into
the APIC results in 'too short' timer interrupts. The core (hr)timer code
notices this (because it has a precise ktime source and sees the short
interrupt) and fixes it up by programming an additional very short
interrupt period.

This is obviously suboptimal.

The reason for the imprecise calibration is twofold, and this patch
fixes the first reason:

In setup_APIC_timer(), the registered clockevent device's frequency
is calculated by first dividing tsc_khz by TSC_DIVISOR and multiplying
it with 1000 afterwards:

  (tsc_khz / TSC_DIVISOR) * 1000

The multiplication with 1000 is done for converting from kHz to Hz and the
division by TSC_DIVISOR is carried out in order to make sure that the final
result fits into a u32.

However, with the order given in this calculation, the roundoff error
introduced by the division gets magnified by a factor of 1000 by the
following multiplication.

To fix it, reverse the order of the division and the multiplication:

  (tsc_khz * 1000) / TSC_DIVISOR

This alone already reduces the roundoff error.
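
The effect of the two orderings can be sketched in plain C. This is only an
illustration and not the kernel code: the function names are hypothetical,
the sample tsc_khz value is made up, and 64-bit arithmetic is used so that
the intermediate product cannot overflow.

```c
#include <stdint.h>

/* Old order: the division truncates first, then the error is scaled by 1000. */
uint64_t freq_divide_first(uint64_t tsc_khz, uint64_t divisor)
{
	return (tsc_khz / divisor) * 1000;
}

/* Reordered: multiply first, so only the final division truncates. */
uint64_t freq_multiply_first(uint64_t tsc_khz, uint64_t divisor)
{
	return (tsc_khz * 1000) / divisor;
}
```

For an assumed tsc_khz of 2593993 and a divisor of 32, the first variant
yields 81062000 and the second 81062281, so dividing first loses 281 Hz in
the scaled frequency.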

Furthermore, if TSC_DIVISOR divides 1000, associativity holds:

  (tsc_khz * 1000) / TSC_DIVISOR = tsc_khz * (1000 / TSC_DIVISOR)

and thus, the roundoff error even vanishes and the whole operation can be
carried out within 32 bits.

The powers of two that divide 1000 are 2, 4 and 8. A value of 8 for
TSC_DIVISOR still allows for TSC frequencies up to
2^32 / 10^9ns * 8 = 34.4GHz, which is far more than anything to be
expected in the coming years.
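
A minimal sketch of the resulting arithmetic, assuming TSC_DIVISOR is 8 as
proposed here. The helper names are hypothetical and exist only for this
illustration; they are not taken from the kernel sources.

```c
#include <stdint.h>

#define TSC_DIVISOR 8

/* 1000 / 8 == 125 exactly, so this equals (tsc_khz * 1000) / TSC_DIVISOR. */
uint32_t tsc_clockevent_freq(uint32_t tsc_khz)
{
	return tsc_khz * (1000 / TSC_DIVISOR);
}

/* Largest tsc_khz whose scaled frequency still fits into a u32. */
uint32_t max_tsc_khz(void)
{
	return UINT32_MAX / (1000 / TSC_DIVISOR);
}
```

max_tsc_khz() evaluates to 34359738, i.e. roughly the 34.4GHz limit stated
above, and no intermediate value exceeds 32 bits below that limit.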

Thus, replace the current TSC_DIVISOR value of 32 by 8 and reverse the
order of the division and the multiplication in the calculation of the
registered clockevent device's frequency.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Christopher S. Hall <christopher.s.hall@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20160714152255.18295-2-nicstange@gmail.com
[ Improved changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10 12:37:38 +02:00
..
alpha mm: do not pass mm_struct into handle_mm_fault 2016-07-26 16:19:19 -07:00
arc DeviceTree update for 4.8: 2016-07-30 11:32:01 -07:00
arm DeviceTree update for 4.8: 2016-07-30 11:32:01 -07:00
arm64 DeviceTree update for 4.8: 2016-07-30 11:32:01 -07:00
avr32 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32 2016-07-29 13:09:55 -07:00
blackfin Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-29 13:55:30 -07:00
c6x c6x: Remove unnecessary of_platform_populate with default match table 2016-06-23 14:59:39 -05:00
cris DeviceTree update for 4.8: 2016-07-30 11:32:01 -07:00
frv mm: do not pass mm_struct into handle_mm_fault 2016-07-26 16:19:19 -07:00
h8300 locking/atomic: Remove linux/atomic.h:atomic_fetch_or() 2016-06-16 10:48:32 +02:00
hexagon Merge branch 'akpm' (patches from Andrew) 2016-07-26 19:55:54 -07:00
ia64 Merge branch 'akpm' (patches from Andrew) 2016-07-26 19:55:54 -07:00
m32r mm: do not pass mm_struct into handle_mm_fault 2016-07-26 16:19:19 -07:00
m68k mm: do not pass mm_struct into handle_mm_fault 2016-07-26 16:19:19 -07:00
metag DeviceTree update for 4.8: 2016-07-30 11:32:01 -07:00
microblaze Merge branch 'akpm' (patches from Andrew) 2016-07-26 19:55:54 -07:00
mips DeviceTree update for 4.8: 2016-07-30 11:32:01 -07:00
mn10300 mm: do not pass mm_struct into handle_mm_fault 2016-07-26 16:19:19 -07:00
nios2 DeviceTree update for 4.8: 2016-07-30 11:32:01 -07:00
openrisc Merge branch 'akpm' (patches from Andrew) 2016-07-26 19:55:54 -07:00
parisc Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-07-29 17:38:46 -07:00
powerpc DeviceTree update for 4.8: 2016-07-30 11:32:01 -07:00
s390 Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit 2016-07-29 17:54:17 -07:00
score mm: do not pass mm_struct into handle_mm_fault 2016-07-26 16:19:19 -07:00
sh DeviceTree update for 4.8: 2016-07-30 11:32:01 -07:00
sparc sparc64: Trim page tables for 8M hugepages 2016-07-29 10:49:16 -07:00
tile Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-07-29 17:38:46 -07:00
um Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-07-29 17:38:46 -07:00
unicore32 New LED class driver: 2016-07-27 14:03:52 -07:00
x86 x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents frequency roundoff error 2016-08-10 12:37:38 +02:00
xtensa DeviceTree update for 4.8: 2016-07-30 11:32:01 -07:00
.gitignore
Kconfig Clarify naming of thread info/stack allocators 2016-06-24 15:09:37 -07:00