linux/kernel/time
Feng Tang 2ed08e4bc5 clocksource: Scale the watchdog read retries automatically
On a 8-socket server the TSC is wrongly marked as 'unstable' and disabled
during boot time on about one out of 120 boot attempts:

    clocksource: timekeeping watchdog on CPU227: wd-tsc-wd excessive read-back delay of 153560ns vs. limit of 125000ns,
    wd-wd read-back delay only 11440ns, attempt 3, marking tsc unstable
    tsc: Marking TSC unstable due to clocksource watchdog
    TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
    sched_clock: Marking unstable (119294969739, 159204297)<-(125446229205, -5992055152)
    clocksource: Checking clocksource tsc synchronization from CPU 319 to CPUs 0,99,136,180,210,542,601,896.
    clocksource: Switched to clocksource hpet

The reason is that for platform with a large number of CPUs, there are
sporadic big or huge read latencies while reading the watchog/clocksource
during boot or when system is under stress work load, and the frequency and
maximum value of the latency goes up with the number of online CPUs.

The cCurrent code already has logic to detect and filter such high latency
case by reading the watchdog twice and checking the two deltas. Due to the
randomness of the latency, there is a low probabilty that the first delta
(latency) is big, but the second delta is small and looks valid. The
watchdog code retries the readouts by default twice, which is not
necessarily sufficient for systems with a large number of CPUs.

There is a command line parameter 'max_cswd_read_retries' which allows to
increase the number of retries, but that's not user friendly as it needs to
be tweaked per system. As the number of required retries is proportional to
the number of online CPUs, this parameter can be calculated at runtime.

Scale and enlarge the number of retries according to the number of online
CPUs and remove the command line parameter completely.

[ tglx: Massaged change log and comments ]

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jin Wang <jin1.wang@intel.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Waiman Long <longman@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20240221060859.1027450-1-feng.tang@intel.com
2024-02-21 12:00:42 +01:00
..
alarmtimer.c alarmtimer: Use maximum alarm time for suspend 2023-10-09 15:03:28 +02:00
clockevents.c clockevents: Make clockevents_subsys const 2024-02-07 15:11:24 +01:00
clocksource-wdtest.c clocksource: Scale the watchdog read retries automatically 2024-02-21 12:00:42 +01:00
clocksource.c clocksource: Scale the watchdog read retries automatically 2024-02-21 12:00:42 +01:00
hrtimer.c Merge tag 'v6.8-rc5' into timers/core, to resolve conflict 2024-02-19 22:27:57 +01:00
itimer.c time: Prevent undefined behaviour in timespec64_to_ns() 2020-10-26 11:48:11 +01:00
jiffies.c clocksource: Make clocksource watchdog test safe for slow-HZ systems 2021-08-28 17:01:32 +02:00
Kconfig clocksource: Loosen clocksource watchdog constraints 2023-01-03 20:43:45 -08:00
Makefile time: Improve performance of time64_to_tm() 2021-06-24 11:51:59 +02:00
namespace.c vdso/timens: Refactor copy-pasted find_timens_vvar_page() helper into one copy 2022-12-01 11:35:40 +01:00
ntp_internal.h ntp: Make the RTC synchronization more reliable 2020-12-11 10:40:52 +01:00
ntp.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00
posix-clock.c posix-clock: introduce posix_clock_context concept 2023-10-15 20:07:52 +01:00
posix-cpu-timers.c posix-cpu-timers: Implement the missing timer_wait_running callback 2023-04-21 15:34:33 +02:00
posix-stubs.c posix-timers: Get rid of [COMPAT_]SYS_NI() uses 2023-12-20 21:30:27 -08:00
posix-timers.c posix-timers: Refer properly to CONFIG_HIGH_RES_TIMERS 2023-06-18 22:41:53 +02:00
posix-timers.h
sched_clock.c time/sched_clock: Provide sched_clock_noinstr() 2023-06-05 21:11:04 +02:00
test_udelay.c time/debug: Fix memory leak with using debugfs_lookup() 2023-02-09 20:12:27 +01:00
tick-broadcast-hrtimer.c time/tick-broadcast: Remove RCU_NONIDLE() usage 2023-01-13 11:48:16 +01:00
tick-broadcast.c tick/broadcast: Make broadcast device replacement work correctly 2023-05-08 23:18:16 +02:00
tick-common.c tick/common: Align tick period during sched_timer setup 2023-06-16 20:45:28 +02:00
tick-internal.h time: Make sysfs_get_uname() function visible in header 2023-11-22 14:12:10 +01:00
tick-legacy.c timekeeping: remove xtime_update 2020-10-30 21:57:07 +01:00
tick-oneshot.c time: Fix various kernel-doc problems 2023-01-03 11:07:58 +01:00
tick-sched.c tick/sched: Add function description for tick_nohz_next_event() 2024-02-19 09:38:00 +01:00
tick-sched.h timers/nohz: Protect idle/iowait sleep time under seqcount 2023-04-18 16:35:12 +02:00
time_test.c time/kunit: Use correct format specifier 2024-02-21 12:00:42 +01:00
time.c time: add kernel-doc in time.c 2023-07-14 13:47:07 -06:00
timeconst.bc
timeconv.c time: Improve performance of time64_to_tm() 2021-06-24 11:51:59 +02:00
timecounter.c time/timecounter: Mark 1st argument of timecounter_cyc2time() as const 2021-04-16 21:03:50 +02:00
timekeeping_debug.c
timekeeping_internal.h timekeeping/vsyscall: Provide vdso_update_begin/end() 2020-08-06 10:57:30 +02:00
timekeeping.c timekeeping: Fix cross-timestamp interpolation for non-x86 2024-02-19 12:18:51 +01:00
timekeeping.h asm-generic: cross-architecture timer cleanup 2020-12-16 00:07:17 -08:00
timer_list.c timer_list: Print name of per-cpu wakeup device 2021-05-31 17:04:49 +02:00
timer.c timers: Add struct member description for timer_base 2024-02-19 09:38:00 +01:00
vsyscall.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00