linux/kernel
Rik van Riel e78c349679 time, signal: Protect resource use statistics with seqlock
Both times() and clock_gettime(CLOCK_PROCESS_CPUTIME_ID) have scalability
issues on large systems, due to both functions being serialized with a
lock.

The lock protects against reporting a wrong value, due to a thread in the
task group exiting, its statistics reporting up to the signal struct, and
that exited task's statistics being counted twice (or not at all).

Protecting that with a lock results in times() and clock_gettime() being
completely serialized on large systems.

This can be fixed by using a seqlock around the events that gather and
propagate statistics. As an additional benefit, the protection code can
be moved into thread_group_cputime(), slightly simplifying the calling
functions.

In the case of posix_cpu_clock_get_task() things can be simplified a
lot, because the calling function already ensures that the task sticks
around, and the rest is now taken care of in thread_group_cputime().

This way the statistics reporting code can run lockless.

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Daeseok Youn <daeseok.youn@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guillaume Morin <guillaume@morinfr.org>
Cc: Ionut Alexa <ionut.m.alexa@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Michal Schmidt <mschmidt@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: umgwanakikbuti@gmail.com
Cc: fweisbec@gmail.com
Cc: srao@redhat.com
Cc: lwoodman@redhat.com
Cc: atheurer@redhat.com
Link: http://lkml.kernel.org/r/20140816134010.26a9b572@annuminas.surriel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-08 08:17:01 +02:00
..
bpf net: filter: split 'struct sk_filter' into socket and bpf parts 2014-08-02 15:03:58 -07:00
debug kdb: Use ktime_get_ts() 2014-06-12 16:18:45 +02:00
events Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-08-24 16:16:55 -07:00
gcov kernel/gcov/fs.c: remove unnecessary null test before debugfs_remove 2014-08-08 15:57:24 -07:00
irq irq: Export handle_fasteoi_irq 2014-08-25 21:13:30 +02:00
locking arch, locking: Ciao arch_mutex_cpu_relax() 2014-07-17 12:32:47 +02:00
power PM / sleep: Fix test_suspend= command line option 2014-09-03 01:21:03 +02:00
printk printk: Add function to return log buffer address and size 2014-08-13 15:13:44 +10:00
rcu rcu: Make nocb leader kthreads process pending callbacks after spawning 2014-08-28 05:59:59 -07:00
sched time, signal: Protect resource use statistics with seqlock 2014-09-08 08:17:01 +02:00
time time, signal: Protect resource use statistics with seqlock 2014-09-08 08:17:01 +02:00
trace trace: Fix epoll hang when we race with new entries 2014-08-25 20:18:11 -04:00
.gitignore
acct.c kernel/acct.c: fix coding style warnings and errors 2014-08-08 15:57:27 -07:00
async.c
audit_tree.c inotify: Fix reporting of cookies for inotify events 2014-02-18 11:17:17 +01:00
audit_watch.c inotify: Fix reporting of cookies for inotify events 2014-02-18 11:17:17 +01:00
audit.c CAPABILITIES: remove undefined caps from all processes 2014-07-24 21:53:47 +10:00
audit.h audit: Use struct net not pid_t to remember the network namespce to reply in 2014-03-20 10:10:53 -04:00
auditfilter.c kernel/auditfilter.c: replace count*size kmalloc by kcalloc 2014-08-06 18:01:12 -07:00
auditsc.c auditsc: audit_krule mask accesses need bounds checking 2014-06-10 08:44:40 -07:00
backtracetest.c kernel/backtracetest.c: replace no level printk by pr_info() 2014-06-04 16:54:14 -07:00
bounds.c page-cgroup: get rid of NR_PCG_FLAGS 2014-08-08 15:57:18 -07:00
capability.c CAPABILITIES: remove undefined caps from all processes 2014-07-24 21:53:47 +10:00
cgroup_freezer.c cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes 2014-07-15 11:05:09 -04:00
cgroup.c Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-08-04 10:11:28 -07:00
compat.c compat: nanosleep: Clarify error handling 2014-09-06 12:58:18 +02:00
configs.c
context_tracking.c x86/kprobes: Fix build errors and blacklist context_track_user 2014-06-14 09:07:44 +02:00
cpu_pm.c
cpu.c sched: Rework check_for_tasks() 2014-07-05 11:17:45 +02:00
cpuset.c Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-08-04 10:11:28 -07:00
crash_dump.c
cred.c
delayacct.c delayacct: Remove braindamaged type conversions 2014-07-23 10:18:06 -07:00
dma.c
elfcore.c
exec_domain.c kernel/exec_domain.c: code clean-up 2014-06-04 16:54:15 -07:00
exit.c time, signal: Protect resource use statistics with seqlock 2014-09-08 08:17:01 +02:00
extable.c asmlinkage: Make main_extable_sort_needed visible 2014-02-13 18:13:22 -08:00
fork.c time, signal: Protect resource use statistics with seqlock 2014-09-08 08:17:01 +02:00
freezer.c
futex_compat.c compat: Get rid of (get|put)_compat_time(val|spec) 2014-02-02 14:09:12 -08:00
futex.c futex: Simplify futex_lock_pi_atomic() and make it more robust 2014-06-21 22:26:24 +02:00
groups.c kernel/groups.c: remove return value of set_groups 2014-04-03 16:21:05 -07:00
hung_task.c kernel/hung_task.c: convert simple_strtoul to kstrtouint 2014-06-04 16:54:15 -07:00
irq_work.c irq_work: Remove BUG_ON in irq_work_run() 2014-07-05 11:17:26 +02:00
jump_label.c
kallsyms.c kernel/kallsyms.c: fix %pB when there's no symbol at the address 2014-08-08 15:57:18 -07:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNER 2014-07-16 14:57:13 +02:00
Kconfig.preempt
kexec.c kexec: create a new config option CONFIG_KEXEC_FILE for new syscall 2014-08-29 16:28:16 -07:00
kmod.c signals: change wait_for_helper() to use kernel_sigaction() 2014-06-06 16:08:12 -07:00
kprobes.c kprobes: Skip kretprobe hit in NMI context to avoid deadlock 2014-08-08 10:38:04 +02:00
ksysfs.c kobject: Make support for uevent_helper optional. 2014-04-25 12:00:49 -07:00
kthread.c kthread_work: wake up worker only when the worker is idle 2014-07-28 14:07:52 -04:00
latencytop.c kernel/latencytop.c: convert seq_printf to seq_puts 2014-06-04 16:54:15 -07:00
Makefile bin2c: move bin2c in scripts/basic 2014-08-08 15:57:32 -07:00
module_signing.c
module-internal.h
module.c module: Clean up ro/nx after early module load failures 2014-08-16 04:47:00 +09:30
notifier.c kprobes, notifier: Use NOKPROBE_SYMBOL macro in notifier 2014-04-24 10:26:39 +02:00
nsproxy.c namespaces: Use task_lock and not rcu to protect nsproxy 2014-07-29 18:08:50 -07:00
padata.c
panic.c panic: add TAINT_SOFTLOCKUP 2014-08-08 15:57:24 -07:00
params.c Add module param type 'ullong' 2014-07-17 22:07:37 +02:00
pid_namespace.c pid_namespace: pidns_get() should check task_active_pid_ns() != NULL 2014-04-02 16:20:21 -07:00
pid.c
profile.c kernel/profile.c: use static const char instead of static char 2014-06-06 16:08:13 -07:00
ptrace.c sched: Remove proliferation of wait_on_bit() action functions 2014-07-16 15:10:39 +02:00
range.c
reboot.c kernel/reboot.c: convert simple_strtoul to kstrtoint 2014-06-04 16:54:15 -07:00
relay.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-04-12 14:49:50 -07:00
res_counter.c kernel/res_counter.c: replace simple_strtoull by kstrtoull 2014-06-04 16:54:15 -07:00
resource.c resource: fix the case of null pointer access 2014-08-29 16:28:15 -07:00
seccomp.c seccomp: Replace BUG(!spin_is_locked()) with assert_spin_lock 2014-08-11 13:29:12 -07:00
signal.c Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
smp.c kernel/smp.c:on_each_cpu_cond(): fix warning in fallback path 2014-08-06 18:01:22 -07:00
smpboot.c
smpboot.h
softirq.c Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu 2014-05-22 11:36:10 +02:00
stacktrace.c
stop_machine.c kernel/stop_machine.c: kernel-doc warning fix 2014-06-04 16:54:15 -07:00
sys_ni.c kexec: new syscall kexec_file_load() declaration 2014-08-08 15:57:32 -07:00
sys.c time, signal: Protect resource use statistics with seqlock 2014-09-08 08:17:01 +02:00
sysctl_binary.c ipv6: Allow accepting RA from local IP addresses. 2014-07-01 12:16:24 -07:00
sysctl.c mm, hugetlb: remove hugetlb_zero and hugetlb_infinity 2014-08-06 18:01:19 -07:00
system_certificates.S
system_keyring.c KEYS: validate certificate trust only with builtin keys 2014-07-17 09:35:17 -04:00
task_work.c
taskstats.c
test_kprobes.c kernel/test_kprobes.c: use current logging functions 2014-08-08 15:57:18 -07:00
torture.c torture: Avoid format string leak to thead name 2014-07-07 10:12:56 -07:00
tracepoint.c tracing: syscall_regfunc() should not skip kernel threads 2014-06-21 00:15:26 -04:00
tsacct.c sched: Make task->start_time nanoseconds based 2014-07-23 10:18:05 -07:00
uid16.c
up.c smp: Rename __smp_call_function_single() to smp_call_function_single_async() 2014-02-24 14:47:15 -08:00
user_namespace.c proc: constify seq_operations 2014-08-08 15:57:22 -07:00
user-return-notifier.c
user.c kernel/user.c: drop unused field 'files' from user_struct 2014-06-04 16:54:16 -07:00
utsname_sysctl.c sysctl: convert use of typedef ctl_table to struct ctl_table 2014-06-06 16:08:16 -07:00
utsname.c namespaces: Use task_lock and not rcu to protect nsproxy 2014-07-29 18:08:50 -07:00
watchdog.c panic: add TAINT_SOFTLOCKUP 2014-08-08 15:57:24 -07:00
workqueue_internal.h workqueue: rename manager_mutex to attach_mutex 2014-05-20 10:59:32 -04:00
workqueue.c Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2014-08-04 10:09:27 -07:00