linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-01 08:31:37 +00:00

Yang Tao ca16d5bee5 futex: Prevent robust futex exit race

Robust futexes utilize the robust_list mechanism to allow the kernel to release futexes which are held when a task exits. The exit can be voluntary or caused by a signal or fault. This prevents that waiters block forever.

The futex operations in user space store a pointer to the futex they are either locking or unlocking in the op_pending member of the per task robust list.

After a lock operation has succeeded the futex is queued in the robust list linked list and the op_pending pointer is cleared.

After an unlock operation has succeeded the futex is removed from the robust list linked list and the op_pending pointer is cleared.

The robust list exit code checks for the pending operation and any futex which is queued in the linked list. It carefully checks whether the futex value is the TID of the exiting task. If so, it sets the OWNER_DIED bit and tries to wake up a potential waiter.

This is race free for the lock operation but unlock has two race scenarios where waiters might not be woken up. These issues can be observed with regular robust pthread mutexes. PI aware pthread mutexes are not affected.

(1) Unlocking task is killed after unlocking the futex value in user space before being able to wake a waiter.

    pthread_mutex_unlock()
            |
            V
    atomic_exchange_rel (&mutex->__data.__lock, 0)
                    <------------------------killed
        lll_futex_wake ()                   |
                                            |
                                            |(__lock = 0)
                                            |(enter kernel)
                                            |
                                            V
                                        do_exit()
                                        exit_mm()
                                      mm_release()
                                    exit_robust_list()
                                    handle_futex_death()
                                            |
                                            |(__lock = 0)
                                            |(uval = 0)
                                            |
                                            V
    if ((uval & FUTEX_TID_MASK) != task_pid_vnr(curr))
            return 0;

The sanity check which ensures that the user space futex is owned by the exiting task prevents the wakeup of waiters which in consequence block infinitely.

(2) Waiting task is killed after a wakeup and before it can acquire the futex in user space.

        OWNER                         WAITER
                                    futex_wait()
    pthread_mutex_unlock()               |
            |                            |
            |(__lock = 0)                |
            |                            |
            V                            |
      futex_wake() ------------>      wakeup()
                                         |
                                         |(return to userspace)
                                         |(__lock = 0)
                                         |
                                         V
                         oldval = mutex->__data.__lock
                                           <-----------------killed
    atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,  |
                    id | assume_other_futex_waiters, 0)          |
                                                                 |
                                                                 |
                                                   (enter kernel)|
                                                                 |
                                                                 V
                                                             do_exit()
                                                                 |
                                                                 |
                                                                 V
                                                       handle_futex_death()
                                                                 |
                                                                 |(__lock = 0)
                                                                 |(uval = 0)
                                                                 |
                                                                 V
    if ((uval & FUTEX_TID_MASK) != task_pid_vnr(curr))
            return 0;

The sanity check which ensures that the user space futex is owned by the exiting task prevents the wakeup of waiters, which seems to be correct as the exiting task does not own the futex value, but the consequence is that other waiters wont be woken up and block infinitely.

In both scenarios the following conditions are true:

- task->robust_list->list_op_pending != NULL
- user space futex value == 0
- Regular futex (not PI)

If these conditions are met then it is reasonably safe to wake up a potential waiter in order to prevent the above problems.

As this might be a false positive it can cause spurious wakeups, but the waiter side has to handle other types of unrelated wakeups, e.g. signals gracefully anyway. So such a spurious wakeup will not affect the correctness of these operations.

This workaround must not touch the user space futex value and cannot set the OWNER_DIED bit because the lock value is 0, i.e. uncontended. Setting OWNER_DIED in this case would result in inconsistent state and subsequently in malfunction of the owner died handling in user space.

The rest of the user space state is still consistent as no other task can observe the list_op_pending entry in the exiting tasks robust list.

The eventually woken up waiter will observe the uncontended lock value and take it over.

[ tglx: Massaged changelog and comment. Made the return explicit and not depend on the subsequent check and added constants to hand into handle_futex_death() instead of plain numbers. Fixed a few coding style issues. ]

Fixes: `0771dfefc9` ("[PATCH] lightweight robust futexes: core")
Signed-off-by: Yang Tao <yang.tao172@zte.com.cn>
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1573010582-35297-1-git-send-email-wang.yi59@zte.com.cn
Link: https://lkml.kernel.org/r/20191106224555.943191378@linutronix.de

2019-11-15 19:10:49 +01:00
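
The decision described in the commit message above can be modelled in user space as a small pure function. This is a minimal sketch under stated assumptions, not the kernel's code: `enum death_action`, `futex_death_action()` and `MODEL_TID_MASK` are hypothetical names introduced here for illustration, and the mask value assumes the TID occupies the low 30 bits of the futex word. The sketch only returns which action the exit path would take; it does not touch any futex or wake any task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the TID field of the futex word (assumed low 30 bits). */
#define MODEL_TID_MASK 0x3fffffffu

/* Hypothetical result type: what the exit path decides to do with one futex. */
enum death_action {
    DEATH_IGNORE,          /* futex is not owned by the exiting task: leave it alone */
    DEATH_WAKE_ONLY,       /* pending op on an unlocked non-PI futex: wake a waiter,
                              do not modify the futex value */
    DEATH_MARK_OWNER_DIED  /* owned by the exiting task: set OWNER_DIED and try to
                              wake a potential waiter */
};

/*
 * Model of the decision order described in the commit message:
 * the special case for a pending operation is checked before the
 * ownership sanity check, so an unlocked (value 0) regular futex
 * still leads to a wakeup.
 */
static enum death_action futex_death_action(uint32_t uval, uint32_t tid,
                                            bool pi, bool pending_op)
{
    /* A task ID must be non-zero and fit into the TID field. */
    assert(tid != 0 && (tid & ~MODEL_TID_MASK) == 0);

    /* The three conditions listed in the commit message. */
    if (pending_op && !pi && uval == 0)
        return DEATH_WAKE_ONLY;

    /* Ownership sanity check: value does not carry the exiting task's TID. */
    if ((uval & MODEL_TID_MASK) != tid)
        return DEATH_IGNORE;

    return DEATH_MARK_OWNER_DIED;
}
```

For example, calling `futex_death_action(0, 1234, false, true)` models both race scenarios (futex value 0, pending operation, regular futex) and yields `DEATH_WAKE_ONLY`, whereas the same call with `pending_op` set to `false` falls through to the ownership check and yields `DEATH_IGNORE`.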
..
bpf	locking/lockdep: Remove unused @nested argument from lock_release()	2019-10-09 12:46:10 +02:00
cgroup	Merge branch 'for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup	2019-09-17 15:57:22 -07:00
configs
debug	kgdb: don't use a notifier to enter kgdb at panic; call directly	2019-09-25 17:51:40 -07:00
dma	dma-mapping: fix false positivse warnings in dma_common_free_remap()	2019-10-05 10:24:17 +02:00
events	perf_event_open: switch to copy_struct_from_user()	2019-10-01 15:45:22 +02:00
gcov	um: Enable CONFIG_CONSTRUCTORS	2019-09-15 21:37:13 +02:00
irq	Power management updates for 5.4-rc1	2019-09-17 19:15:14 -07:00
livepatch	livepatch: Nullify obj->mod in klp_module_coming()'s error path	2019-08-19 13:03:37 +02:00
locking	locking/lockdep: Update the comment for __lock_release()	2019-11-13 11:07:48 +01:00
power	Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security	2019-09-28 08:14:15 -07:00
printk	locking/lockdep: Remove unused @nested argument from lock_release()	2019-10-09 12:46:10 +02:00
rcu	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2019-09-16 17:25:49 -07:00
sched	locking/lockdep: Remove unused @nested argument from lock_release()	2019-10-09 12:46:10 +02:00
time	tick: broadcast-hrtimer: Fix a race in bc_set_next	2019-09-27 14:45:55 +02:00
trace	A few more tracing fixes:	2019-09-30 09:29:53 -07:00
.gitignore	Provide in-kernel headers to make extending kernel easier	2019-04-29 16:48:03 +02:00
acct.c	acct_on(): don't mess with freeze protection	2019-04-04 21:04:13 -04:00
async.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441	2019-06-05 17:37:17 +02:00
audit_fsnotify.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157	2019-05-30 11:26:37 -07:00
audit_tree.c	fsnotify: switch send_to_group() and ->handle_event to const struct qstr *	2019-04-26 13:51:03 -04:00
audit_watch.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156	2019-05-30 11:26:35 -07:00
audit.c	audit/stable-5.3 PR 20190702	2019-07-08 18:55:42 -07:00
audit.h	audit/stable-5.3 PR 20190702	2019-07-08 18:55:42 -07:00
auditfilter.c	audit/stable-5.3 PR 20190702	2019-07-08 18:55:42 -07:00
auditsc.c	audit: enforce op for string fields	2019-05-28 17:46:43 -04:00
backtracetest.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441	2019-06-05 17:37:17 +02:00
bounds.c
capability.c	LSM: add SafeSetID module that gates setid calls	2019-01-25 11:22:43 -08:00
compat.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500	2019-06-19 17:09:55 +02:00
configs.c	kernel/configs: Replace GPL boilerplate code with SPDX identifier	2019-07-30 18:34:15 +02:00
context_tracking.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
cpu_pm.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282	2019-06-05 17:36:37 +02:00
cpu.c	locking/lockdep: Remove unused @nested argument from lock_release()	2019-10-09 12:46:10 +02:00
crash_core.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 230	2019-06-19 17:09:06 +02:00
crash_dump.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
cred.c	Merge branch 'access-creds'	2019-07-25 08:36:29 -07:00
delayacct.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 25	2019-05-21 11:52:39 +02:00
dma.c
elfcore.c	kernel/elfcore.c: include proper prototypes	2019-09-25 17:51:39 -07:00
exec_domain.c
exit.c	tasks, sched/core: With a grace period after finish_task_switch(), remove unnecessary code	2019-09-25 17:42:29 +02:00
extable.c	extable: Add function to search only kernel exception table	2019-08-21 22:23:48 +10:00
fail_function.c	fail_function: no need to check return value of debugfs_create functions	2019-06-03 15:49:06 +02:00
fork.c	kernel/sysctl.c: do not override max_threads provided by userspace	2019-10-07 15:47:19 -07:00
freezer.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
futex.c	futex: Prevent robust futex exit race	2019-11-15 19:10:49 +01:00
gen_kheaders.sh	kheaders: make headers archive reproducible	2019-10-05 15:29:49 +09:00
groups.c
hung_task.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
iomem.c	mm/nvdimm: add is_ioremap_addr and use that to check ioremap address	2019-07-12 11:05:40 -07:00
irq_work.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
jump_label.c	jump_label: Don't warn on __exit jump entries	2019-08-29 15:10:10 +01:00
kallsyms.c	kallsyms: Don't let kallsyms_lookup_size_offset() fail on retrieving the first symbol	2019-08-27 16:19:56 +01:00
kcmp.c
Kconfig.freezer	treewide: Add SPDX license identifier - Makefile/Kconfig	2019-05-21 10:50:46 +02:00
Kconfig.hz	treewide: Add SPDX license identifier - Makefile/Kconfig	2019-05-21 10:50:46 +02:00
Kconfig.locks	treewide: Add SPDX license identifier - Makefile/Kconfig	2019-05-21 10:50:46 +02:00
Kconfig.preempt	sched/rt, Kconfig: Unbreak def/oldconfig with CONFIG_PREEMPT=y	2019-07-22 18:05:11 +02:00
kcov.c	kcov: convert kcov.refcount to refcount_t	2019-03-07 18:32:02 -08:00
kexec_core.c	kexec: bail out upon SIGKILL when allocating memory.	2019-09-25 17:51:40 -07:00
kexec_elf.c	kexec_elf: support 32 bit ELF files	2019-09-06 23:58:44 +02:00
kexec_file.c	Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security	2019-09-28 08:14:15 -07:00
kexec_internal.h
kexec.c	kexec_load: Disable at runtime if the kernel is locked down	2019-08-19 21:54:15 -07:00
kheaders.c	kheaders: Move from proc to sysfs	2019-05-24 20:16:01 +02:00
kmod.c
kprobes.c	Tracing updates:	2019-09-20 11:19:48 -07:00
ksysfs.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 170	2019-05-30 11:26:39 -07:00
kthread.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
latencytop.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441	2019-06-05 17:37:17 +02:00
Makefile	Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity	2019-09-27 19:37:27 -07:00
module_signature.c	MODSIGN: Export module signature definitions	2019-08-05 18:39:56 -04:00
module_signing.c	MODSIGN: Export module signature definitions	2019-08-05 18:39:56 -04:00
module-internal.h	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36	2019-05-24 17:27:11 +02:00
module.c	Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security	2019-09-28 08:14:15 -07:00
notifier.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
nsproxy.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441	2019-06-05 17:37:17 +02:00
padata.c	padata: remove cpu_index from the parallel_queue	2019-09-13 21:15:41 +10:00
panic.c	panic: ensure preemption is disabled during panic()	2019-10-07 15:47:19 -07:00
params.c	lockdown: Lock down module params that specify hardware parameters (eg. ioport)	2019-08-19 21:54:16 -07:00
pid_namespace.c	proc/sysctl: add shared variables for range check	2019-07-18 17:08:07 -07:00
pid.c	kernel/pid.c: convert struct pid count to refcount_t	2019-07-16 19:23:24 -07:00
profile.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
ptrace.c	ptrace: add PTRACE_GET_SYSCALL_INFO request	2019-07-16 19:23:24 -07:00
range.c
reboot.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
relay.c	Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2019-03-12 13:27:20 -07:00
resource.c	mm/memory_hotplug.c: use PFN_UP / PFN_DOWN in walk_system_ram_range()	2019-09-24 15:54:09 -07:00
rseq.c	signal: Remove task parameter from force_sig	2019-05-27 09:36:28 -05:00
seccomp.c	signal: Remove the signal number and task parameters from force_sig_info	2019-05-29 09:31:44 -05:00
signal.c	core-process-v5.4	2019-09-16 09:28:19 -07:00
smp.c	smp: Warn on function calls from softirq context	2019-07-20 11:27:16 +02:00
smpboot.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
smpboot.h
softirq.c	Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2019-07-08 11:01:13 -07:00
stackleak.c
stacktrace.c	stacktrace: Constify 'entries' arguments	2019-07-25 15:43:26 +02:00
stop_machine.c	stop_machine: Fix stop_cpus_in_progress ordering	2019-08-08 09:09:30 +02:00
sys_ni.c	arch: handle arches who do not yet define clone3	2019-06-21 01:54:53 +02:00
sys.c	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2019-09-17 12:35:15 -07:00
sysctl_binary.c	kernel/sysctl: add panic_print into sysctl	2019-01-04 13:13:47 -08:00
sysctl.c	arm64, mm: move generic mmap layout functions to mm	2019-09-24 15:54:11 -07:00
task_work.c
taskstats.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157	2019-05-30 11:26:37 -07:00
test_kprobes.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 25	2019-05-21 11:52:39 +02:00
torture.c	torture: Remove exporting of internal functions	2019-08-01 14:30:22 -07:00
tracepoint.c	The main changes in this release include:	2019-07-18 11:51:00 -07:00
tsacct.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157	2019-05-30 11:26:37 -07:00
ucount.c	proc/sysctl: add shared variables for range check	2019-07-18 17:08:07 -07:00
uid16.c
uid16.h
umh.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
up.c	smp: Remove smp_call_function() and on_each_cpu() return values	2019-06-23 14:26:26 +02:00
user_namespace.c	Keyrings namespacing	2019-07-08 19:36:47 -07:00
user-return-notifier.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
user.c	Keyrings namespacing	2019-07-08 19:36:47 -07:00
utsname_sysctl.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441	2019-06-05 17:37:17 +02:00
utsname.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441	2019-06-05 17:37:17 +02:00
watchdog_hld.c	kernel/watchdog_hld.c: hard lockup message should end with a newline	2019-04-19 09:46:05 -07:00
watchdog.c	watchdog: Mark watchdog_hrtimer to expire in hard interrupt context	2019-08-01 20:51:20 +02:00
workqueue_internal.h	sched/core, workqueues: Distangle worker accounting from rq lock	2019-04-16 16:55:15 +02:00
workqueue.c	workqueue: require CPU hotplug read exclusion for apply_workqueue_attrs	2019-09-13 21:15:40 +10:00