linux/fs
Thomas Gleixner f2530dc71c kthread: Prevent unpark race which puts threads on the wrong cpu
The smpboot threads rely on the park/unpark mechanism which binds per
cpu threads on a particular core. Though the functionality is racy:

CPU0	       	 	CPU1  	     	    CPU2
unpark(T)				    wake_up_process(T)
  clear(SHOULD_PARK)	T runs
			leave parkme() due to !SHOULD_PARK  
  bind_to(CPU2)		BUG_ON(wrong CPU)						    

We cannot let the tasks move themself to the target CPU as one of
those tasks is actually the migration thread itself, which requires
that it starts running on the target cpu right away.

The solution to this problem is to prevent wakeups in park mode which
are not from unpark(). That way we can guarantee that the association
of the task to the target cpu is working correctly.

Add a new task state (TASK_PARKED) which prevents other wakeups and
use this state explicitly for the unpark wakeup.

Peter noticed: Also, since the task state is visible to userspace and
all the parked tasks are still in the PID space, its a good hint in ps
and friends that these tasks aren't really there for the moment.

The migration thread has another related issue.

CPU0	      	     	 CPU1
Bring up CPU2
create_thread(T)
park(T)
 wait_for_completion()
			 parkme()
			 complete()
sched_set_stop_task()
			 schedule(TASK_PARKED)

The sched_set_stop_task() call is issued while the task is on the
runqueue of CPU1 and that confuses the hell out of the stop_task class
on that cpu. So we need the same synchronizaion before
sched_set_stop_task().

Reported-by: Dave Jones <davej@redhat.com>
Reported-and-tested-by: Dave Hansen <dave@sr71.net>
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Acked-by: Peter Ziljstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: dhillf@gmail.com
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304091635430.21884@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-12 14:18:43 +02:00
..
9p fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
adfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
affs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
afs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
autofs4 fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
befs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
bfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2013-03-29 11:13:25 -07:00
cachefiles FS-Cache: Mark cancellation of in-progress operation 2012-12-20 22:34:00 +00:00
ceph fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
cifs Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 2013-03-21 17:59:22 -07:00
coda fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
configfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
cramfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
debugfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
devpts fs: Limit sys_mount to only request filesystem modules (Part 2). 2013-03-07 01:08:55 -08:00
dlm hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ecryptfs ecryptfs: close rmmod race 2013-04-09 14:08:16 -04:00
efs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
exofs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
exportfs hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ext2 ext2: Fix BUG_ON in evict() on inode deletion 2013-03-13 15:23:44 +01:00
ext3 ext3: Fix format string issues 2013-03-11 22:05:56 +01:00
ext4 ext4: fix big-endian bugs which could cause fs corruptions 2013-04-03 12:37:17 -04:00
f2fs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
fat fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
freevxfs fs: Readd the fs module aliases. 2013-03-12 18:55:21 -07:00
fscache hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
fuse fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
gfs2 GFS2: Issue discards in 512b sectors 2013-04-05 17:55:13 +01:00
hfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
hfsplus fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
hostfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2013-03-13 15:47:50 -07:00
hpfs fs: Limit sys_mount to only request filesystem modules. (Part 3) 2013-03-11 07:09:48 -07:00
hppfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
hugetlbfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
isofs fs: Readd the fs module aliases. 2013-03-12 18:55:21 -07:00
jbd jbd: don't wake kjournald unnecessarily 2013-01-14 22:50:45 +01:00
jbd2 jbd2: fix use after free in jbd2_journal_dirty_metadata() 2013-03-11 13:24:56 -04:00
jffs2 fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
jfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
lockd Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux 2013-02-28 18:02:55 -08:00
logfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
minix fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
ncpfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
nfs NFS client bugfixes for Linux 3.9 2013-04-10 10:26:49 -07:00
nfs_common nfs_common: Update the translation between nfsv3 acls linux posix acls 2013-02-13 06:15:14 -08:00
nfsd nfsd4: reject "negative" acl lengths 2013-03-26 16:18:27 -04:00
nilfs2 fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
nls
notify hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ntfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
ocfs2 fs: Limit sys_mount to only request filesystem modules (Part 2). 2013-03-07 01:08:55 -08:00
omfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
openpromfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
proc kthread: Prevent unpark race which puts threads on the wrong cpu 2013-04-12 14:18:43 +02:00
pstore A few fixes to reduce places where pstore might hang 2013-02-21 09:38:18 -08:00
qnx4 fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
qnx6 fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
quota quota: add missing use of dq_data_lock in __dquot_initialize 2013-03-11 22:05:56 +01:00
ramfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
reiserfs reiserfs: Fix warning and inode leak when deleting inode with xattrs 2013-03-29 17:08:43 +01:00
romfs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
squashfs fs: Limit sys_mount to only request filesystem modules. (Part 3) 2013-03-11 07:09:48 -07:00
sysfs sysfs fixes for 3.9-rc4 2013-03-28 15:52:14 -07:00
sysv fs: Readd the fs module aliases. 2013-03-12 18:55:21 -07:00
ubifs UBIFS: make space fixup work in the remount case 2013-03-14 11:20:22 +02:00
udf fs: Limit sys_mount to only request filesystem modules. (Part 3) 2013-03-11 07:09:48 -07:00
ufs fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
xfs - Fix for a potential infinite loop which was introduced in 4d559a3bcb 2013-03-19 15:17:40 -07:00
aio.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
anon_inodes.c get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero 2013-02-26 02:46:11 -05:00
attr.c userns: Allow chown and setgid preservation 2012-11-20 04:17:24 -08:00
bad_inode.c lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
binfmt_aout.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
binfmt_elf_fdpic.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
binfmt_elf.c ImgTec Meta architecture changes for v3.9-rc1 2013-03-03 12:06:09 -08:00
binfmt_em86.c exec: use -ELOOP for max recursion depth 2012-12-17 17:15:23 -08:00
binfmt_flat.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
binfmt_misc.c fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
binfmt_script.c exec: do not leave bprm->interp on stack 2012-12-20 17:40:19 -08:00
binfmt_som.c get rid of pt_regs argument of ->load_binary() 2012-11-28 21:53:38 -05:00
bio-integrity.c
bio.c block: add missing block_bio_complete() tracepoint 2013-01-14 15:00:36 +01:00
block_dev.c loop: prevent bdev freeing while device in use 2013-04-01 15:48:47 -07:00
buffer.c Merge branch 'for-3.9/core' of git://git.kernel.dk/linux-block 2013-02-28 12:52:24 -08:00
char_dev.c char_dev: pin parent kobject 2012-10-22 08:50:37 +03:00
compat_binfmt_elf.c coredump: extend core dump note section to contain file names of mapped files 2012-10-06 03:05:17 +09:00
compat_ioctl.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
compat.c Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys 2013-03-12 11:05:45 -07:00
coredump.c coredump: remove redundant defines for dumpable states 2013-02-27 19:10:11 -08:00
coredump.h coredump: update coredump-related headers 2012-10-06 03:05:15 +09:00
dcache.c Nest rename_lock inside vfsmount_lock 2013-03-26 18:25:57 -04:00
dcookies.c
direct-io.c fs: Fix possible use-after-free with AIO 2013-02-22 23:31:36 -05:00
drop_caches.c
eventfd.c fs, eventfd: add procfs fdinfo helper 2012-12-17 17:15:27 -08:00
eventpoll.c epoll: prevent missed events on EPOLL_CTL_MOD 2013-01-02 09:16:43 -08:00
exec.c coredump: remove redundant defines for dumpable states 2013-02-27 19:10:11 -08:00
fcntl.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
fhandle.c Merge branch 'for-3.8' of git://linux-nfs.org/~bfields/linux 2012-12-20 14:04:11 -08:00
fifo.c
file_table.c cache the value of file_inode() in struct file 2013-03-01 19:48:30 -05:00
file.c locking: Various static lock initializer fixes 2013-02-19 08:42:45 +01:00
filesystems.c fs: Limit sys_mount to only request filesystem modules. 2013-03-03 19:36:31 -08:00
fs_struct.c constify path_get/path_put and fs_struct.c stuff 2013-03-01 23:51:07 -05:00
fs-writeback.c 2 writeback fixes 2013-02-28 13:21:44 -08:00
generic_acl.c
inode.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
internal.h Don't bother with redoing rw_verify_area() from default_file_splice_from() 2013-03-21 13:11:11 -04:00
ioctl.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
ioprio.c
Kconfig fuse: Move CUSE Kconfig entry from fs/Kconfig into fs/fuse/Kconfig 2013-01-17 13:08:45 +01:00
Kconfig.binfmt coredump: make core dump functionality optional 2012-10-06 03:05:15 +09:00
libfs.c vfs: drop vmtruncate 2012-12-20 18:46:29 -05:00
locks.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
Makefile f2fs: update Kconfig and Makefile 2012-12-11 13:43:42 +09:00
mbcache.c
mount.h proc: Usable inode numbers for the namespace file descriptors. 2012-11-20 04:19:49 -08:00
mpage.c
namei.c vfs: don't BUG_ON() if following a /proc fd pseudo-symlink results in a symlink 2013-03-08 09:03:07 -08:00
namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-04-09 12:22:49 -07:00
no-block.c
open.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-03-03 13:23:03 -08:00
pipe.c vfs: fix pipe counter breakage 2013-03-12 08:29:17 -07:00
pnode.c vfs: Carefully propogate mounts across user namespaces 2013-03-27 07:50:05 -07:00
pnode.h vfs: Carefully propogate mounts across user namespaces 2013-03-27 07:50:05 -07:00
posix_acl.c
proc_namespace.c
read_write.c vfs/splice: Fix missed checks in new __kernel_write() helper 2013-03-27 09:24:02 -07:00
read_write.h compat: fs: Generic compat_sys_sendfile implementation 2012-10-02 21:35:55 -04:00
readdir.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
select.c sched/rt: Move rt specific bits into new header file 2013-02-07 20:51:08 +01:00
seq_file.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-03-03 13:23:03 -08:00
signalfd.c fs, epoll: add procfs fdinfo helper 2012-12-17 17:15:27 -08:00
splice.c Don't bother with redoing rw_verify_area() from default_file_splice_from() 2013-03-21 13:11:11 -04:00
stack.c
stat.c switch vfs_getattr() to struct path 2013-02-26 02:46:08 -05:00
statfs.c vfs: fix user_statfs to retry once on ESTALE errors 2012-12-20 18:50:07 -05:00
super.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
sync.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
timerfd.c compat: restore timerfd settime and gettime compat syscalls 2013-03-02 09:35:13 -05:00
utimes.c vfs: allow utimensat() calls to retry once on an ESTALE error 2012-12-20 18:50:08 -05:00
xattr_acl.c userns: Fix posix_acl_file_xattr_userns gid conversion 2012-10-12 13:16:48 -07:00
xattr.c vfs: make lremovexattr retry once on ESTALE error 2012-12-20 18:50:11 -05:00