linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-01 01:31:44 +00:00

Author	SHA1	Message	Date
Linus Torvalds	6432ed648a	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Revert `738d2be`, simplify set_task_cpu()	2009-12-23 09:12:57 -08:00
Peter Zijlstra	0c69774e6c	sched: Revert `738d2be`, simplify set_task_cpu() Effectively reverts `738d2be430`. As demonstrated by Eric, we really need to call __set_task_cpu() early in the fork() path to properly initialize the various task state -- specifically the cgroup state through set_task_rq(). [ we could probably fix this by explicitly calling __set_task_cpu() from sched_fork(), but lets try that for the next cycle and simply revert to the old behaviour for now. ] Reported-by: Eric Paris <eparis@redhat.com> Tested-by: Eric Paris <eparis@redhat.com>, Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: efault@gmx.de LKML-Reference: <1261492999.4937.36.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-23 10:04:10 +01:00
Linus Torvalds	fe35d4a028	Merge git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: jfs: Fix 32bit build warning Remove obsolete comment in fs.h Sanitize f_flags helpers Fix f_flags/f_mode in case of lookup_instantiate_filp() from open(pathname, 3) anonfd: Allow making anon files read-only fs/compat_ioctl.c: fix build error when !BLOCK pohmelfs needs I_LOCK alloc_file(): simplify handling of mnt_clone_write() errors	2009-12-22 14:20:48 -08:00
Stefani Seibold	86d4880313	kfifo: add record handling functions Add kfifo_in_rec() - puts some record data into the FIFO Add kfifo_out_rec() - gets some record data from the FIFO Add kfifo_from_user_rec() - puts some data from user space into the FIFO Add kfifo_to_user_rec() - gets data from the FIFO and write it to user space Add kfifo_peek_rec() - gets the size of the next FIFO record field Add kfifo_skip_rec() - skip the next fifo out record Add kfifo_avail_rec() - determinate the number of bytes available in a record FIFO Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-22 14:17:56 -08:00
Stefani Seibold	a121f24acc	kfifo: add kfifo_skip, kfifo_from_user and kfifo_to_user Add kfifo_reset_out() for save lockless discard the fifo output Add kfifo_skip() to skip a number of output bytes Add kfifo_from_user() to copy user space data into the fifo Add kfifo_to_user() to copy fifo data to user space Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-22 14:17:56 -08:00
Stefani Seibold	7acd72eb85	kfifo: rename kfifo_put... into kfifo_in... and kfifo_get... into kfifo_out... rename kfifo_put... into kfifo_in... to prevent miss use of old non in kernel-tree drivers ditto for kfifo_get... -> kfifo_out... Improve the prototypes of kfifo_in and kfifo_out to make the kerneldoc annotations more readable. Add mini "howto porting to the new API" in kfifo.h Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-22 14:17:56 -08:00
Stefani Seibold	e64c026dd0	kfifo: cleanup namespace change name of __kfifo_* functions to kfifo_*, because the prefix __kfifo should be reserved for internal functions only. Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-22 14:17:56 -08:00
Stefani Seibold	c1e13f2567	kfifo: move out spinlock Move the pointer to the spinlock out of struct kfifo. Most users in tree do not actually use a spinlock, so the few exceptions now have to call kfifo_{get,put}_locked, which takes an extra argument to a spinlock. Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-22 14:17:56 -08:00
Stefani Seibold	4546548789	kfifo: move struct kfifo in place This is a new generic kernel FIFO implementation. The current kernel fifo API is not very widely used, because it has to many constrains. Only 17 files in the current 2.6.31-rc5 used it. FIFO's are like list's a very basic thing and a kfifo API which handles the most use case would save a lot of development time and memory resources. I think this are the reasons why kfifo is not in use: - The API is to simple, important functions are missing - A fifo can be only allocated dynamically - There is a requirement of a spinlock whether you need it or not - There is no support for data records inside a fifo So I decided to extend the kfifo in a more generic way without blowing up the API to much. The new API has the following benefits: - Generic usage: For kernel internal use and/or device driver. - Provide an API for the most use case. - Slim API: The whole API provides 25 functions. - Linux style habit. - DECLARE_KFIFO, DEFINE_KFIFO and INIT_KFIFO Macros - Direct copy_to_user from the fifo and copy_from_user into the fifo. - The kfifo itself is an in place member of the using data structure, this save an indirection access and does not waste the kernel allocator. - Lockless access: if only one reader and one writer is active on the fifo, which is the common use case, no additional locking is necessary. - Remove spinlock - give the user the freedom of choice what kind of locking to use if one is required. - Ability to handle records. Three type of records are supported: - Variable length records between 0-255 bytes, with a record size field of 1 bytes. - Variable length records between 0-65535 bytes, with a record size field of 2 bytes. - Fixed size records, which no record size field. - Preserve memory resource. - Performance! - Easy to use! This patch: Since most users want to have the kfifo as part of another object, reorganize the code to allow including struct kfifo in another data structure. This requires changing the kfifo_alloc and kfifo_init prototypes so that we pass an existing kfifo pointer into them. This patch changes the implementation and all existing users. [akpm@linux-foundation.org: fix warning] Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-22 14:17:55 -08:00
Linus Torvalds	83f57a11d8	Revert "time: Remove xtime_cache" This reverts commit `7bc7d63745`, as requested by John Stultz. Quoting John: "Petr Titěra reported an issue where he saw odd atime regressions with 2.6.33 where there were a full second worth of nanoseconds in the nanoseconds field. He also reviewed the time code and narrowed down the problem: unhandled overflow of the nanosecond field caused by rounding up the sub-nanosecond accumulated time. Details: * At the end of update_wall_time(), we currently round up the sub-nanosecond portion of accumulated time when storing it into xtime. This was added to avoid time inconsistencies caused when the sub-nanosecond portion was truncated when storing into xtime. Unfortunately we don't handle the possible second overflow caused by that rounding. * Previously the xtime_cache code hid this overflow by normalizing the xtime value when storing into the xtime_cache. * We could try to handle the second overflow after the rounding up, but since this affects the timekeeping's internal state, this would further complicate the next accumulation cycle, causing small errors in ntp steering. As much as I'd like to get rid of it, the xtime_cache code is known to work. * The correct fix is really to include the sub-nanosecond portion in the timekeeping accessor function, so we don't need to round up at during accumulation. This would greatly simplify the accumulation code. Unfortunately, we can't do this safely until the last three non-GENERIC_TIME arches (sparc32, arm, cris) are converted (those patches are in -mm) and we kill off the spots where arches set xtime directly. This is all 2.6.34 material, so I think reverting the xtime_cache change is the best approach for now. Many thanks to Petr for both reporting and finding the issue!" Reported-by: Petr Titěra <P.Titera@century.cz> Requested-by: john stultz <johnstul@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-22 14:10:37 -08:00
Al Viro	5300990c03	Sanitize f_flags helpers * pull ACC_MODE to fs.h; we have several copies all over the place * nightmarish expression calculating f_mode by f_flags deserves a helper too (OPEN_FMODE(flags)) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:34 -05:00
Roland Dreier	628ff7c1d8	anonfd: Allow making anon files read-only It seems a couple places such as arch/ia64/kernel/perfmon.c and drivers/infiniband/core/uverbs_main.c could use anon_inode_getfile() instead of a private pseudo-fs + alloc_file(), if only there were a way to get a read-only file. So provide this by having anon_inode_getfile() create a read-only file if we pass O_RDONLY in flags. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:34 -05:00
Dominik Brodowski	0e2c8b8f55	resources: fix call to alignf() in allocate_resource() The second parameter to alignf() in allocate_resource() must reflect what new resource is attempted to be allocated, else functions like pcibios_align_resource() (at least on x86) or pcmcia_align() can't work correctly. Commit `1e5ad96790` broke this by setting the "new" resource until we're about to return success. To keep the resource untouched when allocate_resource() fails, a "tmp" resource is introduced. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-21 10:42:29 -08:00
Peter Zijlstra	70f1120527	sched: Fix hotplug hang The hot-unplug kstopmachine usage does a wakeup after deactivating the cpu, hence we cannot use cpu_active() here but must rely on the good olde online. Reported-by: Sachin Sant <sachinp@in.ibm.com> Reported-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Jens Axboe <jens.axboe@oracle.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> LKML-Reference: <1261326987.4314.24.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-20 23:31:23 +01:00
Peter Zijlstra	3df0fc5b2e	sched: Restore printk sanity Revert the braindead pr_* crap. (Commit `663997d` "sched: Use pr_fmt() and pr_<level>()") It's dumb and causes stupid "sched: " strings all over the place. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Mike Galbraith <efault@gmx.de> Cc: Joe Perches <joe@perches.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1261315437.4314.6.camel@laptop> [ i dont mind the pr_*() patterns that much - but Peter dislikes them with a vengence. ] [ - v2: remove spurious diffstat from changelog :-/ ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-20 19:05:02 +01:00
Linus Torvalds	eca9dfcd00	Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf session: Make events_stats u64 to avoid overflow on 32-bit arches hw-breakpoints: Fix hardware breakpoints -> perf events dependency perf events: Dont report side-band events on each cpu for per-task-per-cpu events perf events, x86/stacktrace: Fix performance/softlockup by providing a special frame pointer-only stack walker perf events, x86/stacktrace: Make stack walking optional perf events: Remove unused perf_counter.h header file perf probe: Check new event name kprobe-tracer: Check new event/group name perf probe: Check whether debugfs path is correct perf probe: Fix libdwarf include path for Debian	2009-12-19 09:48:42 -08:00
Linus Torvalds	aac3d39693	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) sched: Fix broken assertion sched: Assert task state bits at build time sched: Update task_state_arraypwith new states sched: Add missing state chars to TASK_STATE_TO_CHAR_STR sched: Move TASK_STATE_TO_CHAR_STR near the TASK_state bits sched: Teach might_sleep() about preemptible RCU sched: Make warning less noisy sched: Simplify set_task_cpu() sched: Remove the cfs_rq dependency from set_task_cpu() sched: Add pre and post wakeup hooks sched: Move kthread_bind() back to kthread.c sched: Fix select_task_rq() vs hotplug issues sched: Fix sched_exec() balancing sched: Ensure set_task_cpu() is never called on blocked tasks sched: Use TASK_WAKING for fork wakups sched: Select_task_rq_fair() must honour SD_LOAD_BALANCE sched: Fix task_hot() test order sched: Fix set_cpu_active() in cpu_down() sched: Mark boot-cpu active before smp_init() sched: Fix cpu_clock() in NMIs, on !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK ...	2009-12-19 09:47:49 -08:00
Linus Torvalds	10e5453ffa	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sys: Fix missing rcu protection for __task_cred() access signals: Fix more rcu assumptions signal: Fix racy access to __task_cred in kill_pid_info_as_uid()	2009-12-19 09:47:34 -08:00
Linus Torvalds	3cd312c3e8	Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timers: Remove duplicate setting of new_base in __mod_timer() clockevents: Prevent clockevent_devices list corruption on cpu hotplug	2009-12-19 09:47:18 -08:00
Al Viro	b4c30aad39	fix more leaks in audit_tree.c tag_chunk() Several leaks in audit_tree didn't get caught by commit `318b6d3d7d`, including the leak on normal exit in case of multiple rules refering to the same chunk. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-19 09:27:43 -08:00
Al Viro	6f5d511489	fix braindamage in audit_tree.c untag_chunk() ... aka "Al had badly fscked up when writing that thing and nobody noticed until Eric had fixed leaks that used to mask the breakage". The function essentially creates a copy of old array sans one element and replaces the references to elements of original (they are on cyclic lists) with those to corresponding elements of new one. After that the old one is fair game for freeing. First of all, there's a dumb braino: when we get to list_replace_init we use indices for wrong arrays - position in new one with the old array and vice versa. Another bug is more subtle - termination condition is wrong if the element to be excluded happens to be the last one. We shouldn't go until we fill the new array, we should go until we'd finished the old one. Otherwise the element we are trying to kill will remain on the cyclic lists... That crap used to be masked by several leaks, so it was not quite trivial to hit. Eric had fixed some of those leaks a while ago and the shit had hit the fan... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-19 09:27:43 -08:00
Linus Torvalds	55db493b65	Merge branch 'cpumask-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * 'cpumask-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: cpumask: rename tsk_cpumask to tsk_cpus_allowed cpumask: don't recommend set_cpus_allowed hack in Documentation/cpu-hotplug.txt cpumask: avoid dereferencing struct cpumask cpumask: convert drivers/idle/i7300_idle.c to cpumask_var_t cpumask: use modern cpumask style in drivers/scsi/fcoe/fcoe.c cpumask: avoid deprecated function in mm/slab.c cpumask: use cpu_online in kernel/perf_event.c	2009-12-17 17:00:20 -08:00
Linus Torvalds	efc8e7f4c8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: Keys: KEYCTL_SESSION_TO_PARENT needs TIF_NOTIFY_RESUME architecture support NOMMU: Optimise away the {dac_,}mmap_min_addr tests security/min_addr.c: make init_mmap_min_addr() static keys: PTR_ERR return of wrong pointer in keyctl_get_security()	2009-12-17 16:58:26 -08:00
Linus Torvalds	dcc7cd0112	Merge branch 'kmemleak' of git://linux-arm.org/linux-2.6 * 'kmemleak' of git://linux-arm.org/linux-2.6: kmemleak: fix kconfig for crc32 build error kmemleak: Reduce the false positives by checking for modified objects kmemleak: Show the age of an unreferenced object kmemleak: Release the object lock before calling put_object() kmemleak: Scan the _ftrace_events section in modules kmemleak: Simplify the kmemleak_scan_area() function prototype kmemleak: Do not use off-slab management with SLAB_NOLEAKTRACE	2009-12-17 16:00:19 -08:00
Randy Dunlap	6485536bcf	printk: fix new kernel-doc warnings Fix kernel-doc warnings in printk.c: Warning(kernel/printk.c:1422): No description found for parameter 'dumper' Warning(kernel/printk.c:1422): Excess function parameter 'dump' description in 'kmsg_dump_register' Warning(kernel/printk.c:1451): No description found for parameter 'dumper' Warning(kernel/printk.c:1451): Excess function parameter 'dump' description in 'kmsg_dump_unregister' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-17 15:45:32 -08:00
Oleg Nesterov	9cd80bbb07	do_wait() optimization: do not place sub-threads on task_struct->children list Thanks to Roland who pointed out de_thread() issues. Currently we add sub-threads to ->real_parent->children list. This buys nothing but slows down do_wait(). With this patch ->children contains only main threads (group leaders). The only complication is that forget_original_parent() should iterate over sub-threads by hand, and de_thread() needs another list_replace() when it changes ->group_leader. Henceforth do_wait_thread() can never see task_detached() && !EXIT_DEAD tasks, we can remove this check (and we can unify do_wait_thread() and ptrace_do_wait()). This change can confuse the optimistic search in mm_update_next_owner(), but this is fixable and minor. Perhaps badness() and oom_kill_process() should be updated, but they should be fixed in any case. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ratan Nalumasu <rnalumasu@gmail.com> Cc: Vitaly Mayatskikh <vmayatsk@redhat.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-17 15:45:31 -08:00
WANG Cong	3e26120cc7	kernel/sysctl.c: fix the incomplete part of sysctl_max_map_count-should-be-non-negative.patch It is a mistake that we used 'proc_dointvec', it should be 'proc_dointvec_minmax', as in the original patch. Signed-off-by: WANG Cong <amwang@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-17 15:45:30 -08:00
Linus Torvalds	5a865c0606	Merge branch 'for-33' of git://repo.or.cz/linux-kbuild * 'for-33' of git://repo.or.cz/linux-kbuild: (29 commits) net: fix for utsrelease.h moving to generated gen_init_cpio: fixed fwrite warning kbuild: fix make clean after mismerge kbuild: generate modules.builtin genksyms: properly consider EXPORT_UNUSED_SYMBOL{,_GPL}() score: add asm/asm-offsets.h wrapper unifdef: update to upstream revision 1.190 kbuild: specify absolute paths for cscope kbuild: create include/generated in silentoldconfig scripts/package: deb-pkg: use fakeroot if available scripts/package: add KBUILD_PKG_ROOTCMD variable scripts/package: tar-pkg: use tar --owner=root Kbuild: clean up marker net: add net_tstamp.h to headers_install kbuild: move utsrelease.h to include/generated kbuild: move autoconf.h to include/generated drop explicit include of autoconf.h kbuild: move compile.h to include/generated kbuild: drop include/asm kbuild: do not check for include/asm-$ARCH ... Fixed non-conflicting clean merge of modpost.c as per comments from Stephen Rothwell (modpost.c had grown an include of linux/autoconf.h that needed to be changed to generated/autoconf.h)	2009-12-17 07:23:42 -08:00
Peter Zijlstra	077614ee1e	sched: Fix broken assertion There's a preemption race in the set_task_cpu() debug check in that when we get preempted after setting task->state we'd still be on the rq proper, but fail the test. Check for preempted tasks, since those are always on the RQ. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20091217121830.137155561@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-17 13:22:46 +01:00
Peter Zijlstra	5d27c23df0	perf events: Dont report side-band events on each cpu for per-task-per-cpu events Acme noticed that his FORK/MMAP numbers were inflated by about the same factor as his cpu-count. This led to the discovery of a few more sites that need to respect the event->cpu filter. Reported-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091217121830.215333434@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-17 13:21:36 +01:00
Frederic Weisbecker	61c1917f47	perf events, x86/stacktrace: Make stack walking optional The current print_context_stack helper that does the stack walking job is good for usual stacktraces as it walks through all the stack and reports even addresses that look unreliable, which is nice when we don't have frame pointers for example. But we have users like perf that only require reliable stacktraces, and those may want a more adapted stack walker, so lets make this function a callback in stacktrace_ops that users can tune for their needs. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1261024834-5336-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-17 09:56:19 +01:00
Frederic Weisbecker	234da7bcdc	sched: Teach might_sleep() about preemptible RCU In practice, it is harmless to voluntarily sleep in a rcu_read_lock() section if we are running under preempt rcu, but it is illegal if we build a kernel running non-preemptable rcu. Currently, might_sleep() doesn't notice sleepable operations under rcu_read_lock() sections if we are running under preemptable rcu because preempt_count() is left untouched after rcu_read_lock() in this case. But we want developers who test their changes under such config to notice the "sleeping while atomic" issues. So we add rcu_read_lock_nesting to prempt_count() in might_sleep() checks. [ v2: Handle rcu-tiny ] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1260991265-8451-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-17 09:46:44 +01:00
Masami Hiramatsu	6f3cf44047	kprobe-tracer: Check new event/group name Check new event/group name is same syntax as a C symbol. In other words, checking the name is as like as other tracepoint events. This can prevent user to create an event with useless name (e.g. foo\|bar, foo*bar). Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jason Baron <jbaron@redhat.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> LKML-Reference: <20091216222408.14459.68790.stgit@dhcp-100-2-132.bos.redhat.com> [ v2: minor cleanups ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-17 09:42:44 +01:00
Ingo Molnar	416eb39556	sched: Make warning less noisy Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170517.807938893@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-17 06:05:49 +01:00
Rusty Russell	62ac127950	cpumask: avoid dereferencing struct cpumask struct cpumask will be undefined soon with CONFIG_CPUMASK_OFFSTACK=y, to avoid them being declared on the stack. cpumask_bits() does what we want here (of course, this code is crap). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> To: Thomas Gleixner <tglx@linutronix.de>	2009-12-17 11:43:29 +10:30
Rusty Russell	f6325e30eb	cpumask: use cpu_online in kernel/perf_event.c Also, we want to check against nr_cpu_ids, not num_possible_cpus(). The latter works, but the correct bounds check is < nr_cpu_ids. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> To: Thomas Gleixner <tglx@linutronix.de>	2009-12-17 11:43:11 +10:30
Simon Horman	cf1e367ee8	timers: Remove duplicate setting of new_base in __mod_timer() new_base is set using per_cpu(tvec_bases, cpu) after selecting the desired value of cpu immediately below so this line is a unnecessary. Signed-off-by: Simon Horman <horms@verge.net.au> LKML-Reference: <20091217001542.GD25317@verge.net.au> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-12-17 01:30:49 +01:00
David Howells	6e14154676	NOMMU: Optimise away the {dac_,}mmap_min_addr tests In NOMMU mode clamp dac_mmap_min_addr to zero to cause the tests on it to be skipped by the compiler. We do this as the minimum mmap address doesn't make any sense in NOMMU mode. mmap_min_addr and round_hint_to_min() can be discarded entirely in NOMMU mode. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-12-17 09:25:19 +11:00
Andi Kleen	61cf693159	[sysctl] Fix breakage on systems with older glibc As predicted during code review, the sysctl(2) changes made systems with old glibc nearly unusable. About every command gives a: warning: process `ls' used the deprecated sysctl system call with 1.4 warning in the log. I see this on a SUSE 10.0 system with glibc 2.3.5. Don't warn for this common case. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-16 12:36:18 -08:00
Linus Torvalds	8aedf8a6ae	Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (52 commits) perf record: Use per-task-per-cpu events for inherited events perf record: Properly synchronize child creation perf events: Allow per-task-per-cpu counters perf diff: Percent calcs should use double values perf diff: Change the default sort order to "dso,symbol" perf diff: Use perf_session__fprintf_hists just like 'perf record' perf report: Fix cut'n'paste error recently introduced perf session: Move perf report specific hits out of perf_session__fprintf_hists perf tools: Move hist entries printing routines from perf report perf report: Generalize perf_session__fprintf_hists() perf symbols: Move symbol filtering to event__preprocess_sample() perf symbols: Adopt the strlists for dso, comm perf symbols: Make symbol_conf global perf probe: Fix to show which probe point is not found perf probe: Check symbols in symtab/kallsyms perf probe: Check build-id of vmlinux perf probe: Reject second attempt of adding same-name event perf probe: Support event name for --add option perf probe: Add glob matching support on --del perf probe: Use strlist__for_each macros in probe-event.c ...	2009-12-16 12:32:47 -08:00
Linus Torvalds	da184a8064	Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Fix return of trace_dump_stack() ksym_tracer: Fix bad cast tracing/power: Remove two exports tracing: Change event->profile_count to be int type tracing: Simplify trace_option_write() tracing: Remove useless trace option tracing: Use seq file for trace_clock tracing: Use seq file for trace_options function-graph: Allow writing the same val to set_graph_function ftrace: Call trace_parser_clear() properly ftrace: Return EINVAL when writing invalid val to set_ftrace_filter tracing: Move a printk out of ftrace_raw_reg_event_foo() tracing: Pull up calls to trace_define_common_fields() tracing: Extract duplicate ftrace_raw_init_event_foo() ftrace.h: Use common pr_info fmt string tracing: Add stack trace to irqsoff tracer tracing: Add trace_dump_stack() ring-buffer: Move resize integrity check under reader lock ring-buffer: Use sync sched protection on ring buffer resizing tracing: Fix wrong usage of strstrip in trace_ksyms	2009-12-16 12:02:25 -08:00
Linus Torvalds	74f3ae7434	Merge branch 'module' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * 'module' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: modpost: fix segfault with short symbol names module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y Kbuild: clear marker out of modpost module: make MODULE_SYMBOL_PREFIX into a CONFIG option ARM: unexport symbols used to implement floating point emulation ARM: use unified discard definition in linker script x86: don't export inline function sparc64: don't export static inline pci_ functions	2009-12-16 10:47:24 -08:00
Linus Torvalds	60d9aa758c	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: (90 commits) jffs2: Fix long-standing bug with symlink garbage collection. mtd: OneNAND: Fix test of unsigned in onenand_otp_walk() mtd: cfi_cmdset_0002, fix lock imbalance Revert "mtd: move mxcnd_remove to .exit.text" mtd: m25p80: add support for Macronix MX25L4005A kmsg_dump: fix build for CONFIG_PRINTK=n mtd: nandsim: add support for 4KiB pages mtd: mtdoops: refactor as a kmsg_dumper mtd: mtdoops: make record size configurable mtd: mtdoops: limit the maximum mtd partition size mtd: mtdoops: keep track of used/unused pages in an array mtd: mtdoops: several minor cleanups core: Add kernel message dumper to call on oopses and panics mtd: add ARM pismo support mtd: pxa3xx_nand: Fix PIO data transfer mtd: nand: fix multi-chip suspend problem mtd: add support for switching old SST chips into QRY mode mtd: fix M29W800D dev_id and uaddr mtd: don't use PF_MEMALLOC mtd: Add bad block table overrides to Davinci NAND driver ... Fixed up conflicts (mostly trivial) in drivers/mtd/devices/m25p80.c drivers/mtd/maps/pcmciamtd.c drivers/mtd/nand/pxa3xx_nand.c kernel/printk.c	2009-12-16 10:23:43 -08:00
Peter Zijlstra	738d2be430	sched: Simplify set_task_cpu() Rearrange code a bit now that its a simpler function. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170518.269101883@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-16 19:01:59 +01:00
Peter Zijlstra	88ec22d3ed	sched: Remove the cfs_rq dependency from set_task_cpu() In order to remove the cfs_rq dependency from set_task_cpu() we need to ensure the task is cfs_rq invariant for all callsites. The simple approach is to substract cfs_rq->min_vruntime from se->vruntime on dequeue, and add cfs_rq->min_vruntime on enqueue. However, this has the downside of breaking FAIR_SLEEPERS since we loose the old vruntime as we only maintain the relative position. To solve this, we observe that we only migrate runnable tasks, we do this using deactivate_task(.sleep=0) and activate_task(.wakeup=0), therefore we can restrain the min_vruntime invariance to that state. The only other case is wakeup balancing, since we want to maintain the old vruntime we cannot make it relative on dequeue, but since we don't migrate inactive tasks, we can do so right before we activate it again. This is where we need the new pre-wakeup hook, we need to call this while still holding the old rq->lock. We could fold it into ->select_task_rq(), but since that has multiple callsites and would obfuscate the locking requirements, that seems like a fudge. This leaves the fork() case, simply make sure that ->task_fork() leaves the ->vruntime in a relative state. This covers all cases where set_task_cpu() gets called, and ensures it sees a relative vruntime. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170518.191697025@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-16 19:01:58 +01:00
Peter Zijlstra	efbbd05a59	sched: Add pre and post wakeup hooks As will be apparent in the next patch, we need a pre wakeup hook for sched_fair task migration, hence rename the post wakeup hook and one pre wakeup. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170518.114746117@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-16 19:01:58 +01:00
Peter Zijlstra	881232b70b	sched: Move kthread_bind() back to kthread.c Since kthread_bind() lost its dependencies on sched.c, move it back where it came from. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170518.039524041@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-16 19:01:57 +01:00
Peter Zijlstra	5da9a0fb67	sched: Fix select_task_rq() vs hotplug issues Since select_task_rq() is now responsible for guaranteeing ->cpus_allowed and cpu_active_mask, we need to verify this. select_task_rq_rt() can blindly return smp_processor_id()/task_cpu() without checking the valid masks, select_task_rq_fair() can do the same in the rare case that all SD_flags are disabled. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170517.961475466@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-16 19:01:57 +01:00
Peter Zijlstra	3802290628	sched: Fix sched_exec() balancing Since we access ->cpus_allowed without holding rq->lock we need a retry loop to validate the result, this comes for near free when we merge sched_migrate_task() into sched_exec() since that already does the needed check. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170517.884743662@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-16 19:01:56 +01:00
Peter Zijlstra	e2912009fb	sched: Ensure set_task_cpu() is never called on blocked tasks In order to clean up the set_task_cpu() rq dependencies we need to ensure it is never called on blocked tasks because such usage does not pair with consistent rq->lock usage. This puts the migration burden on ttwu(). Furthermore we need to close a race against changing ->cpus_allowed, since select_task_rq() runs with only preemption disabled. For sched_fork() this is safe because the child isn't in the tasklist yet, for wakeup we fix this by synchronizing set_cpus_allowed_ptr() against TASK_WAKING, which leaves sched_exec to be a problem This also closes a hole in (`6ad4c1888` sched: Fix balance vs hotplug race) where ->select_task_rq() doesn't validate the result against the sched_domain/root_domain. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170517.807938893@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-16 19:01:56 +01:00

1 2 3 4 5 ...

8956 Commits