Pull block driver updates from Jens Axboe:
"It might look big in volume, but when categorized, not a lot of
drivers are touched. The pull request contains:
- mtip32xx fixes from Micron.
- A slew of drbd updates, this time in a nicer series.
- bcache, a flash/ssd caching framework from Kent.
- Fixes for cciss"
* 'for-3.10/drivers' of git://git.kernel.dk/linux-block: (66 commits)
bcache: Use bd_link_disk_holder()
bcache: Allocator cleanup/fixes
cciss: bug fix to prevent cciss from loading in kdump crash kernel
cciss: add cciss_allow_hpsa module parameter
drivers/block/mg_disk.c: add CONFIG_PM_SLEEP to suspend/resume functions
mtip32xx: Workaround for unaligned writes
bcache: Make sure blocksize isn't smaller than device blocksize
bcache: Fix merge_bvec_fn usage for when it modifies the bvm
bcache: Correctly check against BIO_MAX_PAGES
bcache: Hack around stuff that clones up to bi_max_vecs
bcache: Set ra_pages based on backing device's ra_pages
bcache: Take data offset from the bdev superblock.
mtip32xx: mtip32xx: Disable TRIM support
mtip32xx: fix a smatch warning
bcache: Disable broken btree fuzz tester
bcache: Fix a format string overflow
bcache: Fix a minor memory leak on device teardown
bcache: Documentation updates
bcache: Use WARN_ONCE() instead of __WARN()
bcache: Add missing #include <linux/prefetch.h>
...
Also add some missing printk levels.
Signed-off-by: Dave Jones <davej@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130425174002.GA26769@redhat.com
[ Tweaked the messages a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We occasionally get reports of these BUGs being hit, and the
stack trace doesn't necessarily always tell us what we need to
know about why we are hitting those limits.
If users start attaching /proc/lock_stats to reports we may have
more of a clue what's going on.
Signed-off-by: Dave Jones <davej@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130423163403.GA12839@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This reverts commit 6aa9707099.
Commit 6aa9707099 ("lockdep: check that no locks held at freeze time")
causes problems with NFS root filesystems. The failures were noticed on
OMAP2 and 3 boards during kernel init:
[ BUG: swapper/0/1 still has locks held! ]
3.9.0-rc3-00344-ga937536 #1 Not tainted
-------------------------------------
1 lock held by swapper/0/1:
#0: (&type->s_umount_key#13/1){+.+.+.}, at: [<c011e84c>] sget+0x248/0x574
stack backtrace:
rpc_wait_bit_killable
__wait_on_bit
out_of_line_wait_on_bit
__rpc_execute
rpc_run_task
rpc_call_sync
nfs_proc_get_root
nfs_get_root
nfs_fs_mount_common
nfs_try_mount
nfs_fs_mount
mount_fs
vfs_kern_mount
do_mount
sys_mount
do_mount_root
mount_root
prepare_namespace
kernel_init_freeable
kernel_init
Although the rootfs mounts, the system is unstable. Here's a transcript
from a PM test:
http://www.pwsan.com/omap/testlogs/test_v3.9-rc3/20130317194234/pm/37xxevm/37xxevm_log.txt
Here's what the test log should look like:
http://www.pwsan.com/omap/testlogs/test_v3.8/20130218214403/pm/37xxevm/37xxevm_log.txt
Mailing list discussion is here:
http://lkml.org/lkml/2013/3/4/221
Deal with this for v3.9 by reverting the problem commit, until folks can
figure out the right long-term course of action.
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: <maciej.rutecki@gmail.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ben Chan <benchan@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hack, but bcache needs a way around lockdep for locking during garbage
collection - we need to keep multiple btree nodes locked for coalescing
and rw_lock_nested() isn't really sufficient or appropriate here.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Ingo Molnar <mingo@redhat.com>
We shouldn't try_to_freeze if locks are held. Holding a lock can cause a
deadlock if the lock is later acquired in the suspend or hibernate path
(e.g. by dpm). Holding a lock can also cause a deadlock in the case of
cgroup_freezer if a lock is held inside a frozen cgroup that is later
acquired by a process outside that group.
[akpm@linux-foundation.org: export debug_check_no_locks_held]
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Cc: Ben Chan <benchan@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It is considered good form to lock the lock you claim to be nested in.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
[ removed nest_lock arg to print_lock_nested_lock_not_held in favour
of hlock->nest_lock, also renamed the lock arg to hlock since its
a held_lock type ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/5051A9E7.5040501@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It is illegal to use RCU from a CPU that has reported idleness or
offlinedness to RCU. However, it can be quite difficult to determine
from a stack trace whether or not a given CPU is idle or offline.
Therefore, this commit adds idle/offline diagnostics to the lockdep-RCU
error message.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (106 commits)
perf kvm: Fix copy & paste error in description
perf script: Kill script_spec__delete
perf top: Fix a memory leak
perf stat: Introduce get_ratio_color() helper
perf session: Remove impossible condition check
perf tools: Fix feature-bits rework fallout, remove unused variable
perf script: Add generic perl handler to process events
perf tools: Use for_each_set_bit() to iterate over feature flags
perf tools: Unify handling of features when writing feature section
perf report: Accept fifos as input file
perf tools: Moving code in some files
perf tools: Fix out-of-bound access to struct perf_session
perf tools: Continue processing header on unknown features
perf tools: Improve macros for struct feature_ops
perf: builtin-record: Document and check that mmap_pages must be a power of two.
perf: builtin-record: Provide advice if mmap'ing fails with EPERM.
perf tools: Fix truncated annotation
perf script: look up thread using tid instead of pid
perf tools: Look up thread names for system wide profiling
perf tools: Fix comm for processes with named threads
...
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
cpu: Export cpu_up()
rcu: Apply ACCESS_ONCE() to rcu_boost() return value
Revert "rcu: Permit rt_mutex_unlock() with irqs disabled"
docs: Additional LWN links to RCU API
rcu: Augment rcu_batch_end tracing for idle and callback state
rcu: Add rcutorture tests for srcu_read_lock_raw()
rcu: Make rcutorture test for hotpluggability before offlining CPUs
driver-core/cpu: Expose hotpluggability to the rest of the kernel
rcu: Remove redundant rcu_cpu_stall_suppress declaration
rcu: Adaptive dyntick-idle preparation
rcu: Keep invoking callbacks if CPU otherwise idle
rcu: Irq nesting is always 0 on rcu_enter_idle_common
rcu: Don't check irq nesting from rcu idle entry/exit
rcu: Permit dyntick-idle with callbacks pending
rcu: Document same-context read-side constraints
rcu: Identify dyntick-idle CPUs on first force_quiescent_state() pass
rcu: Remove dynticks false positives and RCU failures
rcu: Reduce latency of rcu_prepare_for_idle()
rcu: Eliminate RCU_FAST_NO_HZ grace-period hang
rcu: Avoid needlessly IPIing CPUs at GP end
...
Inform the user if an RCU usage error is detected by lockdep while in
an extended quiescent state (in this case, the RCU-free window in idle).
This is accomplished by adding a line to the RCU lockdep splat indicating
whether or not the splat occurred in extended quiescent state.
Uses of RCU from within extended quiescent state mode are totally ignored
by RCU, hence the importance of this diagnostic.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Since commit f59de89 ("lockdep: Clear whole lockdep_map on initialization"),
lockdep_init_map() will clear all the struct. But it will break
lock_set_class()/lock_set_subclass(). A typical race condition
is like below:
CPU A CPU B
lock_set_subclass(lockA);
lock_set_class(lockA);
lockdep_init_map(lockA);
/* lockA->name is cleared */
memset(lockA);
__lock_acquire(lockA);
/* lockA->class_cache[] is cleared */
register_lock_class(lockA);
look_up_lock_class(lockA);
WARN_ON_ONCE(class->name !=
lock->name);
lock->name = name;
So restore to what we have done before commit f59de89 but annotate
->lock with kmemcheck_mark_initialized() to suppress the kmemcheck
warning reported in commit f59de89.
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Suggested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20111109080451.GB8124@zhy
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch prints the name of the lock which is acquired
before lockdep_init() is called, so that users can easily
find which lock triggered the lockdep init error warning.
This patch also removes the lockdep_init_error() message
of "Arch code didn't call lockdep_init() early enough?"
since lockdep_init() is called in arch independent code now.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1321508072-23853-2-git-send-email-tom.leiming@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since commit f59de89 ("lockdep: Clear whole lockdep_map on initialization"),
lockdep_init_map() will clear all the struct. But it will break
lock_set_class()/lock_set_subclass(). A typical race condition
is like below:
CPU A CPU B
lock_set_subclass(lockA);
lock_set_class(lockA);
lockdep_init_map(lockA);
/* lockA->name is cleared */
memset(lockA);
__lock_acquire(lockA);
/* lockA->class_cache[] is cleared */
register_lock_class(lockA);
look_up_lock_class(lockA);
WARN_ON_ONCE(class->name !=
lock->name);
lock->name = name;
So restore to what we have done before commit f59de89 but annotate
->lock with kmemcheck_mark_initialized() to suppress the kmemcheck
warning reported in commit f59de89.
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Suggested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20111109080451.GB8124@zhy
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Show the taint flags in all lockdep and rtmutex-debug error messages.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1319773015.6759.30.camel@deadeye
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit ["62016250 lockdep: Add improved subclass caching"] tries to
improve performance (expecially to reduce the cost of rq->lock)
when using lockdep, but it fails due to lockdep_init_map() in
which ->class_cache is cleared. The typical caller is
lock_set_subclass(), after that class will not be cached anymore.
This patch tries to achive the goal of commit 62016250 by always
setting ->class_cache in register_lock_class().
=== Score comparison of benchmarks ===
for i in `seq 1 10`; do ./perf bench -f simple sched messaging; done
before: min: 0.604, max: 0.660, avg: 0.622
after: min: 0.414, max: 0.473, avg: 0.427
for i in `seq 1 10`; do ./perf bench -f simple sched messaging -g 40; done
before: min: 2.347, max: 2.421, avg: 2.391
after: min: 1.652, max: 1.699, avg: 1.671
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20111109080714.GC8124@zhy
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The pretty print of the lockdep debug splat uses just the lock name
to show how the locking scenario happens. But when it comes to
nesting locks, the output becomes confusing which takes away the point
of the pretty printing of the lock scenario.
Without displaying the subclass info, we get the following output:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(slock-AF_INET);
lock(slock-AF_INET);
lock(slock-AF_INET);
lock(slock-AF_INET);
*** DEADLOCK ***
The above looks more of a A->A locking bug than a A->B B->A.
By adding the subclass to the output, we can see what really happened:
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(slock-AF_INET);
lock(slock-AF_INET/1);
lock(slock-AF_INET);
lock(slock-AF_INET/1);
*** DEADLOCK ***
This bug was discovered while tracking down a real bug caught by lockdep.
Link: http://lkml.kernel.org/r/20111025202049.GB25043@hostway.ca
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Simon Kirby <sim@hostway.ca>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
rcu: Move propagation of ->completed from rcu_start_gp() to rcu_report_qs_rsp()
rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states
rcu: Wire up RCU_BOOST_PRIO for rcutree
rcu: Make rcu_torture_boost() exit loops at end of test
rcu: Make rcu_torture_fqs() exit loops at end of test
rcu: Permit rt_mutex_unlock() with irqs disabled
rcu: Avoid having just-onlined CPU resched itself when RCU is idle
rcu: Suppress NMI backtraces when stall ends before dump
rcu: Prohibit grace periods during early boot
rcu: Simplify unboosting checks
rcu: Prevent early boot set_need_resched() from __rcu_pending()
rcu: Dump local stack if cannot dump all CPUs' stacks
rcu: Move __rcu_read_unlock()'s barrier() within if-statement
rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation
rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier
rcu: Make rcu_implicit_dynticks_qs() locals be correct size
rcu: Eliminate in_irq() checks in rcu_enter_nohz()
nohz: Remove nohz_cpu_mask
rcu: Document interpretation of RCU-lockdep splats
rcu: Allow rcutorture's stat_interval parameter to be changed at runtime
...
Long ago, using TREE_RCU with PREEMPT would result in "scheduling
while atomic" diagnostics if you blocked in an RCU read-side critical
section. However, PREEMPT now implies TREE_PREEMPT_RCU, which defeats
this diagnostic. This commit therefore adds a replacement diagnostic
based on PROVE_RCU.
Because rcu_lockdep_assert() and lockdep_rcu_dereference() are now being
used for things that have nothing to do with rcu_dereference(), rename
lockdep_rcu_dereference() to lockdep_rcu_suspicious() and add a third
argument that is a string indicating what is suspicious. This third
argument is passed in from a new third argument to rcu_lockdep_assert().
Update all calls to rcu_lockdep_assert() to add an informative third
argument.
Also, add a pair of rcu_lockdep_assert() calls from within
rcu_note_context_switch(), one complaining if a context switch occurs
in an RCU-bh read-side critical section and another complaining if a
context switch occurs in an RCU-sched read-side critical section.
These are present only if the PROVE_RCU kernel parameter is enabled.
Finally, fix some checkpatch whitespace complaints in lockdep.c.
Again, you must enable PROVE_RCU to see these new diagnostics. But you
are enabling PROVE_RCU to check out new RCU uses in any case, aren't you?
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Andrew requested I comment all the lockdep WARN()s to help other people
figure out wth is wrong..
Requested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315301493.3191.9.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
match_held_lock() was assuming it was being called on a lock class
that had already seen usage.
This condition was true for bug-free code using lockdep_assert_held(),
since you're in fact holding the lock when calling it. However the
assumption fails the moment you assume the assertion can fail, which
is the whole point of having the assertion in the first place.
Anyway, now that there's more lockdep_is_held() users, notably
__rcu_dereference_check(), its much easier to trigger this since we
test for a number of locks and we only need to hold any one of them to
be good.
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1312547787.28695.2.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
lockdep_init_map() only initializes parts of lockdep_map and triggers
kmemcheck warning when it is copied as a whole. There isn't anything
to be gained by clearing selectively. memset() the whole structure
and remove loop for ->class_cache[] clearing.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=35532
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Christian Casteyde <casteyde.christian@free.fr>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=35532
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110714131909.GJ3455@htj.dyndns.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On Sun, 2011-07-24 at 21:06 -0400, Arnaud Lacombe wrote:
> /src/linux/linux/kernel/lockdep.c: In function 'mark_held_locks':
> /src/linux/linux/kernel/lockdep.c:2471:31: warning: comparison of
> distinct pointer types lacks a cast
The warning is harmless in this case, but the below makes it go away.
Reported-by: Arnaud Lacombe <lacombar@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1311588599.2617.56.camel@laptop
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit dd4e5d3ac4 ("lockdep: Fix trace_[soft,hard]irqs_[on,off]()
recursion") made a bit of a mess of the various checks and error
conditions.
In particular it moved the check for !irqs_disabled() before the
spurious enable test, resulting in some warnings.
Reported-by: Arnaud Lacombe <lacombar@gmail.com>
Reported-by: Dave Jones <davej@redhat.com>
Reported-and-tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1311679697.24752.28.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Thomas noticed that a lock marked with lockdep_set_novalidate_class()
will still trigger warnings for IRQ inversions. Cure this by skipping
those when marking irq state.
Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-2dp5vmpsxeraqm42kgww6ge2@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit:
1efc5da3cf: [PATCH] order of lockdep off/on in vprintk() should be changed
explains the reason for having raw_local_irq_*() and lockdep_off()
in printk(). Instead of working around the broken recursion detection
of interrupt state tracking, fix it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: efault@gmx.de
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110621153806.185242734@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The main lock_is_held() user is lockdep_assert_held(), avoid false
assertions in lockdep_off() sections by unconditionally reporting the
lock is taken.
[ the reason this is important is a lockdep_assert_held() in ttwu()
which triggers a warning under lockdep_off() as in printk() which
can trigger another wakeup and lock up due to spinlock
recursion, as reported and heroically debugged by Arne Jansen ]
Reported-and-tested-by: Arne Jansen <lists@die-jansens.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1307398759.2497.966.camel@laptop
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For some reason nr_chain_hlocks is updated with cmpxchg, but
this is performed inside of the lockdep global "grab_lock()",
which also makes simple modification of this variable atomic.
Remove the cmpxchg logic for updating nr_chain_hlocks and
simplify the code.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110421014300.727863282@goodmis.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Lockdep output can be pretty cryptic, having nicer output
can save a lot of head scratching. When a simple irq inversion
scenario is detected by lockdep (lock A taken in interrupt
context but also in thread context without disabling interrupts)
we now get the following (hopefully more informative) output:
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(lockA);
<Interrupt>
lock(lockA);
*** DEADLOCK ***
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110421014300.436140880@goodmis.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The message of "Bad BFS generated tree" is a bit confusing.
Replace it with a more sane error message.
Thanks to Peter Zijlstra for helping me come up with a better
message.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110421014300.135521252@goodmis.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Irq inversion and irq dependency bugs are only subtly
different. The diffenerence lies where the interrupt occurred.
For irq dependency:
irq_disable
lock(A)
lock(B)
unlock(B)
unlock(A)
irq_enable
lock(B)
unlock(B)
<interrupt>
lock(A)
The interrupt comes in after it has been established that lock A
can be held when taking an irq unsafe lock. Lockdep detects the
problem when taking lock A in interrupt context.
With the irq_inversion the irq happens before it is established
and lockdep detects the problem with the taking of lock B:
<interrupt>
lock(A)
irq_disable
lock(A)
lock(B)
unlock(B)
unlock(A)
irq_enable
lock(B)
unlock(B)
Since the problem with the locking logic for both of these issues
is in actuality the same, they both should report the same scenario.
This patch implements that and prints this:
other info that might help us debug this:
Chain exists of:
&rq->lock --> lockA --> lockC
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(lockC);
local_irq_disable();
lock(&rq->lock);
lock(lockA);
<Interrupt>
lock(&rq->lock);
*** DEADLOCK ***
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110421014259.910720381@goodmis.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Lockdep output can be pretty cryptic, having nicer output
can save a lot of head scratching. When a simple deadlock
scenario is detected by lockdep (lock A -> lock A) we now
get the following new output:
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(lock)->rlock);
lock(&(lock)->rlock);
*** DEADLOCK ***
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110421014259.643930104@goodmis.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The lockdep output can be pretty cryptic, having nicer output
can save a lot of head scratching. When a normal deadlock
scenario is detected by lockdep (lock A -> lock B and there
exists a place where lock B -> lock A) we now get the following
new output:
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(lockB);
lock(lockA);
lock(lockB);
lock(lockA);
*** DEADLOCK ***
On cases where there's a deeper chair, it shows the partial
chain that can cause the issue:
Chain exists of:
lockC --> lockA --> lockB
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(lockB);
lock(lockA);
lock(lockB);
lock(lockC);
*** DEADLOCK ***
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110421014259.380621789@goodmis.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Locking order inversion due to interrupts is a subtle problem.
When an irq lockiinversion discovered by lockdep it currently
reports something like:
[ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
... and then prints out the locks that are involved, as back traces.
Judging by lkml feedback developers were routinely confused by what
a HARDIRQ->safe to unsafe issue is all about, and sometimes even
blew it off as a bug in lockdep.
It is not obvious when lockdep prints this message about a lock that
is never taken in interrupt context.
After explaining the problems that lockdep is reporting, I
decided to add a description of the problem in visual form. Now
the following is shown:
---
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(lockA);
local_irq_disable();
lock(&rq->lock);
lock(lockA);
<Interrupt>
lock(&rq->lock);
*** DEADLOCK ***
---
The above is the case when the unsafe lock is taken while
holding a lock taken in irq context. But when a lock is taken
that also grabs a unsafe lock, the call chain is shown:
---
other info that might help us debug this:
Chain exists of:
&rq->lock --> lockA --> lockC
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(lockC);
local_irq_disable();
lock(&rq->lock);
lock(lockA);
<Interrupt>
lock(&rq->lock);
*** DEADLOCK ***
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110421014259.132728798@goodmis.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
During early boot, local IRQ is disabled until IRQ subsystem is
properly initialized. During this time, no one should enable
local IRQ and some operations which usually are not allowed with
IRQ disabled, e.g. operations which might sleep or require
communications with other processors, are allowed.
lockdep tracked this with early_boot_irqs_off/on() callbacks.
As other subsystems need this information too, move it to
init/main.c and make it generally available. While at it,
toggle the boolean to early_boot_irqs_disabled instead of
enabled so that it can be initialized with %false and %true
indicates the exceptional condition.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110120110635.GB6036@htj.dyndns.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Current look_up_lock_class() doesn't check the parameter "subclass".
This rarely rises problems because the main caller of this function,
register_lock_class(), checks it.
But register_lock_class() is not the only function which calls
look_up_lock_class(). lock_set_class() and its callees also call it.
And lock_set_class() doesn't check this parameter.
This will rise problems when the the value of subclass is larger than
MAX_LOCKDEP_SUBCLASSES. Because the address (used as the key of class)
caliculated with too large subclass has a probability to point
another key in different lock_class_key.
Of course this problem depends on the memory layout and
occurs with really low probability.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Vojtech Pavlik <vojtech@ucw.cz>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286958626-986-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Current lockdep_map only caches one class with subclass == 0,
and looks up hash table of classes when subclass != 0.
It seems that this has no problem because the case of
subclass != 0 is rare. But locks of struct rq are
acquired with subclass == 1 when task migration is executed.
Task migration is high frequent event, so I modified lockdep
to cache subclasses.
I measured the score of perf bench sched messaging.
This patch has slightly but certain (order of milli seconds
or 10 milli seconds) effect when lots of tasks are running.
I'll show the result in the tail of this description.
NR_LOCKDEP_CACHING_CLASSES specifies how many classes can be
cached in the instances of lockdep_map.
I discussed with Peter Zijlstra in LinuxCon Japan about
this approach and he taught me that caching every subclasses(8)
is cleary waste of memory. So number of cached classes
should be configurable.
=== Score comparison of benchmarks ===
# "min" means best score, and "max" means worst score
for i in `seq 1 10`; do ./perf bench -f simple sched messaging; done
before: min: 0.565000, max: 0.583000, avg: 0.572500
after: min: 0.559000, max: 0.568000, avg: 0.563300
# with more processes
for i in `seq 1 10`; do ./perf bench -f simple sched messaging -g 40; done
before: min: 2.274000, max: 2.298000, avg: 2.286300
after: min: 2.242000, max: 2.270000, avg: 2.259700
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286269311-28336-2-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There is no longer any functional difference between
__debug_show_held_locks() and debug_show_held_locks(),
so remove the former.
Signed-off-by: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1281021054-4228-1-git-send-email-jkacur@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For people who otherwise get to write: cpu_clock(smp_processor_id()),
there is now: local_clock().
Also, as per suggestion from Andrew, provide some documentation on
the various clock interfaces, and minimize the unsigned long long vs
u64 mess.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jens Axboe <jaxboe@fusionio.com>
LKML-Reference: <1275052414.1645.52.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The conversion of device->sem to device->mutex resulted in lockdep
warnings. Create a novalidate class for now until the driver folks
come up with separate classes. That way we have at least the basic
mutex debugging coverage.
Add a checkpatch error so the usage is reserved for device->mutex.
[ tglx: checkpatch and compile fix for LOCKDEP=n ]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (311 commits)
perf tools: Add mode to build without newt support
perf symbols: symbol inconsistency message should be done only at verbose=1
perf tui: Add explicit -lslang option
perf options: Type check all the remaining OPT_ variants
perf options: Type check OPT_BOOLEAN and fix the offenders
perf options: Check v type in OPT_U?INTEGER
perf options: Introduce OPT_UINTEGER
perf tui: Add workaround for slang < 2.1.4
perf record: Fix bug mismatch with -c option definition
perf options: Introduce OPT_U64
perf tui: Add help window to show key associations
perf tui: Make <- exit menus too
perf newt: Add single key shortcuts for zoom into DSO and threads
perf newt: Exit browser unconditionally when CTRL+C, q or Q is pressed
perf newt: Fix the 'A'/'a' shortcut for annotate
perf newt: Make <- exit the ui_browser
x86, perf: P4 PMU - fix counters management logic
perf newt: Make <- zoom out filters
perf report: Report number of events, not samples
perf hist: Clarify events_stats fields usage
...
Fix up trivial conflicts in kernel/fork.c and tools/perf/builtin-record.c
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits)
rcu: remove all rcu head initializations, except on_stack initializations
rcu head introduce rcu head init on stack
Debugobjects transition check
rcu: fix build bug in RCU_FAST_NO_HZ builds
rcu: RCU_FAST_NO_HZ must check RCU dyntick state
rcu: make SRCU usable in modules
rcu: improve the RCU CPU-stall warning documentation
rcu: reduce the number of spurious RCU_SOFTIRQ invocations
rcu: permit discontiguous cpu_possible_mask CPU numbering
rcu: improve RCU CPU stall-warning messages
rcu: print boot-time console messages if RCU configs out of ordinary
rcu: disable CPU stall warnings upon panic
rcu: enable CPU_STALL_VERBOSE by default
rcu: slim down rcutiny by removing rcu_scheduler_active and friends
rcu: refactor RCU's context-switch handling
rcu: rename rcutiny rcu_ctrlblk to rcu_sched_ctrlblk
rcu: shrink rcutiny by making synchronize_rcu_bh() be inline
rcu: fix now-bogus rcu_scheduler_active comments.
rcu: Fix bogus CONFIG_PROVE_LOCKING in comments to reflect reality.
rcu: ignore offline CPUs in last non-dyntick-idle CPU check
...
There is no need to disable lockdep after an RCU lockdep splat,
so remove the debug_lockdeps_off() from lockdep_rcu_dereference().
To avoid repeated lockdep splats, use a static variable in the inlined
rcu_dereference_check() and rcu_dereference_protected() macros so that
a given instance splats only once, but so that multiple instances can
be detected per boot.
This is controlled by a new config variable CONFIG_PROVE_RCU_REPEATEDLY,
which is disabled by default. This provides the normal lockdep behavior
by default, but permits people who want to find multiple RCU-lockdep
splats per boot to easily do so.
Requested-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Tested-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>