linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-14 16:12:02 +00:00

Author	SHA1	Message	Date
David Howells	1bb4b7f98f	FS-Cache: The retrieval remaining-pages counter needs to be atomic_t struct fscache_retrieval contains a count of the number of pages that still need some processing (n_pages). This is decremented as the pages are processed. However, this needs to be atomic as fscache_retrieval_complete() (I think) just occasionally may be called from cachefiles_read_backing_file() and cachefiles_read_copier() simultaneously. This happens when an fscache_read_or_alloc_pages() request containing a lot of pages (say a couple of hundred) is being processed. The read on each backing page is dispatched individually because we need to insert a monitor into the waitqueue to catch when the read completes. However, under low-memory conditions, we might be forced to wait in the allocator - and this gives the I/O on the backing page a chance to complete first. When the I/O completes, fscache_enqueue_retrieval() chucks the retrieval onto the workqueue without waiting for the operation to finish the initial I/O dispatch (we want to release any pages we can as soon as we can), thus both can end up running simultaneously and potentially attempting to partially complete the retrieval simultaneously (ENOMEM may occur, backing pages may already be in the page cache). This was demonstrated by parallelling the non-atomic counter with an atomic counter and printing both of them when the assertion fails. At this point, the atomic counter has reached zero, but the non-atomic counter has not. To fix this, make the counter an atomic_t. This results in the following bug appearing FS-Cache: Assertion failed 3 == 5 is false ------------[ cut here ]------------ kernel BUG at fs/fscache/operation.c:421! or FS-Cache: Assertion failed 3 == 5 is false ------------[ cut here ]------------ kernel BUG at fs/fscache/operation.c:414! With a backtrace like the following: RIP: 0010:[<ffffffffa0211b1d>] fscache_put_operation+0x1ad/0x240 [fscache] Call Trace: [<ffffffffa0213185>] fscache_retrieval_work+0x55/0x270 [fscache] [<ffffffffa0213130>] ? fscache_retrieval_work+0x0/0x270 [fscache] [<ffffffff81090b10>] worker_thread+0x170/0x2a0 [<ffffffff81096d10>] ? autoremove_wake_function+0x0/0x40 [<ffffffff810909a0>] ? worker_thread+0x0/0x2a0 [<ffffffff81096966>] kthread+0x96/0xa0 [<ffffffff8100c0ca>] child_rip+0xa/0x20 [<ffffffff810968d0>] ? kthread+0x0/0xa0 [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-and-tested-By: Milosz Tanski <milosz@adfin.com> Acked-by: Jeff Layton <jlayton@redhat.com>	2013-06-19 14:16:47 +01:00
Haicheng Li	2144bc78d4	cachefiles: remove unused macro list_to_page() Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Milosz Tanski <milosz@adfin.com> Acked-by: Jeff Layton <jlayton@redhat.com>	2013-06-19 14:16:47 +01:00
David Howells	1362729b16	FS-Cache: Simplify cookie retention for fscache_objects, fixing oops Simplify the way fscache cache objects retain their cookie. The way I implemented the cookie storage handling made synchronisation a pain (ie. the object state machine can't rely on the cookie actually still being there). Instead of the the object being detached from the cookie and the cookie being freed in __fscache_relinquish_cookie(), we defer both operations: () The detachment of the object from the list in the cookie now takes place in fscache_drop_object() and is thus governed by the object state machine (fscache_detach_from_cookie() has been removed). () The release of the cookie is now in fscache_object_destroy() - which is called by the cache backend just before it frees the object. This means that the fscache_cookie struct is now available to the cache all the way through from ->alloc_object() to ->drop_object() and ->put_object() - meaning that it's no longer necessary to take object->lock to guarantee access. However, __fscache_relinquish_cookie() doesn't wait for the object to go all the way through to destruction before letting the netfs proceed. That would massively slow down the netfs. Since __fscache_relinquish_cookie() leaves the cookie around, in must therefore break all attachments to the netfs - which includes ->def, ->netfs_data and any outstanding page read/writes. To handle this, struct fscache_cookie now has an n_active counter: (1) This starts off initialised to 1. (2) Any time the cache needs to get at the netfs data, it calls fscache_use_cookie() to increment it - if it is not zero. If it was zero, then access is not permitted. (3) When the cache has finished with the data, it calls fscache_unuse_cookie() to decrement it. This does a wake-up on it if it reaches 0. (4) __fscache_relinquish_cookie() decrements n_active and then waits for it to reach 0. The initialisation to 1 in step (1) ensures that we only get wake ups when we're trying to get rid of the cookie. This leaves __fscache_relinquish_cookie() a lot simpler. *** This fixes a problem in the current code whereby if fscache_invalidate() is followed sufficiently quickly by fscache_relinquish_cookie() then it is possible for __fscache_relinquish_cookie() to have detached the cookie from the object and cleared the pointer before a thread is dispatched to process the invalidation state in the object state machine. Since the pending write clearance was deferred to the invalidation state to make it asynchronous, we need to either wait in relinquishment for the stores tree to be cleared in the invalidation state or we need to handle the clearance in relinquishment. Further, if the relinquishment code does clear the tree, then the invalidation state need to make the clearance contingent on still having the cookie to hand (since that's where the tree is rooted) and we have to prevent the cookie from disappearing for the duration. This can lead to an oops like the following: BUG: unable to handle kernel NULL pointer dereference at 000000000000000c ... RIP: 0010:[<ffffffff8151023e>] _spin_lock+0xe/0x30 ... CR2: 000000000000000c ... ... Process kslowd002 (...) .... Call Trace: [<ffffffffa01c3278>] fscache_invalidate_writes+0x38/0xd0 [fscache] [<ffffffff810096f0>] ? __switch_to+0xd0/0x320 [<ffffffff8105e759>] ? find_busiest_queue+0x69/0x150 [<ffffffff8110ddd4>] ? slow_work_enqueue+0x104/0x180 [<ffffffffa01c1303>] fscache_object_slow_work_execute+0x5e3/0x9d0 [fscache] [<ffffffff81096b67>] ? bit_waitqueue+0x17/0xd0 [<ffffffff8110e233>] slow_work_execute+0x233/0x310 [<ffffffff8110e515>] slow_work_thread+0x205/0x360 [<ffffffff81096ca0>] ? autoremove_wake_function+0x0/0x40 [<ffffffff8110e310>] ? slow_work_thread+0x0/0x360 [<ffffffff81096936>] kthread+0x96/0xa0 [<ffffffff8100c0ca>] child_rip+0xa/0x20 [<ffffffff810968a0>] ? kthread+0x0/0xa0 [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 The parameter to fscache_invalidate_writes() was object->cookie which is NULL. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Milosz Tanski <milosz@adfin.com> Acked-by: Jeff Layton <jlayton@redhat.com>	2013-06-19 14:16:47 +01:00
David Howells	caaef6900b	FS-Cache: Fix object state machine to have separate work and wait states Fix object state machine to have separate work and wait states as that makes it easier to envision. There are now three kinds of state: (1) Work state. This is an execution state. No event processing is performed by a work state. The function attached to a work state returns a pointer indicating the next state to which the OSM should transition. Returning NO_TRANSIT repeats the current state, but goes back to the scheduler first. (2) Wait state. This is an event processing state. No execution is performed by a wait state. Wait states are just tables of "if event X occurs, clear it and transition to state Y". The dispatcher returns to the scheduler if none of the events in which the wait state has an interest are currently pending. (3) Out-of-band state. This is a special work state. Transitions to normal states can be overridden when an unexpected event occurs (eg. I/O error). Instead the dispatcher disables and clears the OOB event and transits to the specified work state. This then acts as an ordinary work state, though object->state points to the overridden destination. Returning NO_TRANSIT resumes the overridden transition. In addition, the states have names in their definitions, so there's no need for tables of state names. Further, the EV_REQUEUE event is no longer necessary as that is automatic for work states. Since the states are now separate structs rather than values in an enum, it's not possible to use comparisons other than (non-)equality between them, so use some object->flags to indicate what phase an object is in. The EV_RELEASE, EV_RETIRE and EV_WITHDRAW events have been squished into one (EV_KILL). An object flag now carries the information about retirement. Similarly, the RELEASING, RECYCLING and WITHDRAWING states have been merged into an KILL_OBJECT state and additional states have been added for handling waiting dependent objects (JUMPSTART_DEPS and KILL_DEPENDENTS). A state has also been added for synchronising with parent object initialisation (WAIT_FOR_PARENT) and another for initiating look up (PARENT_READY). Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Milosz Tanski <milosz@adfin.com> Acked-by: Jeff Layton <jlayton@redhat.com>	2013-06-19 14:16:47 +01:00
David Howells	493f7bc114	FS-Cache: Wrap checks on object state Wrap checks on object state (mostly outside of fs/fscache/object.c) with inline functions so that the mechanism can be replaced. Some of the state checks within object.c are left as-is as they will be replaced. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Milosz Tanski <milosz@adfin.com> Acked-by: Jeff Layton <jlayton@redhat.com>	2013-06-19 14:16:47 +01:00
David Howells	610be24ee4	FS-Cache: Uninline fscache_object_init() Uninline fscache_object_init() so as not to expose some of the FS-Cache internals to the cache backend. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Milosz Tanski <milosz@adfin.com> Acked-by: Jeff Layton <jlayton@redhat.com>	2013-06-19 14:16:47 +01:00
David Howells	0c59a95d90	FS-Cache: Don't sleep in page release if __GFP_FS is not set Don't sleep in __fscache_maybe_release_page() if __GFP_FS is not set. This goes some way towards mitigating fscache deadlocking against ext4 by way of the allocator, eg: INFO: task flush-8:0:24427 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. flush-8:0 D ffff88003e2b9fd8 0 24427 2 0x00000000 ffff88003e2b9138 0000000000000046 ffff880012e3a040 ffff88003e2b9fd8 0000000000011c80 ffff88003e2b9fd8 ffffffff81a10400 ffff880012e3a040 0000000000000002 ffff880012e3a040 ffff88003e2b9098 ffffffff8106dcf5 Call Trace: [<ffffffff8106dcf5>] ? __lock_is_held+0x31/0x53 [<ffffffff81219b61>] ? radix_tree_lookup_element+0xf4/0x12a [<ffffffff81454bed>] schedule+0x60/0x62 [<ffffffffa01d349c>] __fscache_wait_on_page_write+0x8b/0xa5 [fscache] [<ffffffff810498a8>] ? __init_waitqueue_head+0x4d/0x4d [<ffffffffa01d393a>] __fscache_maybe_release_page+0x30c/0x324 [fscache] [<ffffffffa01d369a>] ? __fscache_maybe_release_page+0x6c/0x324 [fscache] [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170 [<ffffffffa01fd7b2>] nfs_fscache_release_page+0x68/0x94 [nfs] [<ffffffffa01ef73e>] nfs_release_page+0x7e/0x86 [nfs] [<ffffffff810aa553>] try_to_release_page+0x32/0x3b [<ffffffff810b6c70>] shrink_page_list+0x535/0x71a [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170 [<ffffffff810b7352>] shrink_inactive_list+0x20a/0x2dd [<ffffffff81071a13>] ? mark_held_locks+0xbe/0xea [<ffffffff810b7a65>] shrink_lruvec+0x34c/0x3eb [<ffffffff810b7bd3>] do_try_to_free_pages+0xcf/0x355 [<ffffffff810b7fc8>] try_to_free_pages+0x9a/0xa1 [<ffffffff810b08d2>] __alloc_pages_nodemask+0x494/0x6f7 [<ffffffff810d9a07>] kmem_getpages+0x58/0x155 [<ffffffff810dc002>] fallback_alloc+0x120/0x1f3 [<ffffffff8106db23>] ? trace_hardirqs_off+0xd/0xf [<ffffffff810dbed3>] ____cache_alloc_node+0x177/0x186 [<ffffffff81162a6c>] ? ext4_init_io_end+0x1c/0x37 [<ffffffff810dc403>] kmem_cache_alloc+0xf1/0x176 [<ffffffff810b17ac>] ? test_set_page_writeback+0x101/0x113 [<ffffffff81162a6c>] ext4_init_io_end+0x1c/0x37 [<ffffffff81162ce4>] ext4_bio_write_page+0x20f/0x3af [<ffffffff8115cc02>] mpage_da_submit_io+0x26e/0x2f6 [<ffffffff811088e5>] ? __find_get_block_slow+0x38/0x133 [<ffffffff81161348>] mpage_da_map_and_submit+0x3a7/0x3bd [<ffffffff81161a60>] ext4_da_writepages+0x30d/0x426 [<ffffffff810b3359>] do_writepages+0x1c/0x2a [<ffffffff81102f4d>] __writeback_single_inode+0x3e/0xe5 [<ffffffff81103995>] writeback_sb_inodes+0x1bd/0x2f4 [<ffffffff81103b3b>] __writeback_inodes_wb+0x6f/0xb4 [<ffffffff81103c81>] wb_writeback+0x101/0x195 [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170 [<ffffffff811043aa>] ? wb_do_writeback+0xaa/0x173 [<ffffffff8110434a>] wb_do_writeback+0x4a/0x173 [<ffffffff81071bbc>] ? trace_hardirqs_on+0xd/0xf [<ffffffff81038554>] ? del_timer+0x4b/0x5b [<ffffffff811044e0>] bdi_writeback_thread+0x6d/0x147 [<ffffffff81104473>] ? wb_do_writeback+0x173/0x173 [<ffffffff81048fbc>] kthread+0xd0/0xd8 [<ffffffff81455eb2>] ? _raw_spin_unlock_irq+0x29/0x3e [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55 [<ffffffff81456aac>] ret_from_fork+0x7c/0xb0 [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55 2 locks held by flush-8:0/24427: #0: (&type->s_umount_key#41){.+.+..}, at: [<ffffffff810e3b73>] grab_super_passive+0x4c/0x76 #1: (jbd2_handle){+.+...}, at: [<ffffffff81190d81>] start_this_handle+0x475/0x4ea The problem here is that another thread, which is attempting to write the to-be-stored NFS page to the on-ext4 cache file is waiting for the journal lock, eg: INFO: task kworker/u:2:24437 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u:2 D ffff880039589768 0 24437 2 0x00000000 ffff8800395896d8 0000000000000046 ffff8800283bf040 ffff880039589fd8 0000000000011c80 ffff880039589fd8 ffff880039f0b040 ffff8800283bf040 0000000000000006 ffff8800283bf6b8 ffff880039589658 ffffffff81071a13 Call Trace: [<ffffffff81071a13>] ? mark_held_locks+0xbe/0xea [<ffffffff81455e73>] ? _raw_spin_unlock_irqrestore+0x3a/0x50 [<ffffffff81071b53>] ? trace_hardirqs_on_caller+0x114/0x170 [<ffffffff81071bbc>] ? trace_hardirqs_on+0xd/0xf [<ffffffff81454bed>] schedule+0x60/0x62 [<ffffffff81190c23>] start_this_handle+0x317/0x4ea [<ffffffff810498a8>] ? __init_waitqueue_head+0x4d/0x4d [<ffffffff81190fcc>] jbd2__journal_start+0xb3/0x12e [<ffffffff81176606>] __ext4_journal_start_sb+0xb2/0xc6 [<ffffffff8115f137>] ext4_da_write_begin+0x109/0x233 [<ffffffff810a964d>] generic_file_buffered_write+0x11a/0x264 [<ffffffff811032cf>] ? __mark_inode_dirty+0x2d/0x1ee [<ffffffff810ab1ab>] __generic_file_aio_write+0x2a5/0x2d5 [<ffffffff810ab24a>] generic_file_aio_write+0x6f/0xd0 [<ffffffff81159a2c>] ext4_file_write+0x38c/0x3c4 [<ffffffff810e0915>] do_sync_write+0x91/0xd1 [<ffffffffa00a17f0>] cachefiles_write_page+0x26f/0x310 [cachefiles] [<ffffffffa01d470b>] fscache_write_op+0x21e/0x37a [fscache] [<ffffffff81455eb2>] ? _raw_spin_unlock_irq+0x29/0x3e [<ffffffffa01d2479>] fscache_op_work_func+0x78/0xd7 [fscache] [<ffffffff8104455a>] process_one_work+0x232/0x3a8 [<ffffffff810444ff>] ? process_one_work+0x1d7/0x3a8 [<ffffffff81044ee0>] worker_thread+0x214/0x303 [<ffffffff81044ccc>] ? manage_workers+0x245/0x245 [<ffffffff81048fbc>] kthread+0xd0/0xd8 [<ffffffff81455eb2>] ? _raw_spin_unlock_irq+0x29/0x3e [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55 [<ffffffff81456aac>] ret_from_fork+0x7c/0xb0 [<ffffffff81048eec>] ? __init_kthread_worker+0x55/0x55 4 locks held by kworker/u:2/24437: #0: (fscache_operation){.+.+.+}, at: [<ffffffff810444ff>] process_one_work+0x1d7/0x3a8 #1: ((&op->work)){+.+.+.}, at: [<ffffffff810444ff>] process_one_work+0x1d7/0x3a8 #2: (sb_writers#14){.+.+.+}, at: [<ffffffff810ab22c>] generic_file_aio_write+0x51/0xd0 #3: (&sb->s_type->i_mutex_key#19){+.+.+.}, at: [<ffffffff810ab236>] generic_file_aio_write+0x5b/0x fscache already tries to cancel pending stores, but it can't cancel a write for which I/O is already in progress. An alternative would be to accept writing garbage to the cache under extreme circumstances and to kill the afflicted cache object if we have to do this. However, we really need to know how strapped the allocator is before deciding to do that. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Milosz Tanski <milosz@adfin.com> Acked-by: Jeff Layton <jlayton@redhat.com>	2013-06-19 14:16:47 +01:00
J. Bruce Fields	6bd5e82b09	CacheFiles: name i_mutex lock class explicitly Just some cleanup. (And note the caller of this function may, for example, call vfs_unlink on a child, so the "1" (I_MUTEX_PARENT) really was what was intended here.) Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Milosz Tanski <milosz@adfin.com> Acked-by: Jeff Layton <jlayton@redhat.com>	2013-06-19 14:16:47 +01:00
Sebastian Andrzej Siewior	ee8be57bc3	fs/fscache: remove spin_lock() from the condition in while() The spinlock() within the condition in while() will cause a compile error if it is not a function. This is not a problem on mainline but it does not look pretty and there is no reason to do it that way. That patch writes it a little differently and avoids the double condition. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Milosz Tanski <milosz@adfin.com> Acked-by: Jeff Layton <jlayton@redhat.com>	2013-06-19 14:16:47 +01:00
David Howells	cb65537ee1	Add wait_on_atomic_t() and wake_up_atomic_t() Add wait_on_atomic_t() and wake_up_atomic_t() to indicate became-zero events on atomic_t types. This uses the bit-wake waitqueue table. The key is set to a value outside of the number of bits in a long so that wait_on_bit() won't be woken up accidentally. What I'm using this for is: in a following patch I add a counter to struct fscache_cookie to count the number of outstanding operations that need access to netfs data. The way this works is: (1) When a cookie is allocated, the counter is initialised to 1. (2) When an operation wants to access netfs data, it calls atomic_inc_unless() to increment the counter before it does so. If it was 0, then the counter isn't incremented, the operation isn't permitted to access the netfs data (which might by this point no longer exist) and the operation aborts in some appropriate manner. (3) When an operation finishes with the netfs data, it decrements the counter and if it reaches 0, calls wake_up_atomic_t() on it - the assumption being that it was the last blocker. (4) When a cookie is released, the counter is decremented and the releaser uses wait_on_atomic_t() to wait for the counter to become 0 - which should indicate no one is using the netfs data any longer. The netfs data can then be destroyed. There are some alternatives that I have thought of and that have been suggested by Tejun Heo: (A) Using wait_on_bit() to wait on a bit in the counter. This doesn't work because if that bit happens to be 0 then the wait won't happen - even if the counter is non-zero. (B) Using wait_on_bit() to wait on a flag elsewhere which is cleared when the counter reaches 0. Such a flag would be redundant and would add complexity. (C) Adding a waitqueue to fscache_cookie - this would expand that struct by several words for an event that happens just once in each cookie's lifetime. Further, cookies are generally per-file so there are likely to be a lot of them. (D) Similar to (C), but add a pointer to a waitqueue in the cookie instead of a waitqueue. This would add single word per cookie and so would be less of an expansion - but still an expansion. (E) Adding a static waitqueue to the fscache module. Generally this would be fine, but under certain circumstances many cookies will all get added at the same time (eg. NFS umount, cache withdrawal) thereby presenting scaling issues. Note that the wait may be significant as disk I/O may be in progress. So, I think reusing the wait_on_bit() waitqueue set is reasonable. I don't make much use of the waitqueue I need on a per-cookie basis, but sometimes I have a huge flood of the cookies to deal with. I also don't want to add a whole new set of global waitqueue tables specifically for the dec-to-0 event if I can reuse the bit tables. Signed-off-by: David Howells <dhowells@redhat.com> Tested-By: Milosz Tanski <milosz@adfin.com> Acked-by: Jeff Layton <jlayton@redhat.com>	2013-05-15 13:50:38 +01:00
Linus Torvalds	b973425cbb	Fixed regressions (two stability regressions and a performance regression) introduced during the 3.10-rc1 merge window. Also included is a bug fix relating to allocating blocks after resizing an ext3 file system when using the ext4 file system driver. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJRkZBlAAoJENNvdpvBGATwLYQP/iWOBs2z93WG23cqkgqvL8o6 ZyeJdgy9dkFCArVDX5SSnGkJXZ3iqIKi5HoTKTJKfytgMzgiDAZcLsIHVv6NczwR UGhjgS3HEdV5tJ46E6JnpB3NLSb+rAdc5kCdlsbzU46CP+JjFiYEhxVpK7ELuM/G yctChbIH9FY+1OwxHccacBOaJU2ELhnH6B/8Ry/6gM2H0vfKeTNOdocOHdxvbNqg ooGjytMfVopMQEfVG8aXtTfy341NFJH5fAYEahCcXxeO9ta6Unj9yOu5JV2wVrTt 39+DBsquGX6AVQsc9IxJ6YAN6ldwWN7l3huE9/AI0o/alwGsfVi5M+M/d1MMjDqf Fgl2EzzBpZQeKKY9UXNi4LLgYdBiILMgKDOGoRKhRb8ynSSf/JX43+24FvidEi3o o//J4aR+oSZfaovGAeikqyF1cumayhoNN8MINRN8igIinBiC4GjBFEl/Kl/1eAY/ lREGcsmYPXOkVPpM72waRYlP4GwNdOg4QSEY0SGljpwluO+dYtKQjHXcv/s/xL5v j3GemzYVyjx4zaq1g3PxGfuD6VKFHr0T6jvzd6cHu17lnPlw9fwznHbEm9BEcXDY gbGx9u+a2ZTqDwYVALbeoRpf9Zz6DUCse3ts4N3rbkXUQQiBYo7tybfVopIMAukb CexvidDE/ryJrJJFBwoK =6cRD -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 update from Ted Ts'o: "Fixed regressions (two stability regressions and a performance regression) introduced during the 3.10-rc1 merge window. Also included is a bug fix relating to allocating blocks after resizing an ext3 file system when using the ext4 file system driver" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: jbd,jbd2: fix oops in jbd2_journal_put_journal_head() ext4: revert "ext4: use io_end for multiple bios" ext4: limit group search loop for non-extent files ext4: fix fio regression	2013-05-14 09:30:54 -07:00
Linus Torvalds	7fb30d2b60	Merge branch 'for-3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fix from Tejun Heo: "A fix for a workqueue_congested() regression that broke fscache" * 'for-3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: workqueue_congested() shouldn't translate WORK_CPU_UNBOUND into node number	2013-05-14 09:06:29 -07:00
Linus Torvalds	a2c7a54fcc	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Benjamin Herrenschmidt: "This is mostly bug fixes (some of them regressions, some of them I deemed worth merging now) along with some patches from Li Zhong hooking up the new context tracking stuff (for the new full NO_HZ)" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits) powerpc: Set show_unhandled_signals to 1 by default powerpc/perf: Fix setting of "to" addresses for BHRB powerpc/pmu: Fix order of interpreting BHRB target entries powerpc/perf: Move BHRB code into CONFIG_PPC64 region powerpc: select HAVE_CONTEXT_TRACKING for pSeries powerpc: Use the new schedule_user API on userspace preemption powerpc: Exit user context on notify resume powerpc: Exception hooks for context tracking subsystem powerpc: Syscall hooks for context tracking subsystem powerpc/booke64: Fix kernel hangs at kernel_dbg_exc powerpc: Fix irq_set_affinity() return values powerpc: Provide __bswapdi2 powerpc/powernv: Fix starting of secondary CPUs on OPALv2 and v3 powerpc/powernv: Detect OPAL v3 API version powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning again powerpc: Make CONFIG_RTAS_PROC depend on CONFIG_PROC_FS powerpc: Bring all threads online prior to migration/hibernation powerpc/rtas_flash: Fix validate_flash buffer overflow issue powerpc/kexec: Fix kexec when using VMX optimised memcpy powerpc: Fix build errors STRICT_MM_TYPECHECKS ...	2013-05-14 07:43:11 -07:00
Benjamin Herrenschmidt	e34166ad63	powerpc: Set show_unhandled_signals to 1 by default Just like other architectures Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 18:01:04 +10:00
Michael Neuling	691231846c	powerpc/perf: Fix setting of "to" addresses for BHRB Currently we only set the "to" address in the branch stack when the CPU explicitly gives us a value. Unfortunately it only does this for XL form branches (eg blr, bctr, bctar) and not I and B form branches (eg b, bc). Fortunately if we read the instruction from memory we can extract the offset of a branch and calculate the target address. This adds a function power_pmu_bhrb_to() to calculate the target/to address of the corresponding I and B form branches. It handles branches in both user and kernel spaces. It also plumbs this into the perf brhb reading code. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 16:00:22 +10:00
Michael Neuling	506e70d132	powerpc/pmu: Fix order of interpreting BHRB target entries The current Branch History Rolling Buffer (BHRB) code misinterprets the order of entries in the hardware buffer. It assumes that a branch target address will be read _after_ its corresponding branch. In reality the branch target comes before (lower mfbhrb entry) it's corresponding branch. This is a rewrite of the code to take this into account. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 16:00:22 +10:00
Michael Neuling	d52f2dc40b	powerpc/perf: Move BHRB code into CONFIG_PPC64 region The new Branch History Rolling buffer (BHRB) code is only useful on 64bit processors, so move it into the #ifdef CONFIG_PPC64 region. This avoids code bloat on 32bit systems. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 16:00:21 +10:00
Li Zhong	a1797b2fd2	powerpc: select HAVE_CONTEXT_TRACKING for pSeries Start context tracking support from pSeries. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 16:00:21 +10:00
Li Zhong	5d1c574511	powerpc: Use the new schedule_user API on userspace preemption This patch corresponds to [PATCH] x86: Use the new schedule_user API on userspace preemption commit `0430499ce9` Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 16:00:20 +10:00
Li Zhong	106ed886ab	powerpc: Exit user context on notify resume This patch allows RCU usage in do_notify_resume, e.g. signal handling. It corresponds to [PATCH] x86: Exit RCU extended QS on notify resume commit `edf55fda35` Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 16:00:20 +10:00
Li Zhong	ba12eedee3	powerpc: Exception hooks for context tracking subsystem This is the exception hooks for context tracking subsystem, including data access, program check, single step, instruction breakpoint, machine check, alignment, fp unavailable, altivec assist, unknown exception, whose handlers might use RCU. This patch corresponds to [PATCH] x86: Exception hooks for userspace RCU extended QS commit `6ba3c97a38` But after the exception handling moved to generic code, and some changes in following two commits: `56dd9470d7` context_tracking: Move exception handling to generic code `6c1e0256fa` context_tracking: Restore correct previous context state on exception exit it is able for exception hooks to use the generic code above instead of a redundant arch implementation. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 16:00:19 +10:00
Li Zhong	22ecbe8dce	powerpc: Syscall hooks for context tracking subsystem This is the syscall slow path hooks for context tracking subsystem, corresponding to [PATCH] x86: Syscall hooks for userspace RCU extended QS commit `bf5a3c13b9` TIF_MEMDIE is moved to the second 16-bits (with value 17), as it seems there is no asm code using it. TIF_NOHZ is added to _TIF_SYCALL_T_OR_A, so it is better for it to be in the same 16 bits with others in the group, so in the asm code, andi. with this group could work. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 16:00:19 +10:00
Scott Wood	6cecf76b47	powerpc/booke64: Fix kernel hangs at kernel_dbg_exc MSR_DE is not cleared on entry to the kernel, and we don't clear it explicitly outside of debug code. If we have MSR_DE set in prime_debug_regs(), and the new thread has events enabled in DBCR0 (e.g. ICMP is set in thread->dbsr0, even though it was cleared in the real DBCR0 when the thread got scheduled out), we'll end up taking a debug exception in the kernel when DBCR0 is loaded. DSRR0 will not point to an exception vector, and the kernel ends up hanging at kernel_dbg_exc. Fix this by always clearing MSR_DE when we load new debug state. Another observed source of kernel_dbg_exc hangs is with the branch taken event. If this event is active, but we take a non-debug trap (e.g. a TLB miss or an asynchronous interrupt) before the next branch. We end up taking a branch-taken debug exception on the initial branch instruction of the exception vector, but because the debug exception is DBSR_BT rather than DBSR_IC we branch to kernel_dbg_exc before even checking the DSRR0 address. Fix this by checking for DBSR_BT as well as DBSR_IC, which is what 32-bit does and what the comments suggest was intended in the 64-bit code as well. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 16:00:19 +10:00
Alexander Gordeev	dcb615aef9	powerpc: Fix irq_set_affinity() return values Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 16:00:18 +10:00
David Woodhouse	ca9d7aea59	powerpc: Provide __bswapdi2 Some versions of GCC apparently expect this to be provided by libgcc. Updates from Mikey to fix 32 bit version and adding "r" to registers. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 16:00:17 +10:00
Benjamin Herrenschmidt	b2b48584df	powerpc/powernv: Fix starting of secondary CPUs on OPALv2 and v3 The current code fails to handle kexec on OPALv2. This fixes it and adds code to improve the situation on OPALv3 where we can query the CPU status from the firmware and decide what to do based on that. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 15:12:31 +10:00
Benjamin Herrenschmidt	75b93da43a	powerpc/powernv: Detect OPAL v3 API version Future firmwares will support that new version Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 15:10:02 +10:00
Li Zhong	af945cf4bf	powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning again Saw this warning again, and this time from the ret_from_fork path. It seems we could clear the back chain earlier in copy_thread(), which could cover both path, and also fix potential lockdep usage in schedule_tail(), or exception occurred before we clear the back chain. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 14:36:35 +10:00
Michael Ellerman	b80ec3dc8e	powerpc: Make CONFIG_RTAS_PROC depend on CONFIG_PROC_FS We are getting build errors with CONFIG_PROC_FS=n: arch/powerpc/kernel/rtas_flash.c In function 'rtas_flash_init': 745:33: error: unused variable 'f' [-Werror=unused-variable] But rtas_flash.c should not be built when CONFIG_PROC_FS=n, beacause all it does is provide a /proc interface to the RTAS flash routines. CONFIG_RTAS_FLASH already depends on CONFIG_RTAS_PROC, to indicate that it depends on the RTAS proc support, but CONFIG_RTAS_PROC does not depend on CONFIG_PROC_FS. So fix that. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 14:36:32 +10:00
Robert Jennings	120496ac2d	powerpc: Bring all threads online prior to migration/hibernation This patch brings online all threads which are present but not online prior to migration/hibernation. After migration/hibernation those threads are taken back offline. During migration/hibernation all online CPUs must call H_JOIN, this is required by the hypervisor. Without this patch, threads that are offline (H_CEDE'd) will not be woken to make the H_JOIN call and the OS will be deadlocked (all threads either JOIN'd or CEDE'd). Cc: <stable@kernel.org> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 14:36:29 +10:00
Vasant Hegde	a94a14720e	powerpc/rtas_flash: Fix validate_flash buffer overflow issue ibm,validate-flash-image RTAS call output buffer contains 150 - 200 bytes of data on latest system. Presently we have output buffer size as 64 bytes and we use sprintf to copy data from RTAS buffer to local buffer. This causes kernel oops (see below call trace). This patch increases local buffer size to 256 and also uses snprintf instead of sprintf to copy data from RTAS buffer. Kernel call trace : ------------------- Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=1024 NUMA pSeries Modules linked in: nfs fscache lockd auth_rpcgss nfs_acl sunrpc fuse loop dm_mod ipv6 ipv6_lib usb_storage ehea(X) sr_mod qlge ses cdrom enclosure st be2net sg ext3 jbd mbcache usbhid hid ohci_hcd ehci_hcd usbcore qla2xxx usb_common sd_mod crc_t10dif scsi_dh_hp_sw scsi_dh_rdac scsi_dh_alua scsi_dh_emc scsi_dh lpfc scsi_transport_fc scsi_tgt ipr(X) libata scsi_mod Supported: Yes NIP: 4520323031333130 LR: 4520323031333130 CTR: 0000000000000000 REGS: c0000001b91779b0 TRAP: 0400 Tainted: G X (3.0.13-0.27-ppc64) MSR: 8000000040009032 <EE,ME,IR,DR> CR: 44022488 XER: 20000018 TASK = c0000001bca1aba0[4736] 'cat' THREAD: c0000001b9174000 CPU: 36 GPR00: 4520323031333130 c0000001b9177c30 c000000000f87c98 000000000000009b GPR04: c0000001b9177c4a 000000000000000b 3520323031333130 2032303133313031 GPR08: 3133313031350a4d 000000000000009b 0000000000000000 c0000000003664a4 GPR12: 0000000022022448 c000000003ee6c00 0000000000000002 00000000100e8a90 GPR16: 00000000100cb9d8 0000000010093370 000000001001d310 0000000000000000 GPR20: 0000000000008000 00000000100fae60 000000000000005e 0000000000000000 GPR24: 0000000010129350 46573738302e3030 2046573738302e30 300a4d4720323031 GPR28: 333130313520554e 4b4e4f574e0a4d47 2032303133313031 3520323031333130 NIP [4520323031333130] 0x4520323031333130 LR [4520323031333130] 0x4520323031333130 Call Trace: [c0000001b9177c30] [4520323031333130] 0x4520323031333130 (unreliable) Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 14:36:26 +10:00
Anton Blanchard	79c66ce8f6	powerpc/kexec: Fix kexec when using VMX optimised memcpy commit `b3f271e86e` (powerpc: POWER7 optimised memcpy using VMX and enhanced prefetch) uses VMX when it is safe to do so (ie not in interrupt). It also looks at the task struct to decide if we have to save the current tasks' VMX state. kexec calls memcpy() at a point where the task struct may have been overwritten by the new kexec segments. If it has been overwritten then when memcpy -> enable_altivec looks up current->thread.regs->msr we get a cryptic oops or lockup. I also notice we aren't initialising thread_info->cpu, which means smp_processor_id is broken. Fix that too. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@vger.kernel.org> # 3.6+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 14:36:23 +10:00
Aneesh Kumar K.V	83d5e64b7e	powerpc: Fix build errors STRICT_MM_TYPECHECKS Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 14:36:20 +10:00
Aneesh Kumar K.V	613e60a661	powerpc/mm: Use the correct mask value when looking at pgtable address Our pgtable are 2sizeof(pte_t)PTRS_PER_PTE which is PTE_FRAG_SIZE. Instead of depending on frag size, mask with PMD_MASKED_BITS. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-05-14 14:36:17 +10:00
Linus Torvalds	674825d050	Xen ARM initialization fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJRkRGeAAoJEIlPj0hw4a6QVtQQAKFd2beNkM9cdqnuItZKCNng 6Hh99woohgKZe5WKmBbmESBh3L034fCHRNHKocz8XNS9r+/EHDbEtehAEO6GkbnR QtrJ0CFeh2ESOMzI5JIeLuzhL65KYP+pfJyhbNDB707Gtw3snlgLsPCDEoEVbXTS O/AfNchmKwUOOfuqibykaK1Cuj1C4qq2huVPEVseCy6KC6VSQk0kBBAbZ3IFWRJ0 71saW+iMBfQ8FTFyJmj74sQZs1+Ee1GZ4TKS3CCeIzsozsxbzdzul0mrdgcs6xQN F4ar+0weVAzsV3CGEP0fw9WFYGfGKvdG/39gE5ZrOh3DI4kx+Q79sW1CY3JVWnJn kBXg6zb2rFoL5sB+NOSq3lL/EDNeHl0nle/r/dh/3IJS18OMYnwmLlEXrGG1h1qt tdHf9bIrqwJ6fw4e5Vf8aTbdK8clwZKXpYDsHSbms50qPwIBNnC1byc8FWsKYXbW XYIlk8yVdv/HqKzBjusAXe+g9oa2280N8+eIJ/CaCEfi+XMLPWOevs7G7ZvV79CJ nQ1Q9scMVKtdLduGkChxURTLZ2fEkYSaharylveRgq81lHoLnBiyAl/GY74BWqyH auvqCkbouH4EjeoD1HP+habdcVsc4bbroFSf2uNOLjHhFxqXn70LPoSP4QVSDbYN J0cFnLv0/9/khSZane2z =Q3Ri -----END PGP SIGNATURE----- Merge tag 'fixes-for-3.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen Pull Xen/arm fixes from Stefano Stabellini: "This contains a couple of Xen on ARM initialization fixes and a patch to improve error handling" * tag 'fixes-for-3.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen: xen/arm: rename xen_secondary_init and run it on every online cpu xen/arm: do not handle VCPUOP_register_vcpu_info failures xen/arm: initialize pm functions later	2013-05-13 19:03:49 -07:00
Linus Torvalds	c83bb88589	Merge branch 'parisc-for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc update from Helge Deller: "The second round of parisc updates for 3.10 includes build fixes and enhancements to utilize irq stacks, fixes SMP races when updating PTE and TLB entries by proper locking and makes the search for the correct cross compiler more robust on Debian and Gentoo." * 'parisc-for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: make default cross compiler search more robust (v3) parisc: fix SMP races when updating PTE and TLB entries in entry.S parisc: implement irq stacks - part 2 (v2)	2013-05-13 16:49:59 -07:00
Linus Torvalds	dbbffe6898	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "Several small bug fixes all over: 1) be2net driver uses wrong payload length when submitting MAC list get requests to the chip. From Sathya Perla. 2) Fix mwifiex memory leak on driver unload, from Amitkumar Karwar. 3) Prevent random memory access in batman-adv, from Marek Lindner. 4) batman-adv doesn't check for pskb_trim_rcsum() errors, also from Marek Lindner. 5) Fix fec crashes on rapid link up/down, from Frank Li. 6) Fix inner protocol grovelling in GSO, from Pravin B Shelar. 7) Link event validation fix in qlcnic from Rajesh Borundia. 8) Not all FEC chips can support checksum offload, fix from Shawn Guo. 9) EXPORT_SYMBOL + inline doesn't make any sense, from Denis Efremov. 10) Fix race in passthru mode during device removal in macvlan, from Jiri Pirko. 11) Fix RCU hash table lookup socket state race in ipv6, leading to NULL pointer derefs, from Eric Dumazet. 12) Add several missing HAS_DMA kconfig dependencies, from Geert Uyttterhoeven. 13) Fix bogus PCI resource management in 3c59x driver, from Sergei Shtylyov. 14) Fix info leak in ipv6 GRE tunnel driver, from Amerigo Wang. 15) Fix device leak in ipv6 IPSEC policy layer, from Cong Wang. 16) DMA mapping leak fix in qlge from Thadeu Lima de Souza Cascardo. 17) Missing iounmap on probe failure in bna driver, from Wei Yongjun." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits) bna: add missing iounmap() on error in bnad_init() qlge: fix dma map leak when the last chunk is not allocated xfrm6: release dev before returning error ipv6,gre: do not leak info to user-space virtio_net: use default napi weight by default emac: Fix EMAC soft reset on 460EX/GT 3c59x: fix PCI resource management caif: CAIF_VIRTIO should depend on HAS_DMA net/ethernet: MACB should depend on HAS_DMA net/ethernet: ARM_AT91_ETHER should depend on HAS_DMA net/wireless: ATH9K should depend on HAS_DMA net/ethernet: STMMAC_ETH should depend on HAS_DMA net/ethernet: NET_CALXEDA_XGMAC should depend on HAS_DMA ipv6: do not clear pinet6 field macvlan: fix passthru mode race between dev removal and rx path ipv4: ip_output: remove inline marking of EXPORT_SYMBOL functions net/mlx4: Strengthen VLAN tags/priorities enforcement in VST mode net/mlx4_core: Add missing report on VST and spoof-checking dev caps net: fec: enable hardware checksum only on imx6q-fec qlcnic: Fix validation of link event command. ...	2013-05-13 13:25:36 -07:00
Helge Deller	6880b0150a	parisc: make default cross compiler search more robust (v3) People/distros vary how they prefix the toolchain name for 64bit builds. Rather than enforce one convention over another, add a for loop which does a search for all the general prefixes. For 64bit builds, we now search for (in order): hppa64-unknown-linux-gnu hppa64-linux-gnu hppa64-linux For 32bit builds, we look for: hppa-unknown-linux-gnu hppa-linux-gnu hppa-linux hppa2.0-unknown-linux-gnu hppa2.0-linux-gnu hppa2.0-linux hppa1.1-unknown-linux-gnu hppa1.1-linux-gnu hppa1.1-linux This patch was initiated by Mike Frysinger, with feedback from Jeroen Roovers, John David Anglin and Helge Deller. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Jeroen Roovers <jer@gentoo.org> Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>	2013-05-13 22:11:57 +02:00
Wei Yongjun	ba21fc696d	bna: add missing iounmap() on error in bnad_init() Add the missing iounmap() before return from bnad_init() in the error handling case. Introduced by commit `01b54b1451` (bna: tx rx cleanup fix). Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-13 12:54:38 -07:00
Thadeu Lima de Souza Cascardo	ef38079401	qlge: fix dma map leak when the last chunk is not allocated qlge allocates chunks from a page that it maps and unmaps that page when the last chunk is released. When the driver is unloaded or the card is removed, all chunks are released and the page is unmapped for the last chunk. However, when the last chunk of a page is not allocated and the device is removed, that page is not unmapped. In fact, its last reference is not put and there's also a page leak. This bug prevents a device from being properly hotplugged. When the DMA API debug option is enabled, the following messages show the pending DMA allocation after we remove the driver. This patch fixes the bug by unmapping and putting the page from the ring if its last chunk has not been allocated. pci 0005:98:00.0: DMA-API: device driver has pending DMA allocations while released from device [count=1] One of leaked entries details: [device address=0x0000000060a80000] [size=65536 bytes] [mapped with DMA_FROM_DEVICE] [mapped as page] ------------[ cut here ]------------ WARNING: at lib/dma-debug.c:746 Modules linked in: qlge(-) rpadlpar_io rpaphp pci_hotplug fuse [last unloaded: qlge] NIP: c0000000003fc3ec LR: c0000000003fc3e8 CTR: c00000000054de60 REGS: c0000003ee9c74e0 TRAP: 0700 Tainted: G O (3.7.2) MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 28002424 XER: 00000001 SOFTE: 1 CFAR: c0000000007a39c8 TASK = c0000003ee8d5c90[8406] 'rmmod' THREAD: c0000003ee9c4000 CPU: 31 GPR00: c0000000003fc3e8 c0000003ee9c7760 c000000000c789f8 00000000000000ee GPR04: 0000000000000000 00000000000000ef 0000000000004000 0000000000010000 GPR08: 00000000000000be c000000000b22088 c000000000c4c218 00000000007c0000 GPR12: 0000000028002422 c00000000ff26c80 0000000000000000 000001001b0f1b40 GPR16: 00000000100cb9d8 0000000010093088 c000000000cdf910 0000000000000001 GPR20: 0000000000000000 c000000000dbfc00 0000000000000000 c000000000dbfb80 GPR24: c0000003fafc9d80 0000000000000001 000000000001ff80 c0000003f38f7888 GPR28: c000000000ddfc00 0000000000000400 c000000000bd7790 c000000000ddfb80 NIP [c0000000003fc3ec] .dma_debug_device_change+0x22c/0x2b0 LR [c0000000003fc3e8] .dma_debug_device_change+0x228/0x2b0 Call Trace: [c0000003ee9c7760] [c0000000003fc3e8] .dma_debug_device_change+0x228/0x2b0 (unreliable) [c0000003ee9c7840] [c00000000079a098] .notifier_call_chain+0x78/0xf0 [c0000003ee9c78e0] [c0000000000acc20] .__blocking_notifier_call_chain+0x70/0xb0 [c0000003ee9c7990] [c0000000004a9580] .__device_release_driver+0x100/0x140 [c0000003ee9c7a20] [c0000000004a9708] .driver_detach+0x148/0x150 [c0000003ee9c7ac0] [c0000000004a8144] .bus_remove_driver+0xc4/0x150 [c0000003ee9c7b60] [c0000000004aa58c] .driver_unregister+0x8c/0xe0 [c0000003ee9c7bf0] [c0000000004090b4] .pci_unregister_driver+0x34/0xf0 [c0000003ee9c7ca0] [d000000002231194] .qlge_exit+0x1c/0x34 [qlge] [c0000003ee9c7d20] [c0000000000e36d8] .SyS_delete_module+0x1e8/0x290 [c0000003ee9c7e30] [c0000000000098d4] syscall_exit+0x0/0x94 Instruction dump: 7f26cb78 e818003a e87e81a0 e8f80028 e9180030 796b1f24 78001f24 7d6a5a14 7d2a002a e94b0020 483a7595 60000000 <0fe00000> 2fb80000 40de0048 80120050 ---[ end trace 4294f9abdb01031d ]--- Mapped at: [<d000000002222f54>] .ql_update_lbq+0x384/0x580 [qlge] [<d000000002227bd0>] .ql_clean_inbound_rx_ring+0x300/0xc60 [qlge] [<d0000000022288cc>] .ql_napi_poll_msix+0x39c/0x5a0 [qlge] [<c0000000006b3c50>] .net_rx_action+0x170/0x300 [<c000000000081840>] .__do_softirq+0x170/0x300 Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Acked-by: Jitendra Kalsaria <Jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-05-13 12:54:38 -07:00
Stefano Stabellini	3cc8e40e8f	xen/arm: rename xen_secondary_init and run it on every online cpu Rename xen_secondary_init to xen_percpu_init. Run xen_percpu_init on the each online cpu, reuse the current on_each_cpu call. Merge xen_percpu_enable_events into xen_percpu_init. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2013-05-13 16:14:25 +00:00
Stefano Stabellini	d7266d7894	xen/arm: do not handle VCPUOP_register_vcpu_info failures We expect VCPUOP_register_vcpu_info to succeed, do not try to handle failures. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>	2013-05-13 16:14:22 +00:00
Stefano Stabellini	1aa3d8d993	xen/arm: initialize pm functions later If we are running in dom0, we have to wait for the arch specific code to complete the initialization in order for us to successfully reset the power_off and pm_restart functions. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2013-05-13 16:13:04 +00:00
Linus Torvalds	1f638766ff	spi: Updates for v3.10 A few driver specific fixes plus improved error handling in the generic DT GPIO chipselect handling - not exciting but useful. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJRkPjpAAoJELSic+t+oim955wP/RD+X7Y7VtDZ2NGwuMndRM0j F63XfXcqSTeQCq5KvOkOW0FUO8lY7MkPIMUE95PZ2jcQW4lXYo5vrhiW8vpn6+fk ALOqqx9fZfcf/zmi6FRDMqyqor8GSnHEXsif+4ZoP6dwyaKiqICk51Fk5ZEF4+ZQ 98I6aEhxqz3I4J4KfBq5YVpdqpxaAG/USQE8IvLyAjKzJ8rLnfS4J7Uy9vfcSDW3 /7KZoDqMysOXRBN5sNSiOjBg0xD3hnKlnCsQtlktK+zE2b4vJQZBLRv2gyjSrfV0 GIcwlAandP9SD+rjKY1a8oLCj4P4h+u7TdVhKf3zdTOG5/IguvmiSMfgwoHFkhxn GmFD3yXNXp+pFtJHYGznQh3/+36uSKZmWLrYZtxS5X/AEYSl4wqczHoMgF9JvYiQ I+H59+lG3J4TfjSZ6sREcDPu231Sy2PMfyRgC+EFgi/W3F1IYC/apgGqA39MCo9H hP9oAxCNqcrzmdLsJ+Y3rM47wMGdTzYxcpkkEKZAAE04K5yU3+t/GZnqSXZv9bQO gxJHX9nPRgghWsMbg83RdrUuWARrV2i+Sd4sojFEztzC5nV1oDMOBotkqSAdwQ71 h69lWxsOqDQwJhwvFc43TbWmMVWV8BjNthGbcjEM5+n0oaCPjJ4WfKITJvZol1xv kSmwaZ7514E7Le+otttD =FUCL -----END PGP SIGNATURE----- Merge tag 'spi-v3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi updates from Mark Brown: "A few driver specific fixes plus improved error handling in the generic DT GPIO chipselect handling - not exciting but useful." * tag 'spi-v3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi/spi-atmel: BUG: fix doesn' support 16 bits transfers using PIO spi/davinci: fix module build error spi: Return error from of_spi_register_master on bad "cs-gpios" property spi: Initialize cs_gpio and cs_gpios with -ENOENT spi/atmel: fix speed_hz check in atmel_spi_transfer()	2013-05-13 08:12:18 -07:00
Linus Torvalds	fea0f9ff56	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Just a few straggling fixes I hoovered up, and an intel fixes pull from Daniel which fixes some regressions, and some mgag200 fixes from Matrox." * 'drm-next' of git://people.freedesktop.org/~airlied/linux: drm/mgag200: Fix framebuffer base address programming drm/mgag200: Convert counter delays to jiffies drm/mgag200: Fix writes into MGA1064_PIX_CLK_CTL register drm/mgag200: Don't change unrelated registers during modeset drm: Only print a debug message when the polled connector has changed drm: Make the HPD status updates debug logs more readable drm: Use names of ioctls in debug traces drm: Remove pointless '-' characters from drm_fb_helper documentation drm: Add kernel-doc for drm_fb_helper_funcs->initial_config drm: refactor call to request_module drm: Don't prune modes loudly when a connector is disconnected drm: Add missing break in the command line mode parsing code drm/i915: clear the stolen fb before resuming Revert "drm/i915: Calculate correct stolen size for GEN7+" drm/i915: hsw: fix link training for eDP on port-A Revert "drm/i915: revert eDP bpp clamping code changes" drm: don't check modeset locks in panic handler drm/i915: Fix pipe enabled mask for pipe C in WM calculations drm/mm: fix dump table BUG drm/i915: Always normalize return timeout for wait_timeout_ioctl	2013-05-13 07:59:59 -07:00
Linus Torvalds	aef2ea912e	Missing license tag and some fallout from the lguest pagetable rework. Thanks, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJRjzaJAAoJENkgDmzRrbjx5pEP/jzSZlYi56g3ixr7Up15k5xa sXpKiUnG6ubfFYdOkwIgE4vq5Ftsv6QUOfSgnLsiqlAJce3N2vSnXkBkpEwMnuoI heXXCfwyihyBu5H1OdOhe+dt7EVQnUBn3kziezKHuPfbakFz9ItAWhjDjSnD2G63 rb/b4Wq61h6mQgg7jMVqGfXfgXCFceniGnAZPGD6TdG6Xovfh1LRu8B3F6wrGysC zAcf08gDQN02NY+O7wMD2MpeQSoo2OTfCg9jBV5Bc82qltGag2Ju9Z5rc7Sgng/k RrGpQG3A/9JmbU4zWQQhmu3CsaaxOCHQqEw7iXF4QtqjJdoh0tsc0HAZ4DtznDkr 5v52SaiJeoZ43Sf6XXvboZAv+bGmwUIAbes6brgNRCCuv7AQit3x69qJabdVc05U XV61o/CmsEN+S9bOqW6UqGkncVrzTjHBZdzDWBASgaVFm+nB4DTLOex/JxiXROun ljr7F++H4/o0y0ouiAy4mQKsgEvug6Z3KZwIMxnpxpq25Ns3UwSnWIigWLDaSRww /WJ7vZqgKbZy9Pje4UB1UJY6gItNtD+kY1W5d6KYf73E+OfGRzTXxCjZR7QzvHGZ gzPxBvVqri56s5N7F7ij2l70bXqiPjkqE5+YbAmjD6DndFWz7+S2kPqZIUWF9M+L lXnKrIXjSXt5QD8lEko4 =1jEm -----END PGP SIGNATURE----- Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio/lguest fixes from Rusty Russell: "Missing license tag and some fallout from the lguest pagetable rework" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: lguest: clear cached last cpu when guest_set_pgd() called. Add missing module license tag to vring helpers.	2013-05-13 07:59:08 -07:00
Mark Brown	88b0357dde	Merge remote-tracking branch 'spi/fix/grant' into spi-linus	2013-05-13 18:27:18 +04:00
Mark Brown	0faa3146f1	Merge remote-tracking branch 'spi/fix/atmel' into spi-linus	2013-05-13 18:27:16 +04:00
Jan Kara	e2555fde41	jbd,jbd2: fix oops in jbd2_journal_put_journal_head() Commit `ae4647fb` (jbd2: reduce journal_head size) introduced a regression where we occasionally hit panic in jbd2_journal_put_journal_head() because of wrong b_jcount. The bug is caused by gcc making 64-bit access to 32-bit bitfield and thus clobbering b_jcount. At least for now, those 8 bytes saved in struct journal_head are not worth the trouble with gcc bitfield handling so revert that part of the patch. Reported-by: EUNBONG SONG <eunb.song@samsung.com> Reported-by: Tony Luck <tony.luck@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-05-13 09:45:01 -04:00
Christopher Harvey	9f1d036648	drm/mgag200: Fix framebuffer base address programming Higher bits of the base address of framebuffers weren't being programmed properly. This caused framebuffers that didn't happen to be allocated at a low enough address to not be displayed properly. Signed-off-by: Christopher Harvey <charvey@matrox.com> Signed-off-by: Mathieu Larouche <mathieu.larouche@matrox.com> Acked-by: Julia Lemire <jlemire@matrox.com> Tested-by: Julia Lemire <jlemire@matrox.com> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-05-13 12:17:32 +10:00

1 2 3 4 5 ...

375593 Commits