syzbot is reporting a lockdep warning in fill_pool() because the allocation
from debugobjects is using GFP_ATOMIC, which is (__GFP_HIGH | __GFP_KSWAPD_RECLAIM)
and therefore tries to wake up kswapd, which acquires kswapd_wait::lock.
Since fill_pool() might be called with arbitrary locks held, fill_pool()
should not assume that acquiring kswapd_wait::lock is safe.
Use __GFP_HIGH instead and remove __GFP_NORETRY as it is pointless for
!__GFP_DIRECT_RECLAIM allocation.
Fixes: 3ac7fe5a4a ("infrastructure to debug (dynamic) objects")
Reported-by: syzbot <syzbot+fe0c72f0ccbb93786380@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/6577e1fa-b6ee-f2be-2414-a2b51b1c5e30@I-love.SAKURA.ne.jp
Closes: https://syzkaller.appspot.com/bug?extid=fe0c72f0ccbb93786380
There is an explicit wait-type violation in debug_object_fill_pool()
for PREEMPT_RT=n kernels which allows them to more easily fill the
object pool and reduce the chance of allocation failures.
Lockdep's wait-type checks are designed to check the PREEMPT_RT
locking rules even for PREEMPT_RT=n kernels and object to this, so
create a lockdep annotation to allow this to stand.
Specifically, create a 'lock' type that overrides the inner wait-type
while it is held -- allowing one to temporarily raise it, such that
the violation is hidden.
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Qi Zheng <zhengqi.arch@bytedance.com>
Link: https://lkml.kernel.org/r/20230429100614.GA1489784@hirez.programming.kicks-ass.net
The recent fix to ensure atomicity of lookup and allocation inadvertently
broke the pool refill mechanism.
Prior to that change debug_objects_activate() and debug_objecs_assert_init()
invoked debug_objecs_init() to set up the tracking object for statically
initialized objects. That's not longer the case and debug_objecs_init() is
now the only place which does pool refills.
Depending on the number of statically initialized objects this can be
enough to actually deplete the pool, which was observed by Ido via a
debugobjects OOM warning.
Restore the old behaviour by adding explicit refill opportunities to
debug_objects_activate() and debug_objecs_assert_init().
Fixes: 63a759694e ("debugobject: Prevent init race with static objects")
Reported-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/871qk05a9d.ffs@tglx
Statically initialized objects are usually not initialized via the init()
function of the subsystem. They are special cased and the subsystem
provides a function to validate whether an object which is not yet tracked
by debugobjects is statically initialized. This means the object is started
to be tracked on first use, e.g. activation.
This works perfectly fine, unless there are two concurrent operations on
that object. Schspa decoded the problem:
T0 T1
debug_object_assert_init(addr)
lock_hash_bucket()
obj = lookup_object(addr);
if (!obj) {
unlock_hash_bucket();
- > preemption
lock_subsytem_object(addr);
activate_object(addr)
lock_hash_bucket();
obj = lookup_object(addr);
if (!obj) {
unlock_hash_bucket();
if (is_static_object(addr))
init_and_track(addr);
lock_hash_bucket();
obj = lookup_object(addr);
obj->state = ACTIVATED;
unlock_hash_bucket();
subsys function modifies content of addr,
so static object detection does
not longer work.
unlock_subsytem_object(addr);
if (is_static_object(addr)) <- Fails
debugobject emits a warning and invokes the fixup function which
reinitializes the already active object in the worst case.
This race exists forever, but was never observed until mod_timer() got a
debug_object_assert_init() added which is outside of the timer base lock
held section right at the beginning of the function to cover the lockless
early exit points too.
Rework the code so that the lookup, the static object check and the
tracking object association happens atomically under the hash bucket
lock. This prevents the issue completely as all callers are serialized on
the hash bucket lock and therefore cannot observe inconsistent state.
Fixes: 3ac7fe5a4a ("infrastructure to debug (dynamic) objects")
Reported-by: syzbot+5093ba19745994288b53@syzkaller.appspotmail.com
Debugged-by: Schspa Shi <schspa@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://syzkaller.appspot.com/bug?id=22c8a5938eab640d1c6bcc0e3dc7be519d878462
Link: https://lore.kernel.org/lkml/20230303161906.831686-1-schspa@gmail.com
Link: https://lore.kernel.org/r/87zg7dzgao.ffs@tglx
- A ptrace API cleanup series from Sergey Shtylyov
- Fixes and cleanups for kexec from ye xingchen
- nilfs2 updates from Ryusuke Konishi
- squashfs feature work from Xiaoming Ni: permit configuration of the
filesystem's compression concurrency from the mount command line.
- A series from Akinobu Mita which addresses bound checking errors when
writing to debugfs files.
- A series from Yang Yingliang to address rapido memory leaks
- A series from Zheng Yejian to address possible overflow errors in
encode_comp_t().
- And a whole shower of singleton patches all over the place.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY5efRgAKCRDdBJ7gKXxA
jgvdAP0al6oFDtaSsshIdNhrzcMwfjt6PfVxxHdLmNhF1hX2dwD/SVluS1bPSP7y
0sZp7Ustu3YTb8aFkMl96Y9m9mY1Nwg=
=ga5B
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- A ptrace API cleanup series from Sergey Shtylyov
- Fixes and cleanups for kexec from ye xingchen
- nilfs2 updates from Ryusuke Konishi
- squashfs feature work from Xiaoming Ni: permit configuration of the
filesystem's compression concurrency from the mount command line
- A series from Akinobu Mita which addresses bound checking errors when
writing to debugfs files
- A series from Yang Yingliang to address rapidio memory leaks
- A series from Zheng Yejian to address possible overflow errors in
encode_comp_t()
- And a whole shower of singleton patches all over the place
* tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (79 commits)
ipc: fix memory leak in init_mqueue_fs()
hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount
rapidio: devices: fix missing put_device in mport_cdev_open
kcov: fix spelling typos in comments
hfs: Fix OOB Write in hfs_asc2mac
hfs: fix OOB Read in __hfs_brec_find
relay: fix type mismatch when allocating memory in relay_create_buf()
ocfs2: always read both high and low parts of dinode link count
io-mapping: move some code within the include guarded section
kernel: kcsan: kcsan_test: build without structleak plugin
mailmap: update email for Iskren Chernev
eventfd: change int to __u64 in eventfd_signal() ifndef CONFIG_EVENTFD
rapidio: fix possible UAF when kfifo_alloc() fails
relay: use strscpy() is more robust and safer
cpumask: limit visibility of FORCE_NR_CPUS
acct: fix potential integer overflow in encode_comp_t()
acct: fix accuracy loss for input value of encode_comp_t()
linux/init.h: include <linux/build_bug.h> and <linux/stringify.h>
rapidio: rio: fix possible name leak in rio_register_mport()
rapidio: fix possible name leaks when rio_add_device() fails
...
Delayed kobject debugging (CONFIG_DEBUG_KOBJECT_RELEASE) prints the kobject
pointer that's being released in kobject_release() before scheduling a
randomly delayed work to do the actual release work.
If the caller of kobject_put() frees the kobject upon return then this will
typically emit a debugobject warning about freeing an active timer.
Usually the release function is the function that does the kfree() of the
struct containing the kobject.
For example the following print is seen
kobject: 'queue' (ffff888114236190): kobject_release, parent 0000000000000000 (delayed 1000)
------------[ cut here ]------------
ODEBUG: free active (active state 0) object type: timer_list hint: kobject_delayed_cleanup+0x0/0x390
but the kobject printk cannot be matched with the debug object printk
because it could be any number of kobjects that was released around that
time. The random delay for the work doesn't help either.
Print the address of the object being tracked to help to figure out which
kobject is the problem here. Note that this does not use %px here to match
the other %p usage in debugobject debugging. Due to %p usage it is required
to disable pointer hashing to correlate the two pointer printks.
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20220519202201.2348343-1-swboyd@chromium.org
1. Var debug_objects_allocated tracks valid kmem_cache_alloc calls, so
track it in debug_objects_replace_static_objects. Do similar things in
object_cpu_offline.
2. In debug_objects_mem_init, there is no need to call function
cpuhp_setup_state_nocalls when debug_objects_enabled = 0 (out of
memory).
Link: https://lkml.kernel.org/r/20220611130634.99741-1-wuchi.zero@gmail.com
Fixes: 634d61f45d ("debugobjects: Percpu pool lookahead freeing/allocation")
Fixes: c4b73aabd0 ("debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done")
Signed-off-by: wuchi <wuchi.zero@gmail.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
On PREEMPT_RT enabled kernels it is not possible to refill the object pool
from atomic context (preemption or interrupts disabled) as the allocator
might acquire 'sleeping' spinlocks.
Guard the invocation of fill_pool() accordingly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/87sfzehdnl.ffs@tglx
If a CPU is offlined the debug objects per CPU pool is not cleaned up. If
the CPU is never onlined again then the objects in the pool are wasted.
Add a CPU hotplug callback which is invoked after the CPU is dead to free
the pool.
[ tglx: Massaged changelog and added comment about remote access safety ]
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20200908062709.11441-1-qiang.zhang@windriver.com
The debugobject core could be slightly harder to corrupt if the
debug_obj_descr would be a pointer to const memory.
Depending on the architecture, const data structures are placed into
read-only memory and thus are harder to corrupt or hijack.
This descriptor is used to fix up stuff like timers and workqueues when
core kernel data structures are busted, so moving the descriptors to
read-only memory will make debugobjects more resilient to something going
wrong and then corrupting the function pointers inside struct
debug_obj_descr.
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200815004027.2046113-2-swboyd@chromium.org
The counters obj_pool_free, and obj_nr_tofree, and the flag obj_freeing are
read locklessly outside the pool_lock critical sections. If read with plain
accesses, this would result in data races.
This is addressed as follows:
* reads outside critical sections become READ_ONCE()s (pairing with
WRITE_ONCE()s added);
* writes become WRITE_ONCE()s (pairing with READ_ONCE()s added); since
writes happen inside critical sections, only the write and not the read
of RMWs needs to be atomic, thus WRITE_ONCE(var, var +/- X) is
sufficient.
The data races were reported by KCSAN:
BUG: KCSAN: data-race in __free_object / fill_pool
write to 0xffffffff8beb04f8 of 4 bytes by interrupt on cpu 1:
__free_object+0x1ee/0x8e0 lib/debugobjects.c:404
__debug_check_no_obj_freed+0x199/0x330 lib/debugobjects.c:969
debug_check_no_obj_freed+0x3c/0x44 lib/debugobjects.c:994
slab_free_hook mm/slub.c:1422 [inline]
read to 0xffffffff8beb04f8 of 4 bytes by task 1 on cpu 2:
fill_pool+0x3d/0x520 lib/debugobjects.c:135
__debug_object_init+0x3c/0x810 lib/debugobjects.c:536
debug_object_init lib/debugobjects.c:591 [inline]
debug_object_activate+0x228/0x320 lib/debugobjects.c:677
debug_rcu_head_queue kernel/rcu/rcu.h:176 [inline]
BUG: KCSAN: data-race in __debug_object_init / fill_pool
read to 0xffffffff8beb04f8 of 4 bytes by task 10 on cpu 6:
fill_pool+0x3d/0x520 lib/debugobjects.c:135
__debug_object_init+0x3c/0x810 lib/debugobjects.c:536
debug_object_init_on_stack+0x39/0x50 lib/debugobjects.c:606
init_timer_on_stack_key kernel/time/timer.c:742 [inline]
write to 0xffffffff8beb04f8 of 4 bytes by task 1 on cpu 3:
alloc_object lib/debugobjects.c:258 [inline]
__debug_object_init+0x717/0x810 lib/debugobjects.c:544
debug_object_init lib/debugobjects.c:591 [inline]
debug_object_activate+0x228/0x320 lib/debugobjects.c:677
debug_rcu_head_queue kernel/rcu/rcu.h:176 [inline]
BUG: KCSAN: data-race in free_obj_work / free_object
read to 0xffffffff9140c190 of 4 bytes by task 10 on cpu 6:
free_object+0x4b/0xd0 lib/debugobjects.c:426
debug_object_free+0x190/0x210 lib/debugobjects.c:824
destroy_timer_on_stack kernel/time/timer.c:749 [inline]
write to 0xffffffff9140c190 of 4 bytes by task 93 on cpu 1:
free_obj_work+0x24f/0x480 lib/debugobjects.c:313
process_one_work+0x454/0x8d0 kernel/workqueue.c:2264
worker_thread+0x9a/0x780 kernel/workqueue.c:2410
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200116185529.11026-1-elver@google.com
The db->lock is a raw spinlock and so the lock hold time is supposed
to be short. This will not be the case when printk() is being involved
in some of the critical sections. In order to avoid the long hold time,
in case some messages need to be printed, the debug_object_is_on_stack()
and debug_print_object() calls are now moved out of those critical
sections.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-6-longman@redhat.com
After a system bootup and 3 parallel kernel builds, a partial output
of the debug objects stats file was:
pool_free :5101
pool_pcp_free :4181
pool_min_free :220
pool_used :104172
pool_max_used :171920
on_free_list :0
objs_allocated:39268280
objs_freed :39160031
More than 39 millions debug objects had since been allocated and then
freed. The pool_max_used, however, was only about 172k. So this is a
lot of extra overhead in freeing and allocating objects from slabs. It
may also causes the slabs to be more fragmented and harder to reclaim.
Make the freeing of excess debug objects less aggressive by freeing them at
a maximum frequency of 10Hz and about 1k objects at each round of freeing.
With that change applied, the partial output of the debug objects stats
file after similar actions became:
pool_free :5901
pool_pcp_free :3742
pool_min_free :1022
pool_used :104805
pool_max_used :168081
on_free_list :0
objs_allocated:5796864
objs_freed :5687182
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-5-longman@redhat.com
In fill_pool(), the pool_lock is acquired and then released once per debug
object. If many objects are to be filled, the constant lock and unlock
operations are extra overhead.
To reduce the overhead, batch them up and do an allocation of 4 objects per
lock/unlock sequence.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-4-longman@redhat.com
Most workloads will allocate a bunch of memory objects, work on them
and then freeing all or most of them. So just having a percpu free pool
may not reduce the pool_lock contention significantly if large number
of objects are being used.
To help those situations, we are now doing lookahead allocation and
freeing of the debug objects into and out of the percpu free pool. This
will hopefully reduce the number of times the pool_lock needs to be
taken and hence its contention level.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-3-longman@redhat.com
When a multi-threaded workload does a lot of small memory object
allocations and deallocations, it may cause the allocation and freeing of
many debug objects. This will make the global pool_lock a bottleneck in the
performance of the workload. Since interrupts are disabled when acquiring
the pool_lock, it may even cause hard lockups to happen.
To reduce contention of the global pool_lock, add a percpu debug object
free pool that can be used to buffer some of the debug object allocation
and freeing requests without acquiring the pool_lock. Each CPU will now
have a percpu free pool that can hold up to a maximum of 64 debug
objects. Allocation and freeing requests will go to the percpu free pool
first. If that fails, the pool_lock will be taken and the global free pool
will be used.
The presence or absence of obj_cache is used as a marker to see if the
percpu cache should be used.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-2-longman@redhat.com
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Qian Cai <cai@gmx.us>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Waiman Long <longman@redhat.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190612153513.GA21082@kroah.com
The current value of the early boot static pool size, 1024 is not big
enough for systems with large number of CPUs with timer or/and workqueue
objects selected. As the results, systems have 60+ CPUs with both timer
and workqueue objects enabled could trigger "ODEBUG: Out of memory.
ODEBUG disabled".
Some debug objects are allocated during the early boot. Enabling some
options like timers or workqueue objects may increase the size required
significantly with large number of CPUs. For example,
CONFIG_DEBUG_OBJECTS_TIMERS:
No. CPUs x 2 (worker pool) objects:
start_kernel
workqueue_init_early
init_worker_pool
init_timer_key
debug_object_init
plus No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
sched_init
hrtick_rq_init
hrtimer_init
CONFIG_DEBUG_OBJECTS_WORK:
No. CPUs objects:
vmalloc_init
__init_work
plus No. CPUs x 6 (workqueue) objects:
workqueue_init_early
alloc_workqueue
__alloc_workqueue_key
alloc_and_link_pwqs
init_pwq
Also, plus No. CPUs objects:
perf_event_init
__init_srcu_struct
init_srcu_struct_fields
init_srcu_struct_nodes
__init_work
However, none of the things are actually used or required before
debug_objects_mem_init() is invoked, so just move the call right before
vmalloc_init().
According to tglx, "the reason why the call is at this place in
start_kernel() is historical. It's because back in the days when
debugobjects were added the memory allocator was enabled way later than
today."
Link: http://lkml.kernel.org/r/20181126102407.1836-1-cai@gmx.us
Signed-off-by: Qian Cai <cai@gmx.us>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CONFIG_DEBUG_OBJECTS_RCU_HEAD does not play well with kmemleak due to
recursive calls.
fill_pool
kmemleak_ignore
make_black_object
put_object
__call_rcu (kernel/rcu/tree.c)
debug_rcu_head_queue
debug_object_activate
debug_object_init
fill_pool
kmemleak_ignore
make_black_object
...
So add SLAB_NOLEAKTRACE to kmem_cache_create() to not register newly
allocated debug objects at all.
Link: http://lkml.kernel.org/r/20181126165343.2339-1-cai@gmx.us
Signed-off-by: Qian Cai <cai@gmx.us>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kmem_cache_destroy() has a built in NULL pointer check, so the one at the
call can be removed.
Signed-off-by: Zhong Jiang <zhongjiang@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <longman@redhat.com>
Cc: <arnd@arndb.de>
Cc: <yang.shi@linux.alibaba.com>
Link: https://lkml.kernel.org/r/1533054298-35824-1-git-send-email-zhongjiang@huawei.com
While debugging an issue debugobject tracking warned about an annotation
issue of an object on stack. It turned out that the issue was due to the
object in concern being on a different stack which was due to another
issue.
Thomas suggested to print the pointers and the location of the stack for
the currently running task. This helped to figure out that the object was
on the wrong stack.
As this is general useful information for debugging similar issues, make
the error message more informative by printing the pointers.
[ tglx: Massaged changelog ]
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: kernel-team@android.com
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: astrachan@google.com
Link: https://lkml.kernel.org/r/20180723212531.202328-1-joel@joelfernandes.org
debug_objects_maxchecked is only updated in __debug_check_no_obj_freed(),
and only read in debug_objects_maxchecked, unfortunately both of these are
optional and depend on different Kconfig symbols.
When both CONFIG_DEBUG_OBJECTS_FREE and CONFIG_DEBUG_FS are disabled this
warning is emitted:
lib/debugobjects.c:56:14: error: 'debug_objects_maxchecked' defined but not used [-Werror=unused-variable]
Rather than trying to add more complex #ifdef protections, mark the
variable as __maybe_unused so it can be silently dropped when usused.
Fixes: bd9dcd0465 ("debugobjects: Export max loops counter")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20180313131857.158876-1-arnd@arndb.de
The removal of the batched object freeing has caused the debug_objects_freed
to become read-only, and the reading is inside an ifdef, so gcc warns that it
is completely unused without CONFIG_DEBUG_FS:
lib/debugobjects.c:71:14: error: 'debug_objects_freed' defined but not used [-Werror=unused-variable]
Assuming we are still interested in this number, this adds back code to
keep track of the freed objects.
Fixes: 636e1970fd ("debugobjects: Use global free list in free_object()")
Suggested-by: Waiman Long <longman@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20180222155335.1647466-1-arnd@arndb.de
__debug_check_no_obj_freed() iterates over the to be freed memory region in
chunks and iterates over the corresponding hash bucket list for each
chunk. This can accumulate to hundred thousands of checked objects. In the
worst case this can trigger the soft lockup detector:
NMI watchdog: BUG: soft lockup - CPU#15 stuck for 22s!
CPU: 15 PID: 110342 Comm: stress-ng-getde
Call Trace:
[<ffffffff8141177e>] debug_check_no_obj_freed+0x13e/0x220
[<ffffffff811f8751>] __free_pages_ok+0x1f1/0x5c0
[<ffffffff811fa785>] __free_pages+0x25/0x40
[<ffffffff812638db>] __free_slab+0x19b/0x270
[<ffffffff812639e9>] discard_slab+0x39/0x50
[<ffffffff812679f7>] __slab_free+0x207/0x270
[<ffffffff81269966>] ___cache_free+0xa6/0xb0
[<ffffffff8126c267>] qlist_free_all+0x47/0x80
[<ffffffff8126c5a9>] quarantine_reduce+0x159/0x190
[<ffffffff8126b3bf>] kasan_kmalloc+0xaf/0xc0
[<ffffffff8126b8a2>] kasan_slab_alloc+0x12/0x20
[<ffffffff81265e8a>] kmem_cache_alloc+0xfa/0x360
[<ffffffff812abc8f>] ? getname_flags+0x4f/0x1f0
[<ffffffff812abc8f>] getname_flags+0x4f/0x1f0
[<ffffffff812abe42>] getname+0x12/0x20
[<ffffffff81298da9>] do_sys_open+0xf9/0x210
[<ffffffff81298ede>] SyS_open+0x1e/0x20
[<ffffffff817d6e01>] entry_SYSCALL_64_fastpath+0x1f/0xc2
The code path might be called in either atomic or non-atomic context, but
in_atomic() can't tell if the current context is atomic or not on a
PREEMPT=n kernel, so cond_resched() can't be used to prevent the
softlockup.
Utilize the global free list to shorten the loop execution time.
[ tglx: Massaged changelog ]
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/1517872708-24207-5-git-send-email-yang.shi@linux.alibaba.com
The newly added global free list allows to avoid lengthy pool_list
iterations in free_obj_work() by putting objects either into the pool list
when the fill level of the pool is below the maximum or by putting them on
the global free list immediately.
As the pool is now guaranteed to never exceed the maximum fill level this
allows to remove the batch removal from pool list in free_obj_work().
Split free_object() into two parts, so the actual queueing function can be
reused without invoking schedule_work() on every invocation.
[ tglx: Remove the batch removal from pool list and massage changelog ]
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/1517872708-24207-4-git-send-email-yang.shi@linux.alibaba.com
free_object() adds objects to the pool list and schedules work when the
pool list is larger than the pool size. The worker handles the actual
kfree() of the object by iterating the pool list until the pool size is
below the maximum pool size again.
To iterate the pool list, pool_lock has to be held and the objects which
should be freed() need to be put into temporary storage so pool_lock can be
dropped for the actual kmem_cache_free() invocation. That's a pointless and
expensive exercise if there is a large number of objects to free.
In such a case its better to evaulate the fill level of the pool in
free_objects() and queue the object to free either in the pool list or if
it's full on a separate global free list.
The worker can then do the following simpler operation:
- Move objects back from the global free list to the pool list if the
pool list is not longer full.
- Remove the remaining objects in a single list move operation from the
global free list and do the kmem_cache_free() operation lockless from
the temporary list head.
In fill_pool() the global free list is checked as well to avoid real
allocations from the kmem cache.
Add the necessary list head and a counter for the number of objects on the
global free list and export that counter via sysfs:
max_chain :79
max_loops :8147
warnings :0
fixups :0
pool_free :1697
pool_min_free :346
pool_used :15356
pool_max_used :23933
on_free_list :39
objs_allocated:32617
objs_freed :16588
Nothing queues objects on the global free list yet. This happens in a
follow up change.
[ tglx: Simplified implementation and massaged changelog ]
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/1517872708-24207-3-git-send-email-yang.shi@linux.alibaba.com
__debug_check_no_obj_freed() can be an expensive operation depending on the
size of memory freed. It already exports the maximum chain walk length via
debugfs, but this only records the maximum of a single memory chunk.
Though there is no information about the total number of objects inspected
for a __debug_check_no_obj_freed() operation, which might be significantly
larger when a huge memory region is freed.
Aggregate the number of objects inspected for a single invocation of
__debug_check_no_obj_freed() and export it via sysfs.
The resulting output of /sys/kernel/debug/debug_objects/stats looks like:
max_chain :121
max_checked :543267
warnings :0
fixups :0
pool_free :1764
pool_min_free :341
pool_used :86438
pool_max_used :268887
objs_allocated:6068254
objs_freed :5981076
[ tglx: Renamed the variable to max_checked and adjusted changelog ]
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/1517872708-24207-2-git-send-email-yang.shi@linux.alibaba.com
The allocated debug objects are either on the free list or in the
hashed bucket lists. So they won't get lost. However if both debug
objects and kmemleak are enabled and kmemleak scanning is done
while some of the debug objects are transitioning from one list to
the others, false negative reporting of memory leaks may happen for
those objects. For example,
[38687.275678] kmemleak: 12 new suspected memory leaks (see
/sys/kernel/debug/kmemleak)
unreferenced object 0xffff92e98aabeb68 (size 40):
comm "ksmtuned", pid 4344, jiffies 4298403600 (age 906.430s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 d0 bc db 92 e9 92 ff ff ................
01 00 00 00 00 00 00 00 38 36 8a 61 e9 92 ff ff ........86.a....
backtrace:
[<ffffffff8fa5378a>] kmemleak_alloc+0x4a/0xa0
[<ffffffff8f47c019>] kmem_cache_alloc+0xe9/0x320
[<ffffffff8f62ed96>] __debug_object_init+0x3e6/0x400
[<ffffffff8f62ef01>] debug_object_activate+0x131/0x210
[<ffffffff8f330d9f>] __call_rcu+0x3f/0x400
[<ffffffff8f33117d>] call_rcu_sched+0x1d/0x20
[<ffffffff8f4a183c>] put_object+0x2c/0x40
[<ffffffff8f4a188c>] __delete_object+0x3c/0x50
[<ffffffff8f4a18bd>] delete_object_full+0x1d/0x20
[<ffffffff8fa535c2>] kmemleak_free+0x32/0x80
[<ffffffff8f47af07>] kmem_cache_free+0x77/0x350
[<ffffffff8f453912>] unlink_anon_vmas+0x82/0x1e0
[<ffffffff8f440341>] free_pgtables+0xa1/0x110
[<ffffffff8f44af91>] exit_mmap+0xc1/0x170
[<ffffffff8f29db60>] mmput+0x80/0x150
[<ffffffff8f2a7609>] do_exit+0x2a9/0xd20
The references in the debug objects may also hide a real memory leak.
As there is no point in having kmemleak to track debug object
allocations, kmemleak checking is now disabled for debug objects.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1502718733-8527-1-git-send-email-longman@redhat.com
We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/task_stack.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
As suggested by Ingo, the debug_objects_alloc counter is now renamed to
debug_objects_allocated with minor twist in comment and debug output.
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1486503630-1501-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On a large SMP system with many CPUs, the global pool_lock may become
a performance bottleneck as all the CPUs that need to allocate or
free debug objects have to take the lock. That can sometimes cause
soft lockups like:
NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21]
...
RIP: 0010:[<ffffffff817c216b>] [<ffffffff817c216b>]
_raw_spin_unlock_irqrestore+0x3b/0x60
...
Call Trace:
[<ffffffff813f40d1>] free_object+0x81/0xb0
[<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220
[<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
[<ffffffff81284996>] ? file_free_rcu+0x36/0x60
[<ffffffff81251712>] kmem_cache_free+0xd2/0x380
[<ffffffff81284960>] ? fput+0x90/0x90
[<ffffffff81284996>] file_free_rcu+0x36/0x60
[<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550
[<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550
[<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50
[<ffffffff810c59d1>] kthread+0x101/0x120
[<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
[<ffffffff817c2d32>] ret_from_fork+0x22/0x50
To reduce the amount of contention on the pool_lock, the actual
kmem_cache_free() of the debug objects will be delayed if the pool_lock
is busy. This will temporarily increase the amount of free objects
available at the free pool when the system is busy. As a result,
the number of kmem_cache allocation and freeing is reduced.
To further reduce the lock operations free debug objects in batches of
four.
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Du Changbin" <changbin.du@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Stancek <jstancek@redhat.com>
Link: http://lkml.kernel.org/r/1483647425-4135-4-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
On a large SMP systems with hundreds of CPUs, the current thresholds
for allocating and freeing debug objects (256 and 1024 respectively)
may not work well. This can cause a lot of needless calls to
kmem_aloc() and kmem_free() on those systems.
To alleviate this thrashing problem, the object freeing threshold
is now increased to "1024 + # of CPUs * 32". Whereas the object
allocation threshold is increased to "256 + # of CPUs * 4". That
should make the debug objects subsystem scale better with the number
of CPUs available in the system.
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Du Changbin" <changbin.du@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Stancek <jstancek@redhat.com>
Link: http://lkml.kernel.org/r/1483647425-4135-3-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
New debugfs stat counters are added to track the numbers of
kmem_cache_alloc() and kmem_cache_free() function calls to get a
sense of how the internal debug objects cache management is performing.
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Du Changbin" <changbin.du@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Stancek <jstancek@redhat.com>
Link: http://lkml.kernel.org/r/1483647425-4135-2-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull workqueue updates from Tejun Heo:
"Mostly patches to initialize workqueue subsystem earlier and get rid
of keventd_up().
The patches were headed for the last merge cycle but got delayed due
to a bug found late minute, which is fixed now.
Also, to help debugging, destroy_workqueue() is more chatty now on a
sanity check failure."
* 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: move wq_numa_init() to workqueue_init()
workqueue: remove keventd_up()
debugobj, workqueue: remove keventd_up() usage
slab, workqueue: remove keventd_up() usage
power, workqueue: remove keventd_up() usage
tty, workqueue: remove keventd_up() usage
mce, workqueue: remove keventd_up() usage
workqueue: make workqueue available early during boot
workqueue: dump workqueue state on sanity check failures in destroy_workqueue()
Drivers, or other modules, that use a mixture of objects (especially
objects embedded within other objects) would like to take advantage of
the debugobjects facilities to help catch misuse. Currently, the
debugobjects interface is only available to builtin drivers and requires
a set of EXPORT_SYMBOL_GPL for use by modules.
I am using the debugobjects in i915.ko to try and catch some invalid
operations on embedded objects. The problem currently only presents
itself across module unload so forcing i915 to be builtin is not an
option.
Link: http://lkml.kernel.org/r/20161122143039.6433-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Du, Changbin" <changbin.du@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that workqueue can handle work item queueing from very early
during boot, there is no need to gate schedule_work() while
!keventd_up(). Remove it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
When activating a static object we need make sure that the object is
tracked in the object tracker. If it is a non-static object then the
activation is illegal.
In previous implementation, each subsystem need take care of this in
their fixup callbacks. Actually we can put it into debugobjects core.
Thus we can save duplicated code, and have *pure* fixup callbacks.
To achieve this, a new callback "is_static_object" is introduced to let
the type specific code decide whether a object is static or not. If
yes, we take it into object tracker, otherwise give warning and invoke
fixup callback.
This change has paassed debugobjects selftest, and I also do some test
with all debugobjects supports enabled.
At last, I have a concern about the fixups that can it change the object
which is in incorrect state on fixup? Because the 'addr' may not point
to any valid object if a non-static object is not tracked. Then Change
such object can overwrite someone's memory and cause unexpected
behaviour. For example, the timer_fixup_activate bind timer to function
stub_timer.
Link: http://lkml.kernel.org/r/1462576157-14539-1-git-send-email-changbin.du@intel.com
[changbin.du@intel.com: improve code comments where invoke the new is_static_object callback]
Link: http://lkml.kernel.org/r/1462777431-8171-1-git-send-email-changbin.du@intel.com
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Triplett <josh@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If debug_object_fixup() return non-zero when problem has been fixed.
But the code got it backwards, it taks 0 as fixup successfully. So fix
it.
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Triplett <josh@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I am going to introduce debugobjects infrastructure to USB subsystem.
But before this, I found the code of debugobjects could be improved.
This patchset will make fixup functions return bool type instead of int.
Because fixup only need report success or no. boolean is the 'real'
type.
This patch (of 7):
The object debugging infrastructure core provides some fixup callbacks
for the subsystem who use it. These callbacks are called from the debug
code whenever a problem in debug_object_init is detected. And
debugobjects core suppose them returns 1 when the fixup was successful,
otherwise 0. So the return type is boolean.
A bad thing is that debug_object_fixup use the return value for
arithmetic operation. It confused me that what is the reall return
type.
Reading over the whole code, I found some place do use the return value
incorrectly(see next patch). So why use bool type instead?
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Triplett <josh@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On my bigger s390 systems I always get "Out of memory.
ODEBUG disabled". Since the number of objects is needed at
compile time, we can not change the size dynamically before
the caches etc are available. Doubling the size seems to
do the trick. Since it is init data it will be freed anyway,
this should be ok.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: http://lkml.kernel.org/r/1453905478-13409-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Direct conversion of one KERN_DEBUG message without DEBUG definition
(suggested by Josh Triplett)
That message will now be disabled by default. (see
Documentation/CodingStyle Chapter 13)
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove unnecessary work pending test before calling schedule_work(). It
has been tested in queue_work_on() already. No functional changed.
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In order to better respond to things like duplicate invocations
of call_rcu(), RCU needs to see the status of a call to
debug_object_activate(). This would allow RCU to leak the callback in
order to avoid adding freelist-reuse mischief to the duplicate invoations.
This commit therefore makes debug_object_activate() return status,
zero for success and -EINVAL for failure.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There was a return missed in 1fda107d44 "debugobjects: Remove unused
return value from fill_pool()". It makes gcc complain:
lib/debugobjects.c: In function ‘fill_pool’:
lib/debugobjects.c:98:4: warning: ‘return’ with a value, in
function returning void [enabled by default]
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: http://lkml.kernel.org/r/20120418112810.GA2669@elgon.mountain
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>