linux/lib
Hugh Dickins e2bdb933ab radix_tree: take radix_tree_path off stack
Down, down in the deepest depths of GFP_NOIO page reclaim, we have
shrink_page_list() calling __remove_mapping() calling __delete_from_
swap_cache() or __delete_from_page_cache().

You would not expect those to need much stack, but in fact they call
radix_tree_delete(): which declares a 192-byte radix_tree_path array on
its stack (to record the node,offsets it visits when descending, in case
it needs to ascend to update them).  And if any tag is still set [1],
that calls radix_tree_tag_clear(), which declares a further such
192-byte radix_tree_path array on the stack.  (At least we have
interrupts disabled here, so won't then be pushing registers too.)

That was probably a good choice when most users were 32-bit (array of
half the size), and adding fields to radix_tree_node would have bloated
it unnecessarily.  But nowadays many are 64-bit, and each
radix_tree_node contains a struct rcu_head, which is only used when
freeing; whereas the radix_tree_path info is only used for updating the
tree (deleting, clearing tags or setting tags if tagged) when a lock
must be held, of no interest when accessing the tree locklessly.

So add a parent pointer to the radix_tree_node, in union with the
rcu_head, and remove all uses of the radix_tree_path.  There would be
space in that union to save the offset when descending as before (we can
argue that a lock must already be held to exclude other users), but
recalculating it when ascending is both easy (a constant shift and a
constant mask) and uncommon, so it seems better just to do that.

Two little optimizations: no need to decrement height when descending,
adjusting shift is enough; and once radix_tree_tag_if_tagged() has set
tag on a node and its ancestors, it need not ascend from that node
again.

perf on the radix tree test harness reports radix_tree_insert() as 2%
slower (now having to set parent), but radix_tree_delete() 24% faster.
Surely that's an exaggeration from rtth's artificially low map shift 3,
but forcing it back to 6 still rates radix_tree_delete() 8% faster.

[1] Can a pagecache tag (dirty, writeback or towrite) actually still be
set at the time of radix_tree_delete()? Perhaps not if the filesystem is
well-behaved.  But although I've not tracked any stack overflow down to
this cause, I have observed a curious case in which a dirty tag is set
and left set on tmpfs: page migration's migrate_page_copy() happens to
use __set_page_dirty_nobuffers() to set PageDirty on the newpage, and
that sets PAGECACHE_TAG_DIRTY as a side-effect - harmless to a
filesystem which doesn't use tags, except for this stack depth issue.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nai Xia <nai.xia@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-12 20:13:12 -08:00
..
lzo lib: add support for LZO-compressed kernels 2010-01-11 09:34:04 -08:00
mpi mpi/mpi-mpow: NULL dereference on allocation failure 2011-12-08 00:09:23 +11:00
raid6 md: Add in export.h for files using EXPORT_SYMBOL 2011-10-31 19:31:19 -04:00
reed_solomon
xz XZ: Fix incorrect XZ_BUF_ERROR 2011-09-21 13:39:59 -07:00
zlib_deflate zlib: slim down zlib_deflate() workspace when possible 2011-03-22 17:44:17 -07:00
zlib_inflate inflate_fast: sout is already a short so ptr arith was off by one. 2010-03-12 15:52:44 -08:00
.gitignore
argv_split.c tree-wide: convert open calls to remove spaces to skip_spaces() lib function 2009-12-15 08:53:32 -08:00
atomic64_test.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
atomic64.c lib: atomic64: Change the type of local lock to raw_spinlock_t 2011-09-14 13:14:11 +02:00
audit.c audit: support the "standard" <asm-generic/unistd.h> 2011-05-04 14:41:28 -04:00
average.c lib: Improve EWMA efficiency by using bitshifts 2010-12-06 15:58:43 -05:00
bcd.c
bch.c lib: add shared BCH ECC library 2011-03-11 14:25:50 +00:00
bitmap.c lib/bitmap.c: quiet sparse noise about address space 2011-10-31 17:30:56 -07:00
bitrev.c
bsearch.c lib: Add generic binary search function to the kernel. 2011-05-19 16:55:27 +09:30
btree.c btree: export btree_get_prev() so modules can use btree_for_each 2012-01-10 16:30:49 -08:00
bug.c modules: Fix module_bug_list list corruption race 2010-10-05 11:29:27 -07:00
bust_spinlocks.c oops handling: ensure that any oops is flushed to the mtdoops console 2009-01-06 15:59:11 -08:00
check_signature.c
checksum.c lib/checksum.c: optimize do_csum a bit 2011-07-07 04:52:24 -07:00
cmdline.c
cordic.c Docs: wording: functions -> algorithm 2011-10-29 21:20:22 +02:00
cpu_rmap.c lib: cpu_rmap: CPU affinity reverse-mapping 2011-01-24 14:51:56 -08:00
cpu-notifier-error-inject.c fault-injection: add CPU notifier error injection module 2010-05-27 09:12:48 -07:00
cpumask.c cpumask: alloc_cpumask_var() use NUMA_NO_NODE 2011-07-26 16:49:44 -07:00
crc7.c
crc8.c lib: crc8: add new library module providing crc8 algorithm 2011-06-03 15:01:06 -04:00
crc16.c
crc32.c crc32: optimize inner loop 2012-01-10 16:30:51 -08:00
crc32defs.h
crc-ccitt.c
crc-itu-t.c
crc-t10dif.c
ctype.c ctype: constify read-only _ctype string 2009-12-15 08:53:32 -08:00
debug_locks.c Revert "debug_locks: set oops_in_progress if we will log messages." 2010-11-29 15:18:28 -08:00
debugobjects.c debugobjects: Extend to assert that an object is initialized 2011-11-23 18:49:22 +01:00
dec_and_lock.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
decompress_bunzip2.c decompress_bunzip2: remove invalid vi modeline 2011-12-06 10:00:05 +01:00
decompress_inflate.c decompressors: check input size in decompress_inflate.c 2011-01-13 08:03:25 -08:00
decompress_unlzma.c treewide: Fix comment and string typo 'bufer' 2011-12-06 09:53:40 +01:00
decompress_unlzo.c Decompressors: fix callback-to-callback mode in decompress_unlzo.c 2011-01-13 08:03:24 -08:00
decompress_unxz.c Fix common misspellings 2011-03-31 11:26:23 -03:00
decompress.c decompressors: add boot-time XZ support 2011-01-13 08:03:25 -08:00
devres.c Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci 2012-01-11 18:50:26 -08:00
digsig.c crypto: digital signature verification support 2011-11-09 12:10:37 +02:00
div64.c div64_u64(): improve precision on 32bit platforms 2010-10-26 16:52:19 -07:00
dma-debug.c Fix comparison using wrong pointer variable in dma debug code 2011-11-21 11:35:37 +01:00
dump_stack.c
dynamic_debug.c Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core 2011-10-25 12:13:59 +02:00
dynamic_queue_limits.c dql: Dynamic queue limits 2011-11-29 12:46:19 -05:00
extable.c module: trim exception table on init free. 2009-06-12 21:47:04 +09:30
fault-inject.c switch debugfs to umode_t 2012-01-03 22:54:56 -05:00
find_last_bit.c bitops: add #ifndef for each of find bitops 2011-05-26 17:12:38 -07:00
find_next_bit.c arch: remove CONFIG_GENERIC_FIND_{NEXT_BIT,BIT_LE,LAST_BIT} 2011-05-26 17:12:38 -07:00
flex_array.c flex_array: avoid divisions when accessing elements 2011-05-26 17:12:33 -07:00
gcd.c lib: add lib/gcd.c 2009-06-18 13:04:05 -07:00
gen_crc32table.c crc32: major optimization 2010-05-25 08:07:06 -07:00
genalloc.c lib, Make gen_pool memory allocator lockless 2011-08-03 11:15:57 -04:00
halfmd4.c
hexdump.c lib: add error checking to hex2bin 2011-09-20 23:24:44 -04:00
hweight.c x86: Add optimized popcnt variants 2010-04-06 15:52:11 -07:00
idr.c ida: make ida_simple_get/put() IRQ safe 2011-11-02 16:07:00 -07:00
inflate.c MN10300: Don't try and #include <linux/slab.h> in lib/inflate.c from bootloader 2010-08-12 09:51:35 -07:00
int_sqrt.c
iomap_copy.c
iomap.c lib: add GENERIC_PCI_IOMAP 2011-11-28 21:12:42 +02:00
iommu-helper.c iommu: inline iommu_num_pages 2010-08-09 20:45:05 -07:00
ioremap.c ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support 2011-01-12 03:06:19 -05:00
irq_regs.c
is_single_threaded.c kernel: is_current_single_threaded: don't use ->mmap_sem 2009-07-17 09:11:31 +10:00
kasprintf.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
Kconfig Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security 2012-01-10 21:51:23 -08:00
Kconfig.debug lib/Kconfig.debug: fix help message for DEFAULT_HUNG_TASK_TIMEOUT 2011-10-31 17:30:51 -07:00
Kconfig.kgdb mips,kgdb: kdb low level trap catch and stack trace 2010-05-20 21:04:26 -05:00
Kconfig.kmemcheck kmemcheck: depend on HAVE_ARCH_KMEMCHECK 2009-07-01 22:28:44 +02:00
klist.c driver core: Remove completion from struct klist_node 2009-01-06 10:44:30 -08:00
kobject_uevent.c driver-core: skip uevent generation when nobody is listening 2011-12-09 16:23:50 -08:00
kobject.c kobject: remove kset_find_obj_hinted() 2011-12-21 15:13:54 -08:00
kstrtox.c lib/kstrtox: common code between kstrto*() and simple_strto*() functions 2011-10-31 17:30:56 -07:00
kstrtox.h lib/kstrtox: common code between kstrto*() and simple_strto*() functions 2011-10-31 17:30:56 -07:00
lcm.c lib/lcm.c: quiet sparse noise 2011-07-25 20:57:15 -07:00
libcrc32c.c libcrc32c: Fix "crc32c undefined" compilation error 2008-12-25 11:01:42 +11:00
list_debug.c Expand CONFIG_DEBUG_LIST to several other list operations 2011-02-18 11:32:28 -08:00
list_sort.c lib/list_sort: test: check element addresses 2010-10-26 16:52:19 -07:00
llist.c llist: Remove cpu_relax() usage in cmpxchg loops 2011-10-04 12:44:03 +02:00
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c rcu: Fix unpaired rcu_irq_enter() from locking selftests 2011-05-26 09:42:19 -07:00
lru_cache.c lru_cache: use correct type in sizeof for allocation 2011-05-25 08:39:52 -07:00
Makefile Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security 2012-01-10 21:51:23 -08:00
md5.c crypto: Move md5_transform to lib/md5.c 2011-08-06 18:32:45 -07:00
nlattr.c netlink: validate NLA_MSECS length 2011-11-04 17:47:34 -04:00
parser.c Fix common misspellings 2011-03-31 11:26:23 -03:00
pci_iomap.c lib: add GENERIC_PCI_IOMAP 2011-11-28 21:12:42 +02:00
percpu_counter.c lib/percpu_counter.c: enclose hotplug only variables in hotplug ifdef 2011-10-31 17:30:56 -07:00
plist.c plist: Remove the need to supply locks to plist heads 2011-07-08 14:02:53 +02:00
prio_heap.c lib: fix sparse shadowed variable warning 2009-01-06 15:59:11 -08:00
prio_tree.c
proportions.c locking, lib/proportions: Annotate prop_local_percpu::lock as raw 2011-09-13 11:11:50 +02:00
radix-tree.c radix_tree: take radix_tree_path off stack 2012-01-12 20:13:12 -08:00
random32.c Merge branch 'master' into for-next 2010-06-16 18:08:13 +02:00
ratelimit.c locking, printk: Annotate logbuf_lock as raw 2011-09-13 11:11:54 +02:00
rational.c lib/rational.c needs module.h 2010-01-11 09:34:05 -08:00
rbtree.c Export the augmented rbtree helper functions 2011-01-28 12:16:59 +10:00
reciprocal_div.c sch_red: Adaptative RED AQM 2011-12-08 19:52:43 -05:00
rwsem-spinlock.c locking, rwsem: Annotate inner lock as raw 2011-09-13 11:11:59 +02:00
rwsem.c locking, rwsem: Annotate inner lock as raw 2011-09-13 11:11:59 +02:00
scatterlist.c scatterlist: prevent invalid free when alloc fails 2010-08-30 19:55:09 +02:00
sha1.c lib/sha1.c: quiet sparse noise about symbol not declared 2011-09-13 16:09:41 -07:00
show_mem.c arch, mm: filter disallowed nodes from arch specific show_mem functions 2011-05-25 08:39:03 -07:00
smp_processor_id.c sched: Wrap scheduler p->cpus_allowed access 2011-10-06 12:46:56 +02:00
sort.c generic swap(): lib/sort.c: rename swap to swap_func 2009-01-08 08:31:14 -08:00
spinlock_debug.c lib/spinlock_debug.c: print owner on spinlock lockup 2011-10-31 17:30:56 -07:00
string_helpers.c
string.c lib/string.c: fix strim() semantics for strings that have only blanks 2011-10-31 17:30:56 -07:00
swiotlb.c swiotlb: Expose swiotlb_nr_tlb function to modules 2011-12-06 10:38:03 +00:00
syscall.c
test-kstrtox.c kstrtox: fix compile warnings in test 2011-04-14 16:06:54 -07:00
textsearch.c textsearch: doc - fix spelling in lib/textsearch.c. 2011-01-24 23:33:30 -08:00
timerqueue.c Fix common misspellings 2011-03-31 11:26:23 -03:00
ts_bm.c
ts_fsm.c
ts_kmp.c
uuid.c Unified UUID/GUID definition 2010-05-19 22:40:47 -04:00
vsprintf.c net: introduce and use netdev_features_t for device features sets 2011-11-16 17:43:10 -05:00