forked from Minki/linux
e2848a0efe
Most pagecache (and some other) radix tree insertions have the great opportunity to preallocate a few nodes with relaxed gfp flags. But the preallocation is squandered when it comes time to allocate a node, we default to first attempting a GFP_ATOMIC allocation -- that doesn't normally fail, but it can eat into atomic memory reserves that we don't need to be using. Another upshot of this is that it removes the sometimes highly contended zone->lock from underneath tree_lock. Pagecache insertions are always performed with a radix tree preload, and after this change, such a situation will never fall back to kmem_cache_alloc within radix_tree_node_alloc. David Miller reports seeing this allocation fail on a highly threaded sparc64 system: [527319.459981] dd: page allocation failure. order:0, mode:0x20 [527319.460403] Call Trace: [527319.460568] [00000000004b71e0] __slab_alloc+0x1b0/0x6a8 [527319.460636] [00000000004b7bbc] kmem_cache_alloc+0x4c/0xa8 [527319.460698] [000000000055309c] radix_tree_node_alloc+0x20/0x90 [527319.460763] [0000000000553238] radix_tree_insert+0x12c/0x260 [527319.460830] [0000000000495cd0] add_to_page_cache+0x38/0xb0 [527319.460893] [00000000004e4794] mpage_readpages+0x6c/0x134 [527319.460955] [000000000049c7fc] __do_page_cache_readahead+0x170/0x280 [527319.461028] [000000000049cc88] ondemand_readahead+0x208/0x214 [527319.461094] [0000000000496018] do_generic_mapping_read+0xe8/0x428 [527319.461152] [0000000000497948] generic_file_aio_read+0x108/0x170 [527319.461217] [00000000004badac] do_sync_read+0x88/0xd0 [527319.461292] [00000000004bb5cc] vfs_read+0x78/0x10c [527319.461361] [00000000004bb920] sys_read+0x34/0x60 [527319.461424] [0000000000406294] linux_sparc_syscall32+0x3c/0x40 The calltrace is significant: __do_page_cache_readahead allocates a number of pages with GFP_KERNEL, and hence it should have reclaimed sufficient memory to satisfy GFP_ATOMIC allocations. However after the list of pages goes to mpage_readpages, there can be significant intervals (including disk IO) before all the pages are inserted into the radix-tree. So the reserves can easily be depleted at that point. The patch is confirmed to fix the problem. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
---|---|---|
.. | ||
lzo | ||
reed_solomon | ||
zlib_deflate | ||
zlib_inflate | ||
.gitignore | ||
argv_split.c | ||
audit.c | ||
bitmap.c | ||
bitrev.c | ||
bug.c | ||
bust_spinlocks.c | ||
check_signature.c | ||
cmdline.c | ||
cpumask.c | ||
crc7.c | ||
crc16.c | ||
crc32.c | ||
crc32defs.h | ||
crc-ccitt.c | ||
crc-itu-t.c | ||
ctype.c | ||
debug_locks.c | ||
dec_and_lock.c | ||
devres.c | ||
div64.c | ||
dump_stack.c | ||
extable.c | ||
fault-inject.c | ||
find_next_bit.c | ||
gen_crc32table.c | ||
genalloc.c | ||
halfmd4.c | ||
hexdump.c | ||
hweight.c | ||
idr.c | ||
inflate.c | ||
int_sqrt.c | ||
iomap_copy.c | ||
iomap.c | ||
iommu-helper.c | ||
ioremap.c | ||
irq_regs.c | ||
kasprintf.c | ||
Kconfig | ||
Kconfig.debug | ||
kernel_lock.c | ||
klist.c | ||
kobject_uevent.c | ||
kobject.c | ||
kref.c | ||
libcrc32c.c | ||
list_debug.c | ||
locking-selftest-hardirq.h | ||
locking-selftest-mutex.h | ||
locking-selftest-rlock-hardirq.h | ||
locking-selftest-rlock-softirq.h | ||
locking-selftest-rlock.h | ||
locking-selftest-rsem.h | ||
locking-selftest-softirq.h | ||
locking-selftest-spin-hardirq.h | ||
locking-selftest-spin-softirq.h | ||
locking-selftest-spin.h | ||
locking-selftest-wlock-hardirq.h | ||
locking-selftest-wlock-softirq.h | ||
locking-selftest-wlock.h | ||
locking-selftest-wsem.h | ||
locking-selftest.c | ||
Makefile | ||
parser.c | ||
pcounter.c | ||
percpu_counter.c | ||
plist.c | ||
prio_heap.c | ||
prio_tree.c | ||
proportions.c | ||
radix-tree.c | ||
random32.c | ||
rbtree.c | ||
reciprocal_div.c | ||
rwsem-spinlock.c | ||
rwsem.c | ||
scatterlist.c | ||
semaphore-sleepers.c | ||
sha1.c | ||
smp_processor_id.c | ||
sort.c | ||
spinlock_debug.c | ||
string.c | ||
swiotlb.c | ||
textsearch.c | ||
ts_bm.c | ||
ts_fsm.c | ||
ts_kmp.c | ||
vsprintf.c |