linux/include
Kentaro Makita da3bbdd463 fix soft lock up at NFS mount via per-SB LRU-list of unused dentries
[Summary]

 Split LRU-list of unused dentries to one per superblock to avoid soft
 lock up during NFS mounts and remounting of any filesystem.

 Previously I posted here:
 http://lkml.org/lkml/2008/3/5/590

[Descriptions]

- background

  dentry_unused is a list of dentries which are not referenced.
  dentry_unused grows up when references on directories or files are
  released.  This list can be very long if there is huge free memory.

- the problem

  When shrink_dcache_sb() is called, it scans all dentry_unused linearly
  under spin_lock(), and if dentry->d_sb is differnt from given
  superblock, scan next dentry.  This scan costs very much if there are
  many entries, and very ineffective if there are many superblocks.

  IOW, When we need to shrink unused dentries on one dentry, but scans
  unused dentries on all superblocks in the system.  For example, we scan
  500 dentries to unmount a filesystem, but scans 1,000,000 or more unused
  dentries on other superblocks.

  In our case , At mounting NFS*, shrink_dcache_sb() is called to shrink
  unused dentries on NFS, but scans 100,000,000 unused dentries on
  superblocks in the system such as local ext3 filesystems.  I hear NFS
  mounting took 1 min on some system in use.

* : NFS uses virtual filesystem in rpc layer, so NFS is affected by
  this problem.

  100,000,000 is possible number on large systems.

  Per-superblock LRU of unused dentried can reduce the cost in
  reasonable manner.

- How to fix

  I found this problem is solved by David Chinner's "Per-superblock
  unused dentry LRU lists V3"(1), so I rebase it and add some fix to
  reclaim with fairness, which is in Andrew Morton's comments(2).

  1) http://lkml.org/lkml/2006/5/25/318
  2) http://lkml.org/lkml/2006/5/25/320

  Split LRU-list of unused dentries to each superblocks.  Then, NFS
  mounting will check dentries under a superblock instead of all.  But
  this spliting will break LRU of dentry-unused.  So, I've attempted to
  make reclaim unused dentrins with fairness by calculate number of
  dentries to scan on this sb based on following way

  number of dentries to scan on this sb =
  count * (number of dentries on this sb / number of dentries in the machine)

- ToDo
 - I have to measuring performance number and do stress tests.

 - When unmount occurs during prune_dcache(), scanning on same
  superblock, It is unable to reach next superblock because it is gone
  away.  We restart scannig superblock from first one, it causes
  unfairness of reclaim unused dentries on first superblock.  But I think
  this happens very rarely.

- Test Results

  Result on 6GB boxes with excessive unused dentries.

Without patch:

$ cat /proc/sys/fs/dentry-state
10181835        10180203        45      0       0       0
# mount -t nfs 10.124.60.70:/work/kernel-src nfs
real    0m1.830s
user    0m0.001s
sys     0m1.653s

 With this patch:
$ cat /proc/sys/fs/dentry-state
10236610        10234751        45      0       0       0
# mount -t nfs 10.124.60.70:/work/kernel-src nfs
real    0m0.106s
user    0m0.002s
sys     0m0.032s

[akpm@linux-foundation.org: fix comments]
Signed-off-by: Kentaro Makita <k-makita@np.css.fujitsu.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: David Chinner <dgc@sgi.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:15 -07:00
..
acpi Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2008-07-16 17:25:46 -07:00
asm-alpha
asm-arm Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm 2008-07-23 18:24:08 -07:00
asm-avr32 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx 2008-07-23 12:03:18 -07:00
asm-blackfin
asm-cris
asm-frv termios: Termios defines for other platforms 2008-07-20 17:12:38 -07:00
asm-generic Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2008-07-16 17:25:46 -07:00
asm-h8300
asm-ia64 mm: remove double indirection on tlb parameter to free_pgd_range() & Co 2008-07-24 10:47:15 -07:00
asm-m32r
asm-m68k m68k: remove stale ARCH_SUN4 #define 2008-07-20 17:24:39 -07:00
asm-m68knommu
asm-mips [MIPS] Remove unused maltasmp.h. 2008-07-20 14:38:22 +01:00
asm-mn10300
asm-parisc
asm-powerpc mm: remove double indirection on tlb parameter to free_pgd_range() & Co 2008-07-24 10:47:15 -07:00
asm-s390 KVM: s390: rename private structures 2008-07-20 12:42:37 +03:00
asm-sh mm: remove double indirection on tlb parameter to free_pgd_range() & Co 2008-07-24 10:47:15 -07:00
asm-sparc mm: remove double indirection on tlb parameter to free_pgd_range() & Co 2008-07-24 10:47:15 -07:00
asm-sparc64 sparc: join the remaining header files 2008-07-17 21:55:51 -07:00
asm-um
asm-v850
asm-x86 mm: remove double indirection on tlb parameter to free_pgd_range() & Co 2008-07-24 10:47:15 -07:00
asm-xtensa
crypto crypto: hash - Move ahash functions into crypto/hash.h 2008-07-10 20:35:18 +08:00
drm drm/radeon: fixup issue with radeon and PAT support. 2008-07-15 15:48:05 +10:00
keys
linux fix soft lock up at NFS mount via per-SB LRU-list of unused dentries 2008-07-24 10:47:15 -07:00
math-emu
media V4L/DVB (8395): saa7134: Fix Kbuild dependency of ir-kbd-i2c 2008-07-20 07:29:03 -03:00
mtd
net ipv6: icmp6_dst_gc return change 2008-07-22 14:35:50 -07:00
pcmcia
rdma RDMA/addr: Keep pointer to netdevice in struct rdma_dev_addr 2008-07-14 23:48:53 -07:00
rxrpc
scsi driver core: remove KOBJ_NAME_LEN define 2008-07-21 21:54:52 -07:00
sound ALSA: Release v1.0.17 2008-07-14 09:54:43 +02:00
video
xen xen: implement Xen-specific spinlocks 2008-07-16 11:15:53 +02:00
Kbuild drm: reorganise drm tree to be more future proof. 2008-07-14 10:45:01 +10:00