linux/include
Fengguang Wu 8bc3be2751 writeback: speed up writeback of big dirty files
After dirtying a 100M file, the normal behavior is to start writeback of
all the data after a 30s delay.  But sometimes the following happens
instead:

	- after 30s:    ~4M
	- after 5s:     ~4M
	- after 5s:     all remaining 92M

Some analysis shows that the internal io dispatch queues go like this:

		s_io            s_more_io
		-------------------------
	1)	100M,1K         0
	2)	1K              96M
	3)	0               96M

1) initial state with a 100M file and a 1K file

2) 4M written, nr_to_write <= 0, so write more

3) 1K written, nr_to_write > 0, no more writes (BUG)
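
The three states above can be replayed with a small user-space model.  This
is only a sketch, not kernel code: the toy_* names are hypothetical, sizes
are in KiB, and the 4M per-scan budget is assumed from the ~4M steps above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model, NOT kernel code.  All sizes are in KiB. */
#define TOY_BUDGET_KB 4096L	/* assumed per-scan nr_to_write (~4M) */
#define TOY_MAX_FILES 8

struct toy_queue {
	long kb[TOY_MAX_FILES];	/* dirty KiB left in each queued file */
	size_t n;
};

static void toy_push(struct toy_queue *q, long kb)
{
	assert(q->n < TOY_MAX_FILES);
	q->kb[q->n++] = kb;
}

/* One scan of s_io; returns the unspent part of nr_to_write. */
static long toy_scan_s_io(struct toy_queue *s_io, struct toy_queue *s_more_io)
{
	long nr_to_write = TOY_BUDGET_KB;
	size_t i = 0, j;

	while (i < s_io->n && nr_to_write > 0) {
		long left = s_io->kb[i];
		long chunk = left < nr_to_write ? left : nr_to_write;

		nr_to_write -= chunk;
		if (left > chunk)	/* budget used up: park the rest */
			toy_push(s_more_io, left - chunk);
		i++;
	}
	/* Drop the visited files from the front of s_io. */
	for (j = 0; i + j < s_io->n; j++)
		s_io->kb[j] = s_io->kb[i + j];
	s_io->n = j;
	return nr_to_write;
}

/* Buggy upper layer: stops as soon as a scan leaves budget unspent. */
static int toy_kupdate_buggy(struct toy_queue *s_io,
			     struct toy_queue *s_more_io)
{
	int scans = 0;

	do {
		scans++;
	} while (toy_scan_s_io(s_io, s_more_io) <= 0);
	return scans;
}
```

Queueing a 100M file and a 1K file on s_io and calling toy_kupdate_buggy()
stops after two scans, with s_io empty and 96M still parked on s_more_io.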

nr_to_write > 0 in (3) fools the upper layer into thinking that all data
has been written out.  The big dirty file is actually still sitting in
s_more_io.  We cannot simply splice s_more_io back to s_io as soon as s_io
becomes empty and let the loop in generic_sync_sb_inodes() continue: this
may starve newly expired inodes in s_dirty.  It is also not an option to
draw inodes from both s_more_io and s_dirty and let the loop go on: this
might lead to livelocks, and might also starve other superblocks in sync
time (well, kupdate may still starve some superblocks, but that's another
bug).

We have to return when a full scan of s_io completes.  So nr_to_write > 0
does not necessarily mean that "all data are written".  This patch
introduces a flag writeback_control.more_io to indicate that more io should
be done.  With it the big dirty file no longer has to wait for the next
kupdate invocation 5s later.
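
A rough user-space sketch of the idea follows.  It is not the patch itself:
struct toy_wbc stands in for writeback_control, all toy_* names are
hypothetical, sizes are in KiB, and the model assumes that a scan starting
with an empty s_io refills it from s_more_io.  Delays between scans are
left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model, NOT kernel code.  All sizes are in KiB. */
#define TOY_BUDGET_KB 4096L	/* assumed per-scan nr_to_write (~4M) */
#define TOY_MAX_FILES 8

struct toy_queue {
	long kb[TOY_MAX_FILES];	/* dirty KiB left in each queued file */
	size_t n;
};

struct toy_wbc {		/* stand-in for writeback_control */
	long nr_to_write;
	int more_io;		/* set when s_more_io still holds work */
};

static void toy_push(struct toy_queue *q, long kb)
{
	assert(q->n < TOY_MAX_FILES);
	q->kb[q->n++] = kb;
}

/* One full scan of s_io; returns to the caller when the scan is done. */
static void toy_scan_s_io(struct toy_queue *s_io,
			  struct toy_queue *s_more_io, struct toy_wbc *wbc)
{
	size_t i = 0, j;

	if (s_io->n == 0) {	/* assumption: refill only when empty */
		*s_io = *s_more_io;
		s_more_io->n = 0;
	}
	while (i < s_io->n && wbc->nr_to_write > 0) {
		long left = s_io->kb[i];
		long chunk = left < wbc->nr_to_write ? left : wbc->nr_to_write;

		wbc->nr_to_write -= chunk;
		if (left > chunk)	/* budget used up: park the rest */
			toy_push(s_more_io, left - chunk);
		i++;
	}
	/* Drop the visited files from the front of s_io. */
	for (j = 0; i + j < s_io->n; j++)
		s_io->kb[j] = s_io->kb[i + j];
	s_io->n = j;
	if (s_more_io->n > 0)
		wbc->more_io = 1;	/* tell the caller there is more */
}

/* Upper layer: stop only when budget is left AND more_io is clear. */
static long toy_kupdate(struct toy_queue *s_io, struct toy_queue *s_more_io)
{
	long written = 0;

	for (;;) {
		struct toy_wbc wbc = { TOY_BUDGET_KB, 0 };

		toy_scan_s_io(s_io, s_more_io, &wbc);
		written += TOY_BUDGET_KB - wbc.nr_to_write;
		if (wbc.nr_to_write > 0 && !wbc.more_io)
			break;
	}
	return written;
}
```

With a 100M file and a 1K file queued on s_io, toy_kupdate() keeps scanning
until both queues are empty instead of stopping after the 1K file.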

In sync_sb_inodes() we only set more_io on super_blocks we actually
visited.  This avoids the interaction between two pdflush daemons.

Also in __sync_single_inode() we don't blindly keep requeuing the io if the
filesystem cannot progress.  Blind requeuing may lead to 100% iowait.
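
One way to picture that decision, as a hedged sketch only: the names below
are hypothetical, and the model assumes that "cannot progress" shows up as
budget left unspent while the inode still has dirty pages.

```c
#include <assert.h>

/* Toy user-space model, NOT kernel code. */
enum toy_target {
	TOY_S_MORE_IO,	/* retry in the next scan */
	TOY_S_DIRTY	/* back off and retry later */
};

/*
 * Pick a queue for an inode that still has dirty pages after a write
 * attempt.  Assumption: an unspent budget means the filesystem made no
 * headway, so requeuing for immediate io would only spin.
 */
static enum toy_target toy_requeue_target(long dirty_pages_left,
					  long nr_to_write)
{
	assert(dirty_pages_left > 0);
	return nr_to_write <= 0 ? TOY_S_MORE_IO : TOY_S_DIRTY;
}
```

Under this model an inode whose write slice was fully used goes to
s_more_io, while one that stalled with budget remaining goes back to
s_dirty.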

Tested-by: Mike Snitzer <snitzer@gmail.com>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Michael Rubin <mrubin@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:19 -08:00
..
acpi include/acpi/: Spelling fixes 2008-02-03 17:07:16 +02:00
asm-alpha add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-arm add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-avr32 add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-blackfin [NET]: Introducing socket mark socket option. 2008-01-31 19:27:19 -08:00
asm-cris add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-frv add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-generic add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-h8300 [NET]: Introducing socket mark socket option. 2008-01-31 19:27:19 -08:00
asm-ia64 add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-m32r add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-m68k add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-m68knommu include/asm-m68knommu/: Spelling fixes 2008-02-03 17:38:04 +02:00
asm-mips add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-parisc add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-powerpc add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-ppc add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-s390 add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-sh add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-sparc add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-sparc64 add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-um add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-v850 [NET]: Introducing socket mark socket option. 2008-01-31 19:27:19 -08:00
asm-x86 add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
asm-xtensa add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
crypto [CRYPTO] api: Include sched.h for cond_resched in scatterwalk.h 2008-01-11 08:16:59 +11:00
keys
linux writeback: speed up writeback of big dirty files 2008-02-05 09:44:19 -08:00
math-emu
media include/media/: Spelling fixes 2008-02-03 17:19:47 +02:00
mtd
net [IPV6]: Reorg struct ifmcaddr6 to save some bytes 2008-02-03 04:28:54 -08:00
pcmcia pcmcia: replace kio_addr_t with unsigned int everywhere 2008-02-05 09:44:08 -08:00
rdma RDMA/cma: add support for rdma_migrate_id() 2008-01-25 14:15:32 -08:00
rxrpc
scsi include/scsi/: Spelling fixes 2008-02-03 17:47:00 +02:00
sound [ALSA] version 1.0.16rc2 2008-01-31 17:40:18 +01:00
video
xen x86: page.h: make pte_t a union to always include 2008-01-30 13:32:57 +01:00
Kbuild