Although extent status is loaded on demand, we also need to reclaim
extents from the tree when we are under heavy memory pressure, because
in some cases a fragmented extent tree causes the status tree to consume
too much memory.
Here we maintain an LRU list in the super_block. When the extent status
of an inode is accessed and changed, the inode is moved to the tail of
the list. The inode is dropped from the list when it is cleared. In the
inode, a counter is added to count the number of cached objects in the
extent status tree. Only written/unwritten/hole extents are counted,
because delayed extents are not reclaimed: fiemap, bigalloc and
seek_data/hole need them. The counter is incremented when a new extent
is allocated and decremented when an extent is freed.
In this commit we use the normal shrinker framework to reclaim memory
from the status tree. ext4_es_reclaim_extents_count() traverses the LRU
list to count the number of reclaimable extents. ext4_es_shrink() tries
to reclaim written/unwritten/hole extents from the extent status tree.
An inode that has been shrunk is moved to the tail of the LRU list.
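The LRU bookkeeping described above can be sketched in userspace C as
follows. This is only an illustration under simplifying assumptions: the
names es_inode, es_lru, lru_touch(), lru_count() and lru_shrink() are
hypothetical and are not the ext4 interfaces, there is no locking, and
"reclaiming" just decrements the per-inode counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-inode state: list linkage plus the reclaimable counter. */
struct es_inode {
	struct es_inode *prev, *next;
	unsigned int nr_reclaimable;	/* written/unwritten/hole extents only */
};

/* Hypothetical per-superblock LRU: head.next is oldest, head.prev is newest. */
struct es_lru {
	struct es_inode head;
};

static void lru_init(struct es_lru *lru)
{
	lru->head.prev = lru->head.next = &lru->head;
	lru->head.nr_reclaimable = 0;
}

static void lru_node_init(struct es_inode *ino, unsigned int nr)
{
	ino->prev = ino->next = ino;	/* self-linked means "not on a list" */
	ino->nr_reclaimable = nr;
}

/* Drop an inode from the list (safe on an unlinked, self-linked node). */
static void lru_del(struct es_inode *ino)
{
	ino->prev->next = ino->next;
	ino->next->prev = ino->prev;
	ino->prev = ino->next = ino;
}

/* Move (or add) an inode to the tail, i.e. mark it most recently used. */
static void lru_touch(struct es_lru *lru, struct es_inode *ino)
{
	assert(ino != &lru->head);
	lru_del(ino);
	ino->prev = lru->head.prev;
	ino->next = &lru->head;
	lru->head.prev->next = ino;
	lru->head.prev = ino;
}

/* Count step: walk the list and sum the reclaimable extents. */
static unsigned int lru_count(const struct es_lru *lru)
{
	unsigned int total = 0;
	const struct es_inode *i;

	for (i = lru->head.next; i != &lru->head; i = i->next)
		total += i->nr_reclaimable;
	return total;
}

/* Shrink step: reclaim from the oldest inodes, visiting each at most once. */
static unsigned int lru_shrink(struct es_lru *lru, unsigned int nr_to_scan)
{
	struct es_inode *stop = lru->head.prev;	/* newest inode at entry */
	unsigned int freed = 0;

	while (nr_to_scan > 0 && lru->head.next != &lru->head) {
		struct es_inode *ino = lru->head.next;
		unsigned int n = ino->nr_reclaimable < nr_to_scan ?
				 ino->nr_reclaimable : nr_to_scan;

		ino->nr_reclaimable -= n;
		nr_to_scan -= n;
		freed += n;
		lru_touch(lru, ino);	/* shrunk inode goes to the tail */
		if (ino == stop)
			break;
	}
	return freed;
}
```

Remembering the tail at entry keeps the shrink loop from rescanning
inodes that it has just moved to the tail itself.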
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
This commit changes some interfaces in the extent status tree because
we need to use the inode to count the cached objects in an extent status
tree.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
The single extent cache can be removed because we now have the extent
status tree as an extent cache, which is the better choice.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
After tracking all extent status, we already have an extent cache in
memory. Every time we want to look up a block mapping, we can first
try to look it up in the extent status tree to avoid a potential disk
I/O.
A new function called ext4_es_lookup_extent is defined to do this work.
When we try to look up a block mapping, we always call ext4_map_blocks
and/or ext4_da_map_blocks. So in these functions we first try to look
up the block mapping in the extent status tree.
A new flag EXT4_GET_BLOCKS_NO_PUT_HOLE is used in ext4_da_map_blocks so
that a hole is not put into the extent status tree, because this hole
would be converted to a delayed extent in the tree immediately.
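The lookup-first pattern described above, as a minimal userspace sketch.
Everything here is a hypothetical stand-in: cache_lookup(), slow_lookup()
and map_block() are not ext4 functions, the cache is a flat array rather
than a tree, and the "disk" simply invents 16-block extents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct extent {
	uint32_t lblk;	/* first logical block */
	uint32_t len;	/* length in blocks */
	uint64_t pblk;	/* first physical block */
};

#define CACHE_SLOTS 8
static struct extent cache[CACHE_SLOTS];
static int cache_used;
static int slow_lookups;	/* counts simulated disk reads */

/* Fast path: search the in-memory cache for an extent covering lblk. */
static bool cache_lookup(uint32_t lblk, struct extent *out)
{
	for (int i = 0; i < cache_used; i++) {
		if (lblk >= cache[i].lblk &&
		    lblk - cache[i].lblk < cache[i].len) {
			*out = cache[i];
			return true;
		}
	}
	return false;
}

/* Slow path stand-in: pretend to read the on-disk extent tree. */
static struct extent slow_lookup(uint32_t lblk)
{
	uint32_t start = lblk & ~15u;	/* made-up 16-block extents */
	struct extent e = { start, 16, 1000 + (uint64_t)start };

	slow_lookups++;
	return e;
}

/* Try the cache first; on a miss, do the slow lookup and cache the result. */
static uint64_t map_block(uint32_t lblk)
{
	struct extent e;

	if (!cache_lookup(lblk, &e)) {
		e = slow_lookup(lblk);
		if (cache_used < CACHE_SLOTS)
			cache[cache_used++] = e;
	}
	assert(e.len > 0);
	return e.pblk + (lblk - e.lblk);
}
```

Only the first access to an extent pays for the slow path; later lookups
that fall inside the same extent are served from memory.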
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
By recording the physical block and status, the extent status tree is
able to track the status of every extent. When we call the _map_blocks
functions to look up an extent or create a new written/unwritten/delayed
extent, this extent is inserted into the extent status tree.
We don't load all extents from disk in alloc_inode() because it costs
too much memory, and if a file is opened and closed frequently it would
take too much time to load all extent information. So currently an
extent is inserted into the extent status tree when we create it or
look it up. Hence, the extent status tree may not contain all of the
extents found in the file.
One case we need to take care of is that an extent might have unwritten
and delayed status simultaneously, because a delayed-allocated extent
can also be allocated by fallocate. In that case we need to keep the
delayed status because we later need it to update the delayed
reservation space.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
This commit lets ext4_ext_map_blocks return the EXT4_MAP_UNWRITTEN flag
because in a later commit ext4_map_blocks needs this flag to determine
the extent status.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
This commit renames ext4_es_find_extent to ext4_es_find_delayed_extent
and improves the function. First, we split the input and output
parameters. Second, the function no longer returns the first block of
the next delayed extent after 'es'.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
This commit adds two members to the extent_status structure to let it
record the physical block and extent status. Here es_pblk is used to
record both of them because the physical block only has 48 bits, so the
extent status can be stashed into the remaining bits to save some
memory. Now written, unwritten, delayed and hole are defined as
statuses.
Because a new member is added to the extent status tree, all interfaces
need to be adjusted.
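The bit stashing described above can be illustrated like this. It is a
sketch only: the macro names, the helper names and the flag values are
assumptions made for the example, and the exact bit positions used by
ext4 are not taken from this commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: low 48 bits hold the block number, high bits the status. */
#define ES_PBLK_BITS	48
#define ES_PBLK_MASK	((UINT64_C(1) << ES_PBLK_BITS) - 1)

/* Illustrative status flags (hypothetical values). */
enum {
	ES_WRITTEN   = 1 << 0,
	ES_UNWRITTEN = 1 << 1,
	ES_DELAYED   = 1 << 2,
	ES_HOLE      = 1 << 3,
};

/* Combine a 48-bit physical block number and status flags into one word. */
static inline uint64_t es_pack(uint64_t pblk, unsigned int status)
{
	assert((pblk & ~ES_PBLK_MASK) == 0);	/* must fit in 48 bits */
	return pblk | ((uint64_t)status << ES_PBLK_BITS);
}

/* Extract the physical block number. */
static inline uint64_t es_pblock(uint64_t es_pblk)
{
	return es_pblk & ES_PBLK_MASK;
}

/* Extract the status flags. */
static inline unsigned int es_status_of(uint64_t es_pblk)
{
	return (unsigned int)(es_pblk >> ES_PBLK_BITS);
}
```

Because the status is a set of flags, one word can carry more than one
status at once, for example unwritten and delayed together.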
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
This commit refines the extent status tree code.
1) Add an 'es_' prefix to the extent status tree structure members.
2) Refactor es_remove_extent() so that __es_remove_extent() can be
used by es_insert_extent() to remove the old extent entry(-ies) before
inserting a new one.
3) Rename extent_status_end() to ext4_es_end().
4) Define ext4_es_can_be_merged() to check whether two extents can be
merged or not.
5) Update and clarify comments.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
pm_idle appears in no generic Linux code,
it appears only in architecture-specific code.
Thus, pm_idle should not be declared in pm.h.
Architectures that use an idle function pointer
should declare one local to their architecture,
and/or use cpuidle.
Signed-off-by: Len Brown <len.brown@intel.com>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Cc: linux-pm@vger.kernel.org
pm_idle() has already been deleted from this code,
so the comment was a stray.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
All paths on m32r lead to cpu_relax().
So delete the dead code and simply call cpu_relax() directly.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: linux-m32r@ml.linux-m32r.org
pm_idle() on ia64 was a synonym for default_idle().
So simply invoke default_idle() directly.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: linux-ia64@vger.kernel.org
pm_idle() and idle() served no purpose on cris --
invoke default_idle() directly.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
pm_idle() on arm64 was a synonym for default_idle(),
so remove it and invoke default_idle() directly.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
pm_idle() on ARM was a synonym for default_idle(),
so simply invoke default_idle() directly.
Signed-off-by: Len Brown <len.brown@intel.com>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
(pm_idle)() is being removed from linux/pm.h
because Linux does not have such a cross-architecture concept.
sparc uses an idle function pointer in its architecture-specific
code, so we rename sparc's use of pm_idle to sparc_idle.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
SH idle code could use some simplification.
This patch enables that by guaranteeing
that "sh_idle" is local, and thus architecture specific.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: linux-sh@vger.kernel.org
(pm_idle)() is being removed from linux/pm.h
because Linux does not have such a cross-architecture concept.
x86 uses an idle function pointer in its architecture-specific
code as a backup to cpuidle. So we rename x86's use of pm_idle
to x86_idle, and make it static to x86.
Signed-off-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Update APM to register its local idle routine with cpuidle.
This allows us to stop exporting pm_idle to modules on x86.
The Kconfig sub-option, APM_CPU_IDLE, now depends on CPU_IDLE.
Compile-tested only.
Signed-off-by: Len Brown <len.brown@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Merge tag 'at91-dt-late' of git://github.com/at91linux/linux-at91 into next/dt
From Nicolas Ferre:
More DT modifications for AT91, now that I am sure that the driver
modifications have been picked up by MTD.
Changes the ECC in use to hardware ECC (named PMECC) for
SoCs that are using it and their associated Evaluation Kits:
- at91sam9x5-ek
- at91sam9n12-ek
* tag 'at91-dt-late' of git://github.com/at91linux/linux-at91:
ARM: at91: at91sam9n12: add DT parameters to enable PMECC
ARM: at91: at91sam9x5: add DT parameters to enable PMECC
Merge tag 'sunxi-dt-for-3.9' of https://github.com/mripard/linux into next/dt
From Maxime Ripard:
Allwinner sunXi DT additions for 3.9
* tag 'sunxi-dt-for-3.9' of https://github.com/mripard/linux:
sunxi: a13-olinuxino: Add user LED to the device tree
sunxi: a10-cubieboard: Add user LEDs to the device tree
ARM: sunxi: Add device tree for Miniand Hackberry
From Kukjin Kim:
Here are Samsung fixes for v3.9; none of them are critical.
* 'next/fixes-samsung' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: dts: Correct pin configuration of SD 4 for exynos4x12-pinctrl
ARM: SAMSUNG: Silence empty switch warning in fimc-core.h
ARM: SAMSUNG: Silence empty switch warning in sdhci.h
ARM: S5PV210: Fix early uart output in fifo mode
ARM: S3C24XX: Fix compile breakage for SMDK2410
ARM: S3C24XX: add missing platform_device.h include for osiris
ARM: S3C24XX: let S3C2412_PM select S3C2412_PM_SLEEP
ARM: SAMSUNG: Gracefully exit on suspend failure
ARM: SAMSUNG: using vsnprintf instead of vsprintf for the limit buffer length 256
ARM: S3C24XX: Make 'clk_msysclk' static
Simplify life for drivers using an encoder-slave, so that they can make
their drm_encoder_helper_funcs const, rather than needing to dynamically
allocate and populate them.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add helper to display fb's which can be used directly in drm_info_list:
	static struct drm_info_list foo_debugfs_list[] = {
		...
		{ "fb", drm_fb_cma_debugfs_show, 0 },
	};
to display information about CMA fb objects, as well as a
drm_gem_cma_describe() which can be used if the driver bothers to keep
a list of CMA GEM objects.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Initialize e->pipe. Some drivers set this themselves, others do not.
Setting it in drm_send_vblank_event() should help ensure more consistent
behavior with the different drivers.
Signed-off-by: Rob Clark <robdclark@gmail.com>
We need to clear the local variable to get the refcounting right
(since the reference drm_mode_setplane holds is transferred to the
plane->fb pointer). But this must be done _after_ we update the pointer.
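The ordering issue can be shown with a small userspace model. This is a
sketch under assumptions, not the drm code: struct fb, struct plane,
fb_unref() and set_plane() are simplified, hypothetical stand-ins, and
the refcount is a plain integer.

```c
#include <assert.h>
#include <stddef.h>

struct fb {
	int refcount;
};

struct plane {
	struct fb *fb;
};

/* Drop one reference; a NULL pointer owns nothing and is ignored. */
static void fb_unref(struct fb *fb)
{
	if (fb)
		fb->refcount--;
}

/* The caller passes in one reference on new_fb (e.g. taken by a lookup). */
static void set_plane(struct plane *plane, struct fb *new_fb)
{
	struct fb *old_fb = plane->fb;
	struct fb *fb = new_fb;	/* local that owns the looked-up reference */

	assert(new_fb != NULL);

	plane->fb = fb;	/* the reference is transferred to plane->fb ... */
	fb = NULL;	/* ... so the local is cleared only afterwards */

	fb_unref(fb);		/* no-op: the local no longer owns anything */
	fb_unref(old_fb);	/* release what the plane pointed to before */
}
```

Clearing the local before the assignment would store NULL in plane->fb
instead of the new framebuffer; not clearing it at all would drop the
reference that the plane now relies on.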
Breakage introduced in
commit 6c2a75325c
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Tue Dec 11 00:59:24 2012 +0100
drm: refcounting for sprite framebuffers
Reported-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
/me grabs a few brown paper bags
So it looks like I've broken compilation in
commit 6aed8ec3f7
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Sun Jan 20 17:32:21 2013 +0100
drm: review locking for drm_fb_helper_restore_fbdev_mode
Fix it up again.
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since commit 620038f6d2, gcc is throwing the following warning:
CC [M] net/sunrpc/auth_gss/auth_gss.o
In file included from include/linux/sunrpc/types.h:14:0,
from include/linux/sunrpc/sched.h:14,
from include/linux/sunrpc/clnt.h:18,
from net/sunrpc/auth_gss/auth_gss.c:45:
net/sunrpc/auth_gss/auth_gss.c: In function ‘gss_pipe_downcall’:
include/linux/sunrpc/debug.h:45:10: warning: ‘timeout’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
printk(KERN_DEFAULT args); \
^
net/sunrpc/auth_gss/auth_gss.c:194:15: note: ‘timeout’ was declared here
unsigned int timeout;
^
If simple_get_bytes returns an error, then we'll end up calling printk
with an uninitialized timeout value. Reasonably harmless, but fairly
simple to fix by removing the printout of the uninitialised parameters.
Cc: Andy Adamson <andros@netapp.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
[Trond: just remove the parameters rather than initialising timeout]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This reverts commit aac73f3454. That
commit causes two kinds of breakage; it breaks registration of AMBA
devices when one of the parent nodes already contains overlapping
resource regions, and it breaks calls to request_region() by device
drivers in certain conditions where there are overlapping memory
regions. Both of these problems can probably be fixed, but it is better
to back out the commit and get a proper fix designed before trying again.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
bootresponse in snd_usb_mbox2_boot_quirk is only 12 (decimal) u8's
long, but is passed to snd_usb_ctl_msg as if it were 0x12 (hex) long.
Fix that by giving the array the proper size, i.e. 0x12.
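For illustration, the class of bug looks like this in a userspace
sketch. The macro name MBOX2_BOOT_RESPONSE_LEN and the helper
fake_ctl_msg() are hypothetical; fake_ctl_msg() only stands in for a
transfer routine that writes the requested number of bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char u8;

/* Hypothetical constant: the transfer length, 0x12 == 18 bytes. */
#define MBOX2_BOOT_RESPONSE_LEN 0x12

/* Stand-in for a transfer routine that fills `len` bytes of `buf`. */
static void fake_ctl_msg(u8 *buf, size_t len)
{
	assert(buf != NULL);
	memset(buf, 0xAA, len);
}

static size_t read_boot_response(void)
{
	/*
	 * Size the array with the same constant that is used for the
	 * transfer. Declaring it as `u8 bootresponse[12]` while asking
	 * for 0x12 bytes would overflow the buffer by 6 bytes.
	 */
	u8 bootresponse[MBOX2_BOOT_RESPONSE_LEN];

	fake_ctl_msg(bootresponse, sizeof(bootresponse));
	return sizeof(bootresponse);
}
```

Deriving the length from sizeof() keeps the buffer and the request in
step even if the constant changes later.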
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Rewrite server shutdown to remove the assumption that there are no
longer any threads running (no longer true, for example, when shutting
down the service in one network namespace while it's still running in
others).
Do that by doing what we'd do in normal circumstances: just CLOSE each
socket, then enqueue it.
Since there may not be threads to handle the resulting queued xprts,
also run a simplified version of the svc_recv() loop run by a server to
clean up any closed xprts afterwards.
Cc: stable@kernel.org
Tested-by: Jason Tibbitts <tibbs@math.uh.edu>
Tested-by: Paweł Sikora <pawel.sikora@agmk.net>
Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
svc_age_temp_xprts expires xprts in a two-step process: first it takes
the sv_lock and moves the xprts to expire off their server-wide list
(sv_tempsocks or sv_permsocks) to a local list. Then it drops the
sv_lock and enqueues and puts each one.
I see no reason for this: svc_xprt_enqueue() will take sp_lock, but the
sv_lock and sp_lock are not otherwise nested anywhere (and documentation
at the top of this file claims it's correct to nest these with sp_lock
inside.)
Cc: stable@kernel.org
Tested-by: Jason Tibbitts <tibbs@math.uh.edu>
Tested-by: Paweł Sikora <pawel.sikora@agmk.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We don't include kernel/Kconfig.hz because it defines HZ values
unsuitable for ARM platforms, so add SCHED_HRTICK here to properly
configure the scheduler for hrtimer operation.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Always use to_twl() for converting into private data instead of
container_of().
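A conversion helper of this kind typically wraps container_of() once so
that callers never repeat it. The sketch below is a userspace
illustration: struct chip and struct twl_chip are hypothetical stand-ins
for the driver's structures, and the container_of() macro is a simplified
version without the kernel's type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of(): step back from a member to its container. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical generic object handed around by a framework. */
struct chip {
	int id;
};

/* Hypothetical driver-private structure embedding the generic object. */
struct twl_chip {
	int private_state;
	struct chip chip;
};

/* Single place that knows how to get from the generic object to private data. */
static inline struct twl_chip *to_twl(struct chip *chip)
{
	return container_of(chip, struct twl_chip, chip);
}
```

With the helper in place, a change to the layout of the private
structure only needs to be reflected in one function.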
Signed-off-by: Johannes Thumshirn <morbidrsa@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Commit 287ad220cd tried to set up the argument to schedule_tail, but
ended up using TI_STACK, which isn't a defined symbol. Sadly, the old
openrisc compiler silently ignored this, and it was only discovered now
when building with an updated toolchain.
Reported-by: Christian Svensson <blue@cmd.nu>
Signed-off-by: Jonas Bonn <jonas@southpole.se>
The self-modifying code that updates the TLB handler at start-up has
a subtle ordering requirement: the DTLB handler must be the last thing
changed.
What I was seeing was the following:
i) The DTLB handler was updated
ii) The following printk caused a TLB miss and the look-up resulted
in the page containing itlb_vector (0xc0000a00) being bounced from
the TLB.
iii) The subsequent access to itlb_vector caused a TLB miss and reload
of the page containing itlb_vector from the page tables.
iv) But this reload of the page in iii) was being done by the "new"
DTLB-miss handler which resulted (correctly) in the page flags being
set to read-only; the subsequent write-access to itlb_vector thus
resulted in a page (access) fault.
This is easily remedied if we ensure that the boot-time DTLB-miss handler
continues running until the very last bit of self-modifying code has been
executed. This patch should ensure that the very last thing updated is the
DTLB-handler itself.
Signed-off-by: Jonas Bonn <jonas@southpole.se>
Acked-by: Julius Baxter <juliusbaxter@gmail.com>
Tested-by: Sebastian Macke <sebastian@macke.de>