EXT4_GET_BLOCKS_NO_LOCK flag is added to indicate that we don't need
to acquire i_data_sem lock in ext4_map_blocks. Meanwhile, it changes
ext4_get_block() to not start a new journal because when we do a
overwrite dio, there is no any metadata that needs to be modified.
We define a new function called ext4_get_block_write_nolock, which is
used in dio overwrite nolock. In this function, it doesn't try to
acquire i_data_sem lock and doesn't start a new journal as it does a
lookup.
CC: Tao Ma <tm@tao.ma>
CC: Eric Sandeen <sandeen@redhat.com>
CC: Robin Dong <hao.bigrat@gmail.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext4_file_dio_write is defined in order to split buffered IO and
direct IO in ext4. This patch just refactor some stuff in write path.
CC: Tao Ma <tm@tao.ma>
CC: Eric Sandeen <sandeen@redhat.com>
CC: Robin Dong <hao.bigrat@gmail.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In this patch, the statement "poff = block % blocks_per_page"
in ext4_mb_get_buddy_page_lock has no effect.
It will be optimized out by the compiler, but it's better to remove it.
Signed-off-by: Haibo Liu <HaiboLiu6@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In this patch, ext4_ext_try_to_merge has been change to merge
an extent both left and right. So we need to update the comment
in here.
Signed-off-by: HaiboLiu <HaiboLiu6@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In xattr block operation, we use h_refcount to indicate whether the
xattr block is shared among many inodes. And xattr block csum uses
s_csum_seed if it is shared and i_csum_seed if it belongs to
one inode. But this has a problem. So consider the block is shared
first bewteen inode A and B, and B has some xattr update and CoW
the xattr block. When it updates the *old* xattr block(because
of the h_refcount change) and calls ext4_xattr_release_block, we
has no idea that inode A is the real owner of the *old* xattr
block and we can't use the i_csum_seed of inode A either in xattr
block csum calculation. And I don't think we have an easy way to
find inode A.
So this patch just removes the tricky i_csum_seed and we now uses
s_csum_seed every time for the xattr block csum. The corresponding
patch for the e2fsprogs will be sent in another patch.
This is spotted by xfstests 117.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Darrick J. Wong <djwong@us.ibm.com>
In ext4_rename, when the old name is a dir, we need to
change ".." to its new parent and journal the change, so
with metadata_csum enabled, we have to re-calc the csum.
As the first block of the dir can be either a htree root
or a normal directory block and we have different csum
calculation for these 2 types, we have to choose the right
one in ext4_rename.
btw, it is found by xfstests 013.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Darrick J. Wong <djwong@us.ibm.com>
Commit f975d6bcc7 introduced bug which caused ext4_statfs() to
miscalculate the number of file system overhead blocks. This causes
the f_blocks field in the statfs structure to be larger than it should
be. This would in turn cause the "df" output to show the number of
data blocks in the file system and the number of data blocks used to
be larger than they should be.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
Make it possible for ext4_count_free to operate on buffers and not
just data in buffer_heads.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
Ext4 must make sure the transaction to be commited to the disk when
user opens a file with O_(D)SYNC flag and do a fallocate(2) call.
This problem had been reported by Christoph Hellwig in this thread:
http://www.spinics.net/lists/linux-btrfs/msg13621.html
Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Currently ext4_mb_load_buddy is called for every group, irrespective
of whether the group info is already in memory, while reading
/proc/fs/ext4/<partition>/mb_groups proc file. For the purpose of
mb_groups proc file, it is unnecessary to load the file group info
from disk if it was loaded in past. These calls to ext4_mb_load_buddy
make reading the mb_groups proc file expensive.
Also, the locks around ext4_get_group_info are not required.
This patch modifies the code to call ext4_mb_load_buddy only if the
group info had never been loaded into memory in past. It also removes
the mb group locking around ext4_get_group_info call.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Pull UDF fixes from Jan Kara:
"Make UDF more robust in presence of corrupted filesystem"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Fortify loading of sparing table
udf: Avoid run away loop when partition table length is corrupted
udf: Use 'ret' instead of abusing 'i' in udf_load_logicalvol()
'IS_ENABLED()' macro usage: should be 'IS_ENABLED(CONFIG_DEBUG_FS)',
but we had 'IS_ENABLED(DEBUG_FS)'. Also fix incorrect assertion.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJP7Hx+AAoJECmIfjd9wqK04FEP/3NC0qMlleuEcHv9AFwOJ1PB
rn1PDjz6kuEjZ1/xhGFoboBOUj577qyzIP6IvG7MqSH66Yc8gBF+sPbKWWglx7wB
i2y7/Fi4lHM3w59GfvnOhdI7McklhPyl3R183MZ3EBJk4V0LPi2rXsl7G5puLNgG
XkJuOjXLXZPgyeMR+DlBsoaaxBMihnh/pdpUAyLER1cQdzQwCzba82tNrMgnCp7i
TIFTPtn+LmEQyHcqXx5ub/FV6BiEUXIJbkKlp5Ajqyh/olSNtGdCHRrP5MXpU2kI
DmtyMvp+3PBHxrUYQjXT6uerL9uXhIUyRv49qO0tS68fUg44JCFflmPkoV9qEvvl
ADbOqklx1DWdVCiZdhXWe1GFhf6U+TOoUyeiIzGIy0fIIlycNl915F4LzxVUqQKm
yoqouEvzqd1LIAsopakF2DDIKoK6ViWmHBkuN04B+u+iab4DC4aX3vgxq+Ie8mqA
0QNIamovk/2MR2665XhbARu0yDSEmGZvD6dkuSAgXIxjw7tdvvlY7pkSouWTOSR0
fbqPrhgbRON+mT4Fcrb5dMq+PAiOTw5kp7az90+U6i1oLm5TY8CaHt3UvmdspOM/
UeHRhLR8o/RhfnvnexiOxIWtQGCX+CCuePe9oN/fQabj9dfvKI1iR+zoyheFLwiT
HxaU6I7oVYQVkeifNJVV
=iZTv
-----END PGP SIGNATURE-----
Merge tag 'upstream-3.5-rc5' of git://git.infradead.org/linux-ubifs
Pull ubi/ubifs fixes from Artem Bityutskiy:
"Fix the debugfs regression - we never enable it because incorrect
'IS_ENABLED()' macro usage: should be 'IS_ENABLED(CONFIG_DEBUG_FS)',
but we had 'IS_ENABLED(DEBUG_FS)'. Also fix incorrect assertion."
* tag 'upstream-3.5-rc5' of git://git.infradead.org/linux-ubifs:
UBI: correct usage of IS_ENABLED()
UBIFS: correct usage of IS_ENABLED()
UBIFS: fix assertion
Check provided length of partition table so that (possibly maliciously)
corrupted partition table cannot cause accessing data beyond current buffer.
Signed-off-by: Jan Kara <jack@suse.cz>
Commit "818039c UBIFS: fix debugfs-less systems support" fixed one
regression but introduced a different regression - the debugfs is now always
compiled out. Root cause: IS_ENABLED() arguments should be used with the
CONFIG_* prefix.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Pull Ceph fixes from Sage Weil:
"There are a couple of fixes from Yan for bad pointer dereferences in
the messenger code and when fiddling with page->private after page
migration, a fix from Alex for a use-after-free in the osd client
code, and a couple fixes for the message refcounting and shutdown
ordering."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: flush msgr queue during mon_client shutdown
rbd: Clear ceph_msg->bio_iter for retransmitted message
libceph: use con get/put ops from osd_client
libceph: osd_client: don't drop reply reference too early
ceph: check PG_Private flag before accessing page->private
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABAgAGBQJP435OAAoJENaLyazVq6ZOms0P/38KYwNpgGgoeO57ZNXtGXen
C98aa0IwFkjNUFPIogJD4e6gcxfxI9d+626xFZpnkoIEXXpEco5xexjBSIRfg3d7
rNB4HgyQvybJrmRimqyIonTq5DVhQNnlxfYLJKtpM8dhSodNF3YGWmXMcXRSvoZO
D1gMXJxTCoZJ5HjVMFRUOfgKX8RYrM7zNmrzMnefUOkuyFN2Dll7ZxGil05GKnQn
IYds1DvRnyge8o4b7tRUrI50YM3j/w1HgtEDuquQ6UWvOCdbivpPH/+JaP5yD6kQ
H39UBmi+2orC5m8AbDGGaeNuWC844emsLHDkZ63YTrDlEPwo7XZZQ7q8PgKsfqWz
wzOUj9K0VAmcRmPjLsgwktsZYr+tyNYW+fYz6NzlSTtBK8fTyTPiyTNEJ4fEA42O
poRbwH1yoM0YLyII+I+dOb2gk7mOsJa7QMYUE+Art6QWapfUOvQEaIewIEoMyJL9
5Tl4jAZzso0aHmz/72s7CgV7j3gUrOCKGklPYIe5pGt5504CUxypjF2E01cqV5hJ
QNz3LuC8Koraj787DZqY/w7Kk+SNsyzBOidlgy7hgjJEmNrsXAtweyGXH1gKOF3G
ZstBsrgCacgPi+1ixJNJBCDbnM0p6XUZhqFT76aV78CaMgKfslzbgS9qFeCEBS2h
kMvFaTHC9obmAGikOD4f
=QLde
-----END PGP SIGNATURE-----
Merge tag 'for-linus-Jun-21-2012' of git://oss.sgi.com/xfs/xfs
Pull XFS fixes from Ben Myers:
- Fix stale data exposure with unwritten extents
- Fix a warning in xfs_alloc_vextent with ODEBUG
- Fix overallocation and alignment of pages for xfs_bufs
- Fix a cursor leak
- Fix a log hang
- Fix a crash related to xfs_sync_worker
- Rename xfs log structure from struct log to struct xlog so we can use
crash dumps effectively
* tag 'for-linus-Jun-21-2012' of git://oss.sgi.com/xfs/xfs:
xfs: rename log structure to xlog
xfs: shutdown xfs_sync_worker before the log
xfs: Fix overallocation in xfs_buf_allocate_memory()
xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near
xfs: check for stale inode before acquiring iflock on push
xfs: fix debug_object WARN at xfs_alloc_vextent()
xfs: xfs_vm_writepage clear iomap_valid when !buffer_uptodate (REV2)
Fixes include:
- Fix a write hang due to an uninitalised variable when !defined(CONFIG_NFS_V4)
- Address upcall races in the legacy NFSv4 idmapper
- Remove an O_DIRECT refcounting issue
- Fix a pNFS refcounting bug when the file layout metadata server is also
acting as a data server
- Fix a pNFS module loading race.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJP46PEAAoJEGcL54qWCgDyClwP/RlcSUAgTeFo3VFcedMdZqKN
cdYzzyT2r0rzxtEOkdE1aFqukspMTx6cU83opHYJKYP66stkx98JW0+LVcsg8vtm
SKjYZRAM/xsZVo+m8E3iQ9Z7K0kn1W/+OSwzJO7arIqo++fV8aiGn4+Gpgx9SrWS
FU5iC7p1LThOZks3Nis0VLbLDpS058vRJgyfCzTS1NyjABHOOEYb6JhhkYeXLH7G
vn0QRGXyq2sGxUYhR3BNWdRMn9XV5p5mOUoVjLxPBV84gm7wIjYKOGJHQRSn4Ew/
CEYAvBksGpT7ifJflkzgg8acVSvuq7HacMpHj9O9SpT5aesvVuhcm3pxUN1YJo6m
WNRj3kqgc6eCuIbiA+ENZuHIsLDzOFw3H4RhKCPj7C9HG88nFGIhrGJ9OIIut1AF
X81L5aTox3UASZXuieZ0dAqVyTH7n288SSTzYaYy5O++4cW4hqZt7wQegr8Sk6b9
8zrWXkLjTNGFZo3mAhlgZf5qV3UYt/yNCk9U/1JxvH+1tPTvfYpqavdXusMJ03rn
7z4LQxwD93YhkiD8NNGDHoBoZesmE3E0ucug+Cb1wLeT0b0C9ChOYdAphQkXxkNl
lJxN4TfoBCgwwQx88Z/UilNvIGffJwVZzRgX6y//WACPssCdM5S0Zlb8nGIGb98G
J2uFwuqP0WNhMPSbg+Wj
=uEPz
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Fix a write hang due to an uninitalised variable when
!defined(CONFIG_NFS_V4)
- Address upcall races in the legacy NFSv4 idmapper
- Remove an O_DIRECT refcounting issue
- Fix a pNFS refcounting bug when the file layout metadata server is
also acting as a data server
- Fix a pNFS module loading race.
* tag 'nfs-for-3.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Force the legacy idmapper to be single threaded
NFS: Initialise commit_info.rpc_out when !defined(CONFIG_NFS_V4)
NFS: Fix a refcounting issue in O_DIRECT
NFSv4.1: Fix a race in set_pnfs_layoutdriver
NFSv4.1: Fix umount when filelayout DS is also the MDS
Pull btrfs fixes from Chris Mason:
"This is a small pull with btrfs fixes. The biggest of the bunch is
another fix for the new backref walking code.
We're still hammering out one btrfs dio vs buffered reads problem, but
that one will have to wait for the next rc."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: delay iput with async extents
Btrfs: add a missing spin_lock
Btrfs: don't assume to be on the correct extent in add_all_parents
Btrfs: introduce btrfs_next_old_item
Rename the XFS log structure to xlog to help crash distinquish it from the
other logs in Linux.
Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Revert commit 1307bbd, which uses the s_umount semaphore to provide
exclusion between xfs_sync_worker and unmount, in favor of shutting down
the sync worker before freeing the log in xfs_log_unmount. This is a
cleaner way of resolving the race between xfs_sync_worker and unmount
than using s_umount.
Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Commit de1cbee which removed b_file_offset in favor of b_bn introduced a bug
causing xfs_buf_allocate_memory() to overestimate the number of necessary
pages. The problem is that xfs_buf_alloc() sets b_bn to -1 and thus effectively
every buffer is straddling a page boundary which causes
xfs_buf_allocate_memory() to allocate two pages and use vmalloc() for access
which is unnecessary.
Dave says xfs_buf_alloc() doesn't need to set b_bn to -1 anymore since the
buffer is inserted into the cache only after being fully initialized now.
So just make xfs_buf_alloc() fill in proper block number from the beginning.
CC: David Chinner <dchinner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
When we fail to find an matching extent near the requested extent
specification during a left-right distance search in
xfs_alloc_ag_vextent_near, we fail to free the original cursor that
we used to look up the XFS_BTNUM_CNT tree and hence leak it.
Reported-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
An inode in the AIL can be flush locked and marked stale if
a cluster free transaction occurs at the right time. The
inode item is then marked as flushing, which causes xfsaild
to spin and leaves the filesystem stalled. This is
reproduced by running xfstests 273 in a loop for an
extended period of time.
Check for stale inodes before the flush lock. This marks
the inode as pinned, leads to a log flush and allows the
filesystem to proceed.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
There is some concern that these iput()'s could be the final iputs and could
induce lockups on people waiting on writeback. This would happen in the
rare case that we don't create ordered extents because of an error, but it
is theoretically possible and we already have a mechanism to deal with this
so just make them delayed iputs to negate any worry.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
When fixing up the locking in the delayed ref destruction work I accidently
broke the locking myself ;(. Add back a spin_lock that should be there and
we are now all set. Thanks,
Btrfs: add a missing spin_lock
When fixing up the locking in the delayed ref destruction work I accidently
broke the locking myself ;(. Add back a spin_lock that should be there and
we are now all set. Thanks,
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
add_all_parents did assume that path is already at a correct extent data
item, which may not be true in case of data extents that were partly
rewritten and splitted.
We need to check if we're on a matching extent for every item and only
for the ones after the first. The loop is changed to do this now.
This patch also fixes a bug introduced with commit 3b127fd8 "Btrfs:
remove obsolete btrfs_next_leaf call from __resolve_indirect_ref".
The removal of next_leaf did sometimes result in slot==nritems when
the above described case happens, and thus resulting in invalid values
(e.g. wanted_obejctid) in add_all_parents (leading to missed backrefs
or even crashes).
Signed-off-by: Alexander Block <ablock84@googlemail.com>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
We introduce btrfs_next_old_item that uses btrfs_next_old_leaf instead
of btrfs_next_leaf.
btrfs_next_item is also changed to simply call btrfs_next_old_item with
time_seq being 0.
Signed-off-by: Alexander Block <ablock84@googlemail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Here are a number of small fixes for the drivers/staging tree, as well as iio
and pstore drivers (which came from the staging tree in the 3.5-rc1 merge).
All of these are tiny, but resolve issues that people have been reporting.
There's also a documentation update to reflect what the iio drivers really are
doing, which is good to get straightened out.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iEYEABECAAYFAk/iNeAACgkQMUfUDdst+ynNVwCdHCj6smC2JUbvN34gACNrpsYY
WggAoJzQn9mQhwq0pa/ZTpaUOvCFZ39L
=hDkC
-----END PGP SIGNATURE-----
Merge tag 'staging-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging tree fixes from Greg Kroah-Hartman:
"Here are a number of small fixes for the drivers/staging tree, as well
as iio and pstore drivers (which came from the staging tree in the
3.5-rc1 merge). All of these are tiny, but resolve issues that people
have been reporting.
There's also a documentation update to reflect what the iio drivers
really are doing, which is good to get straightened out.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'staging-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: r8712u: Add new USB IDs
staging: gdm72xx: Release netlink socket properly
iio: drop wrong reference from Kconfig
pstore/inode: Make pstore_fill_super() static
pstore/ram: Should zap persistent zone on unlink
pstore/ram_core: Factor persistent_ram_zap() out of post_init()
pstore/ram_core: Do not reset restored zone's position and size
pstore/ram: Should update old dmesg buffer before reading
staging:iio:ad7298: Fix linker error due to missing IIO kfifo buffer
Revert "staging: usbip: bugfix for stack corruption on 64-bit architectures"
staging: usbip: bugfix for stack corruption on 64-bit architectures
staging/comedi: fix build for USB not enabled
staging: omapdrm: fix crash when freeing bad fb
staging:iio:ad7606: Re-add missing scale attribute
iio: Fix potential use after free
staging:iio: remove num_interrupt_lines from documentation
iio: documentation: Add out_altvoltage and friends
Here are some fixes for 3.5-rc4 that resolve the kmsg problems that
people have reported showing up after the printk and kmsg changes went
into 3.5-rc1. There are also a smattering of other tiny fixes for the
extcon and hyper-v drivers that people have reported.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iEYEABECAAYFAk/iNQcACgkQMUfUDdst+yklTQCfZCXFlhA43bZo/8Joqd2pLIIW
2uoAoMze0SlfJeN6Qu7yY0P+qV/f/pc3
=UNFY
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core and printk fixes from Greg Kroah-Hartman:
"Here are some fixes for 3.5-rc4 that resolve the kmsg problems that
people have reported showing up after the printk and kmsg changes went
into 3.5-rc1. There are also a smattering of other tiny fixes for the
extcon and hyper-v drivers that people have reported.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'driver-core-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
extcon: max8997: Add missing kfree for info->edev in max8997_muic_remove()
extcon: Set platform drvdata in gpio_extcon_probe() and fix irq leak
extcon: Fix wrong index in max8997_extcon_cable[]
kmsg - kmsg_dump() fix CONFIG_PRINTK=n compilation
printk: return -EINVAL if the message len is bigger than the buf size
printk: use mutex lock to stop syslog_seq from going wild
kmsg - kmsg_dump() use iterator to receive log buffer content
vme: change maintainer e-mail address
Extcon: Don't try to create duplicate link names
driver core: fixup reversed deferred probe order
printk: Fix alignment of buf causing crash on ARM EABI
Tools: hv: verify origin of netlink connector message
* emailed from Andrew Morton <akpm@linux-foundation.org>: (21 patches)
mm/memblock: fix overlapping allocation when doubling reserved array
c/r: prctl: Move PR_GET_TID_ADDRESS to a proper place
pidns: find_new_reaper() can no longer switch to init_pid_ns.child_reaper
pidns: guarantee that the pidns init will be the last pidns process reaped
fault-inject: avoid call to random32() if fault injection is disabled
Viresh has moved
get_maintainer: Fix --help warning
mm/memory.c: fix kernel-doc warnings
mm: fix kernel-doc warnings
mm: correctly synchronize rss-counters at exit/exec
mm, thp: print useful information when mmap_sem is unlocked in zap_pmd_range
h8300: use the declarations provided by <asm/sections.h>
h8300: fix use of extinct _sbss and _ebss
xtensa: use the declarations provided by <asm/sections.h>
xtensa: use "test -e" instead of bashism "test -a"
xtensa: replace xtensa-specific _f{data,text} by _s{data,text}
memcg: fix use_hierarchy css_is_ancestor oops regression
mm, oom: fix and cleanup oom score calculations
nilfs2: ensure proper cache clearing for gc-inodes
thp: avoid atomic64_read in pmd_read_atomic for 32bit PAE
...
do_exit() and exec_mmap() call sync_mm_rss() before mm_release() does
put_user(clear_child_tid) which can update task->rss_stat and thus make
mm->rss_stat inconsistent. This triggers the "BUG:" printk in check_mm().
Let's fix this bug in the safest way, and optimize/cleanup this later.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A gc-inode is a pseudo inode used to buffer the blocks to be moved by
garbage collection.
Block caches of gc-inodes must be cleared every time a garbage collection
function (nilfs_clean_segments) completes. Otherwise, stale blocks
buffered in the caches may be wrongly reused in successive calls of the GC
function.
For user files, this is not a problem because their gc-inodes are
distinguished by a checkpoint number as well as an inode number. They
never buffer different blocks if either an inode number, a checkpoint
number, or a block offset differs.
However, gc-inodes of sufile, cpfile and DAT file can store different data
for the same block offset. Thus, the nilfs_clean_segments function can
move incorrect block for these meta-data files if an old block is cached.
I found this is really causing meta-data corruption in nilfs.
This fixes the issue by ensuring cache clear of gc-inodes and resolves
reported GC problems including checkpoint file corruption, b-tree
corruption, and the following warning during GC.
nilfs_palloc_freev: entry number 307234 already freed.
...
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org> [2.6.37+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On filesytems with a block size smaller than PAGE_SIZE we currently have
a problem with unwritten extents. If a we have multi-block page for
which an unwritten extent has been allocated, and only some of the
buffers have been written to, and they are not contiguous, we can expose
stale data from disk in the blocks between the writes after extent
conversion.
Example of a page with unwritten and real data.
buffer content
0 empty b_state = 0
1 DATA b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
2 DATA b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
3 empty b_state = 0
4 empty b_state = 0
5 DATA b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
6 DATA b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
7 empty b_state = 0
Buffers 1, 2, 5, and 6 have been written to, leaving 0, 3, 4, and 7
empty. Currently buffers 1, 2, 5, and 6 are added to a single ioend,
and when IO has completed, extent conversion creates a real extent from
block 1 through block 6, leaving 0 and 7 unwritten. However buffers 3
and 4 were not written to disk, so stale data is exposed from those
blocks on a subsequent read.
Fix this by setting iomap_valid = 0 when we find a buffer that is not
Uptodate. This ensures that buffers 5 and 6 are not added to the same
ioend as buffers 1 and 2. Later these blocks will be converted into two
separate real extents, leaving the blocks in between unwritten.
Signed-off-by: Alain Renaud <arenaud@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
It was initially coded under the assumption that there would only be one
request at a time, so use a lock to enforce this requirement..
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
CC: stable@vger.kernel.org [3.4+]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
I got lots of NULL pointer dereference Oops when compiling kernel on ceph.
The bug is because the kernel page migration routine replaces some pages
in the page cache with new pages, these new pages' private can be non-zero.
Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 28c0254ede)
In nfs_direct_write_reschedule(), the requests from nfs_scan_commit_list
have a refcount of 2, whereas the operations in
nfs_direct_write_completion_ops expect them to have a refcount of 1.
This patch adds a call to release the extra references.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Fred Isaman <iisaman@netapp.com>
The call to try_module_get() dereferences ld_type outside the
spin locks, which means that it may be pointing to garbage if
a module unload was in progress.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently there is a 'chicken and egg' issue when the DS is also the mounted
MDS. The nfs_match_client() reference from nfs4_set_ds_client bumps the
cl_count, the nfs_client is not freed at umount, and nfs4_deviceid_purge_client
is not called to dereference the MDS usage of a deviceid which holds a
reference to the DS nfs_client. The result is the umount program returns,
but the nfs_client is not freed, and the cl_session hearbeat continues.
The MDS (and all other nfs mounts) lose their last nfs_client reference in
nfs_free_server when the last nfs_server (fsid) is umounted.
The file layout DS lose their last nfs_client reference in destroy_ds
when the last deviceid referencing the data server is put and destroy_ds is
called. This is triggered by a call to nfs4_deviceid_purge_client which
removes references to a pNFS deviceid used by an MDS mount.
The fix is to track how many pnfs enabled filesystems are mounted from
this server, and then to purge the device id cache once that count reaches
zero.
Reported-by: Jorge Mora <Jorge.Mora@netapp.com>
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The asserts here never check anything because it uses '|' instead of
'&'. Now if the flags are not set it prints a warning a a stack trace.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
HFS+ doesn't really implement hard links - instead, hardlinks are indicated
by a magic file type which refers to an indirect node in a hidden
directory. The spec indicates that stat() should return the inode number
of the indirect node, but it turns out that this doesn't satisfy the
firmware when it's looking for a bootloader - it wants the catalog ID of
the hardlink file instead. Fix up this case.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The variable io_size was unsigned int, which caused the wrong sector number
to be calculated after aligning it. This then caused mount to fail with big
volumes, as backup volume header information was searched from a
wrong sector.
Signed-off-by: Janne Kalliomäki <janne@tuxera.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull btrfs compile warning fixes from Chris Mason.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: cast devid to unsigned long long for printk %llu
Btrfs: init old_generation in get_old_root
Highlights include:
- Fix a couple of mount regressions due to the recent cleanups.
- Fix an Oops in the open recovery code
- Fix an rpc_pipefs upcall hang that results from some of the
net namespace work from 3.4.x (stable kernel candidate).
- Fix a couple of write and o_direct regressions that were found
at last weeks Bakeathon testing event in Ann Arbor.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJP2gmaAAoJEGcL54qWCgDyrBMP/RY/T++He8y5k3M9aEqiIv0q
D8ZVMwzID6f4Zgw4xRg96aYr02sBTw0q+0mP5x1EZmg8mK29rnBiVeKHE1iwSfXq
10/SYISlpIjhJC4I4kHXGd2KClgj7qRRCbDKFRWwoIIwYU+kJn8MRnPa9XqdL8kP
q68lrtayW8THSJDR8bk1GQn+ARxGeoY++qzHxm3vpQCbZVVb19VqKMWAWSN4VKqb
epWehOSAzB3iA7HrLRbf8Y8/sDdXewxCQpr9CC/wxuu++l5ifPphR0ToX+k9VZXI
BKFLUojCUZHTMAgCxuxjrFYehMeyClbzL2lLkz5Pgj0gQhOX6Myj+WMXoEg/uWfo
XNf51FH3yBbnfayTaOUs6Y50iuU+dQO7TUTAoWTPpW9V/iT5z/fWAKUVJhDtrPk5
DVDkR6SEgb4P1RqkehZKLq5k5GSAcTR+MZr452eDrFYXJrY8ORDE6o6kP4Rr3Nnd
n8gap0gHxzIYlhBghem6+nLN+HhpZQopWeD8mNub20VuXsChRDr9/+XWuMCSJaZF
2kleVdt2+rTDzi9bJTRYlsX397oaThL0NbRvshHAwnXIDtIQrzxx6+dUyOsEWMEu
go/EdSUUESXGNlsWTqewCBsOjPeE4L5ijI/QglfDkF+CzD5dDjrxl+5i57iMKVfc
Ydste3pQJkS7PiZu1sWA
=unbu
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.5-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
- Fix a couple of mount regressions due to the recent cleanups.
- Fix an Oops in the open recovery code
- Fix an rpc_pipefs upcall hang that results from some of the net
namespace work from 3.4.x (stable kernel candidate).
- Fix a couple of write and o_direct regressions that were found at
last weeks Bakeathon testing event in Ann Arbor."
* tag 'nfs-for-3.5-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: add an endian notation for sparse
NFSv4.1: integer overflow in decode_cb_sequence_args()
rpc_pipefs: allow rpc_purge_list to take a NULL waitq pointer
NFSv4 do not send an empty SETATTR compound
NFSv2: EOF incorrectly set on short read
NFS: Use the NFS_DEFAULT_VERSION for v2 and v3 mounts
NFS: fix directio refcount bug on commit
NFSv4: Fix unnecessary delegation returns in nfs4_do_open
NFSv4.1: Convert another trivial printk into a dprintk
NFS4: Fix open bug when pnfs module blacklisted
NFS: Remove incorrect BUG_ON in nfs_found_client
NFS: Map minor mismatch error to protocol not support error.
NFS: Fix a commit bug
NFS4: Set parsed mount data version to 4
NFSv4.1: Ensure we clear session state flags after a session creation
NFSv4.1: Convert a trivial printk into a dprintk
NFSv4: Fix up decode_attr_mdsthreshold
NFSv4: Fix an Oops in the open recovery code
NFSv4.1: Fix a request leak on the back channel
Pull two nfsd bugfixes from J. Bruce Fields.
* 'for-3.5' of git://linux-nfs.org/~bfields/linux:
nfsd4: BUG_ON(!is_spin_locked()) no good on UP kernels
NFS: hard-code init_net for NFS callback transports
gcc was giving an uninit variable warning here. Strictly
speaking we don't need to init it, but this will make things
much less error prone.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>