Commit Graph

387926 Commits

Author SHA1 Message Date
Zheng Liu
d7b2a00c2e ext4: isolate ext4_extents.h file
After applied the commit (4a092d73), we have reduced the number of
source files that need to #include ext4_extents.h.  But we can do
better.

This commit defines ext4_zeroout_es() in extents.c and move
EXT_MAX_BLOCKS into ext4.h in order not to include ext4_extents.h in
indirect.c and ioctl.c.  Meanwhile we just need to include this file in
extent_status.c when ES_AGGRESSIVE_TEST is defined.  Otherwise, this
commit removes a duplicated declaration in trace/events/ext4.h.

After applied this patch, we just need to include ext4_extents.h file
in {super,migrate,move_extents,extents}.c, and it is easy for us to
define a new extent disk layout.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-08-28 14:47:06 -04:00
Anatol Pomozov
70261f568f ext4: Fix misspellings using 'codespell' tool
Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-08-28 14:40:12 -04:00
Dmitry Monakhov
7afe5aa59e ext4: convert write_begin methods to stable_page_writes semantics
Use wait_for_stable_page() instead of wait_on_page_writeback()

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2013-08-28 14:30:47 -04:00
Andi Shyti
27b1b22882 ext4: fix use of potentially uninitialized variables in debugging code
If ext_debugging is enabled and path[depth].p_ext is NULL, len
and lblock are printed non initialized

Signed-off-by: Andi Shyti <andi@etezian.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-08-28 14:00:00 -04:00
Jan Kara
90e775b71a ext4: fix lost truncate due to race with writeback
The following race can lead to a loss of i_disksize update from truncate
thus resulting in a wrong inode size if the inode size isn't updated
again before inode is reclaimed:

ext4_setattr()				mpage_map_and_submit_extent()
  EXT4_I(inode)->i_disksize = attr->ia_size;
  ...					  ...
					  disksize = ((loff_t)mpd->first_page) << PAGE_CACHE_SHIFT
					  /* False because i_size isn't
					   * updated yet */
					  if (disksize > i_size_read(inode))
					  /* True, because i_disksize is
					   * already truncated */
					  if (disksize > EXT4_I(inode)->i_disksize)
					    /* Overwrite i_disksize
					     * update from truncate */
					    ext4_update_i_disksize()
  i_size_write(inode, attr->ia_size);

For other places updating i_disksize such race cannot happen because
i_mutex prevents these races. Writeback is the only place where we do
not hold i_mutex and we cannot grab it there because of lock ordering.

We fix the race by doing both i_disksize and i_size update in truncate
atomically under i_data_sem and in mpage_map_and_submit_extent() we move
the check against i_size under i_data_sem as well.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-08-17 10:09:31 -04:00
Jan Kara
5208386c50 ext4: simplify truncation code in ext4_setattr()
Merge conditions in ext4_setattr() handling inode size changes, also
move ext4_begin_ordered_truncate() call somewhat earlier because it
simplifies error recovery in case of failure. Also add error handling in
case i_disksize update fails.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-08-17 10:07:17 -04:00
Jan Kara
5f1132b2ba ext4: fix ext4_writepages() in presence of truncate
Inode size can arbitrarily change while writeback is in progress. When
ext4_writepages() has prepared a long extent for mapping and truncate
then reduces i_size, mpage_map_and_submit_buffers() will always map just
one buffer in a page instead of all of them due to lblk < blocks check.
So we end up not using all blocks we've allocated (thus leaking them)
and also delalloc accounting goes wrong manifesting as a warning like:

ext4_da_release_space:1333: ext4_da_release_space: ino 12, to_free 1
with only 0 reserved data blocks

Note that the problem can happen only when blocksize < pagesize because
otherwise we have only a single buffer in the page.

Fix the problem by removing the size check from the mapping loop. We
have an extent allocated so we have to use it all before checking for
i_size. We also rename add_page_bufs_to_extent() to
mpage_process_page_bufs() and make that function submit the page for IO
if all buffers (upto EOF) in it are mapped.

Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Zheng Liu <gnehzuil.liu@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-08-17 10:02:33 -04:00
Jan Kara
09930042a2 ext4: move test whether extent to map can be extended to one place
Currently the logic whether the current buffer can be added to an extent
of buffers to map is split between mpage_add_bh_to_extent() and
add_page_bufs_to_extent(). Move the whole logic to
mpage_add_bh_to_extent() which makes things a bit more straightforward
and make following i_size fixes easier.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-08-17 09:57:56 -04:00
Jan Kara
7d7345322d ext4: fix warning in ext4_da_update_reserve_space()
reaim workfile.dbase test easily triggers warning in
ext4_da_update_reserve_space():

EXT4-fs warning (device ram0): ext4_da_update_reserve_space:365:
ino 12, allocated 1 with only 0 reserved metadata blocks (releasing 1
blocks with reserved 9 data blocks)

The problem is that (one of) tests creates file and then randomly writes
to it with O_SYNC. That results in writing back pages of the file in
random order so we create extents for written blocks say 0, 2, 4, 6, 8
- this last allocation also allocates new block for extents. Then we
writeout block 1 so we have extents 0-2, 4, 6, 8 and we release
indirect extent block because extents fit in the inode again. Then we
writeout block 10 and we need to allocate indirect extent block again
which triggers the warning because we don't have the reservation
anymore.

Fix the problem by giving back freed metadata blocks resulting from
extent merging into inode's reservation pool.

Signed-off-by: Jan Kara <jack@suse.cz>
2013-08-17 09:36:54 -04:00
Jan Kara
1c8924eb10 quota: provide interface for readding allocated space into reserved space
ext4 needs to convert allocated (metadata) blocks back into blocks
reserved for delayed allocation. Add functions into quota code for
supporting such operation.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-08-17 09:32:32 -04:00
Theodore Ts'o
19883bd965 ext4: avoid reusing recently deleted inodes in no journal mode
In no journal mode, if an inode has recently been deleted, we
shouldn't reuse it right away.  Otherwise it's possible, after an
unclean shutdown, to hit a situation where a recently deleted inode
gets reused for some other purpose before the inode table block has
been written to disk.  However, if the directory entry has been
updated, then the directory entry will be pointing at the old inode
contents.

E2fsck will make sure the file system is consistent after the
unclean shutdown.  However, if the recently deleted inode is a
character mode device, or an inode with the immutable bit set, even
after the file system has been fixed up by e2fsck, it can be
possible for a *.pyc file to be pointing at a character mode
device, and when python tries to open the *.pyc file, Hilarity
Ensues.  We could change all of userspace to be very suspicious
about stat'ing files before opening them, and clearing the
immutable flag if necessary --- or we can just avoid reusing an
inode number if it has been recently deleted.

Google-Bug-Id: 10017573

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-08-16 22:06:55 -04:00
Theodore Ts'o
0e20270454 ext4: allocate delayed allocation blocks before rename
When ext4_rename() overwrites an already existing file, call
ext4_alloc_da_blocks() before starting the journal handle which
actually does the rename, instead of doing this afterwards.  This
improves the likelihood that the contents will survive a crash if an
application replaces a file using the sequence:

1)  write replacement contents to foo.new
2)  <omit fsync of foo.new>
3)  rename foo.new to foo

It is still not a guarantee, since ext4_alloc_da_blocks() is *not*
doing a file integrity sync; this means if foo.new is a very large
file, it may not be completely flushed out to disk.

However, for files smaller than a megabyte or so, any dirty pages
should be flushed out before we do the rename operation, and so at the
next journal commit, the CACHE FLUSH command will make sure al of
these pages are safely on the disk platter.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-08-16 22:06:53 -04:00
Theodore Ts'o
5b61de7575 ext4: start handle at least possible moment when renaming files
In ext4_rename(), don't start the journal handle until the the
directory entries have been successfully looked up.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-08-16 22:06:14 -04:00
Theodore Ts'o
7869a4a6c5 ext4: add support for extent pre-caching
Add a new fiemap flag which forces the all of the extents in an inode
to be cached in the extent_status tree.  This is critically important
when using AIO to a preallocated file, since if we need to read in
blocks from the extent tree, the io_submit(2) system call becomes
synchronous, and the AIO is no longer "A", which is bad.

In addition, for most files which have an external leaf tree block,
the cost of caching the information in the extent status tree will be
less than caching the entire 4k block in the buffer cache.  So it is
generally a win to keep the extent information cached.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-08-16 22:05:14 -04:00
Theodore Ts'o
107a7bd31a ext4: cache all of an extent tree's leaf block upon reading
When we read in an extent tree leaf block from disk, arrange to have
all of its entries cached.  In nearly all cases the in-memory
representation will be more compact than the on-disk representation in
the buffer cache, and it allows us to get the information without
having to traverse the extent tree for successive extents.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
2013-08-16 21:23:41 -04:00
Theodore Ts'o
3be78c7317 ext4: use unsigned int for es_status values
Don't use an unsigned long long for the es_status flags; this requires
that we pass 64-bit values around which is painful on 32-bit systems.
Instead pass the extent status flags around using the low 4 bits of an
unsigned int, and shift them into place when we are reading or writing
es_pblk.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
2013-08-16 21:22:41 -04:00
Theodore Ts'o
c349179b48 ext4: print the block number of invalid extent tree blocks
When we find an invalid extent tree block, report the block number of
the bad block for debugging purposes.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
2013-08-16 21:21:41 -04:00
Theodore Ts'o
7d7ea89e75 ext4: refactor code to read the extent tree block
Refactor out the code needed to read the extent tree block into a
single read_extent_tree_block() function.  In addition to simplifying
the code, it also makes sure that we call the ext4_ext_load_extent
tracepoint whenever we need to read an extent tree block from disk.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
2013-08-16 21:20:41 -04:00
Jan Kara
a361293f5f jbd2: Fix oops in jbd2_journal_file_inode()
Commit 0713ed0cde added
jbd2_journal_file_inode() call into ext4_block_zero_page_range().
However that function gets called from truncate path and thus inode
needn't have jinode attached - that happens in ext4_file_open() but
the file needn't be ever open since mount. Calling
jbd2_journal_file_inode() without jinode attached results in the oops.

We fix the problem by attaching jinode to inode also in ext4_truncate()
and ext4_punch_hole() when we are going to zero out partial blocks.

Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-08-16 21:19:41 -04:00
Jan Kara
91aa11fae1 jbd2: Fix use after free after error in jbd2_journal_dirty_metadata()
When jbd2_journal_dirty_metadata() returns error,
__ext4_handle_dirty_metadata() stops the handle. However callers of this
function do not count with that fact and still happily used now freed
handle. This use after free can result in various issues but very likely
we oops soon.

The motivation of adding __ext4_journal_stop() into
__ext4_handle_dirty_metadata() in commit 9ea7a0df seems to be only to
improve error reporting. So replace __ext4_journal_stop() with
ext4_journal_abort_handle() which was there before that commit and add
WARN_ON_ONCE() to dump stack to provide useful information.

Reported-by: Sage Weil <sage@inktank.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org	# 3.2+
2013-08-12 09:53:28 -04:00
Theodore Ts'o
cde2d7a796 ext4: flush the extent status cache during EXT4_IOC_SWAP_BOOT
Previously we weren't swapping only some of the extent_status LRU
fields during the processing of the EXT4_IOC_SWAP_BOOT ioctl.  The
much safer thing to do is to just completely flush the extent status
tree when doing the swap.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Zheng Liu <gnehzuil.liu@gmail.com>
Cc: stable@vger.kernel.org
2013-08-12 09:29:30 -04:00
Piotr Sarna
6ae6514b33 ext4: fix mount/remount error messages for incompatible mount options
Commit 5688978 ("ext4: improve handling of conflicting mount options")
introduced incorrect messages shown while choosing wrong mount options.

First of all, both cases of incorrect mount options,
"data=journal,delalloc" and "data=journal,dioread_nolock" result in
the same error message.

Secondly, the problem above isn't solved for remount option: the
mismatched parameter is simply ignored.  Moreover, ext4_msg states
that remount with options "data=journal,delalloc" succeeded, which is
not true.

To fix it up, I added a simple check after parse_options() call to
ensure that data=journal and delalloc/dioread_nolock parameters are
not present at the same time.

Signed-off-by: Piotr Sarna <p.sarna@partner.samsung.com>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-08-08 23:02:24 -04:00
Theodore Ts'o
59d9fa5c2e ext4: allow the mount options nodelalloc and data=journal
Commit 26092bf ("ext4: use a table-driven handler for mount options")
wrongly disallows the specifying the mount options nodelalloc and
data=journal simultaneously.  This is incorrect; it should have only
disallowed the combination of delalloc and data=journal
simultaneously.

Reported-by: Piotr Sarna <p.sarna@partner.samsung.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-08-08 23:01:24 -04:00
Zheng Liu
44fb851dfb ext4: add WARN_ON to check the length of allocated blocks
In commit 921f266b: ext4: add self-testing infrastructure to do a
sanity check, some sanity checks were added in map_blocks to make sure
'retval == map->m_len'.

Enable these checks by default and report any assertion failures using
ext4_warning() and WARN_ON() since they can help us to figure out some
bugs that are otherwise hard to hit.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-07-29 12:51:42 -04:00
Theodore Ts'o
94eec0fc35 ext4: fix retry handling in ext4_ext_truncate()
We tested for ENOMEM instead of -ENOMEM.   Oops.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-07-29 12:12:56 -04:00
Eric Sandeen
dd12ed144e ext4: destroy ext4_es_cachep on module unload
Without this, module can't be reloaded.

[  500.521980] kmem_cache_sanity_check (ext4_extent_status): Cache name already exists.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org  # v3.8+
2013-07-26 15:21:11 -04:00
Theodore Ts'o
a34eb50374 ext4: make sure group number is bumped after a inode allocation race
When we try to allocate an inode, and there is a race between two
CPU's trying to grab the same inode, _and_ this inode is the last free
inode in the block group, make sure the group number is bumped before
we continue searching the rest of the block groups.  Otherwise, we end
up searching the current block group twice, and we end up skipping
searching the last block group.  So in the unlikely situation where
almost all of the inodes are allocated, it's possible that we will
return ENOSPC even though there might be free inodes in that last
block group.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
2013-07-26 15:15:46 -04:00
Linus Torvalds
3b2f64d00c Linux 3.11-rc2 2013-07-21 12:05:29 -07:00
Linus Torvalds
ea45ea70b6 ACPI video support fixes for 3.11
- Change from Aaron Lu makes ACPICA export a variable which can be
   used by driver code to determine whether or not the BIOS believes
   that we are compatible with Windows 8.
 
 - Change from Matthew Garrett makes the ACPI video driver initialize
   the ACPI backlight even if it is not going to be used afterward
   (that is needed for backlight control to work on Thinkpads).
 
 - Fix from Rafael J Wysocki implements Windows 8 backlight support
   workaround making i915 take over bakclight control if the firmware
   thinks it's dealing with Windows 8.  Based on the work of multiple
   developers including Matthew Garrett, Chun-Yi Lee, Seth Forshee,
   and Aaron Lu.
 
 - Fix from Aaron Lu makes the kernel follow Windows 8 by informing
   the firmware through the _DOS method that it should not carry out
   automatic brightness changes, so that brightness can be controlled
   by GUI.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJR6yY7AAoJEKhOf7ml8uNsbGUQAIVXwX8HF+9AOnqEIQYEaBiF
 HqfLDtHS5qobraK06auF/YmVaA17RdUnHssTuGiEbtIxpiUbuLPJaecZ9BeAf0Pz
 V4Y2IxF27aF9TDZrzkZXHcnYflzQ/kxj36eR9AmM2vSXmKZKxhfqLMeihVh2GgMv
 IlOs9PltK2GNX6C/CzjUQuUj4TYw8yxXsG93Gta0Z8scmxR7mpq9a9d0cPU/TjN/
 iatIhZLFU8ujp8xFiG9MDeG948AtperLu3g0v1D4mPnkmDJTuyMuE3FiioKL2zMY
 7jh6mDPkWUYdjdZkJcmyzgKZh5lAlZIJTZnJV6TrW5fjIpUz5F8XeD4ArMVU/u+A
 smro6XFcpgToRZTtmaEuraxzJHCS44FTjlXyH01FSIiN/Ll6YKyxDYsAzz4Z2sf6
 X5BJofAAiRelZh/o1MaMQzs3QeTUo44TaboGr2zka0cQ37KK9+8YRGYqcWo/BvGs
 sicgFKMeA6SANxHnCVNTslQzMYFrPp4yyMEu4gD7EE+U7cG6FSVhVHQjjTO9CmBd
 ZnX2EDX0Uy+oHTQ9BkyjWAD7rXF6StnOm37FPWHNZ+HnplHEoQQAn+vXsSfl9tbO
 7CPefZ/LQQEo1PZwLLkxruZ67NgxOd8/9I/aVjLUKgd8CDjez0UJVJ65gIXl1J4V
 kvaDy6faYTnUF8h/AYJf
 =vlwF
 -----END PGP SIGNATURE-----

Merge tag 'acpi-video-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI video support fixes from Rafael Wysocki:
 "I'm sending a separate pull request for this as it may be somewhat
  controversial.  The breakage addressed here is not really new and the
  fixes may not satisfy all users of the affected systems, but we've had
  so much back and forth dance in this area over the last several weeks
  that I think it's time to actually make some progress.

  The source of the problem is that about a year ago we started to tell
  BIOSes that we're compatible with Windows 8, which we really need to
  do, because some systems shipping with Windows 8 are tested with it
  and nothing else, so if we tell their BIOSes that we aren't compatible
  with Windows 8, we expose our users to untested BIOS/AML code paths.

  However, as it turns out, some Windows 8-specific AML code paths are
  not tested either, because Windows 8 actually doesn't use the ACPI
  methods containing them, so if we declare Windows 8 compatibility and
  attempt to use those ACPI methods, things break.  That occurs mostly
  in the backlight support area where in particular the _BCM and _BQC
  methods are plain unusable on some systems if the OS declares Windows
  8 compatibility.

  [ The additional twist is that they actually become usable if the OS
    says it is not compatible with Windows 8, but that may cause
    problems to show up elsewhere ]

  Investigation carried out by Matthew Garrett indicates that what
  Windows 8 does about backlight is to leave backlight control up to
  individual graphics drivers.  At least there's evidence that it does
  that if the Intel graphics driver is used, so we've decided to follow
  Windows 8 in that respect and allow i915 to control backlight (Daniel
  likes that part).

  The first commit from Aaron Lu makes ACPICA export the variable from
  which we can infer whether or not the BIOS believes that we are
  compatible with Windows 8.

  The second commit from Matthew Garrett prepares the ACPI video driver
  by making it initialize the ACPI backlight even if it is not going to
  be used afterward (that is needed for backlight control to work on
  Thinkpads).

  The third commit implements the actual workaround making i915 take
  over backlight control if the firmware thinks it's dealing with
  Windows 8 and is based on the work of multiple developers, including
  Matthew Garrett, Chun-Yi Lee, Seth Forshee, and Aaron Lu.

  The final commit from Aaron Lu makes us follow Windows 8 by informing
  the firmware through the _DOS method that it should not carry out
  automatic brightness changes, so that brightness can be controlled by
  GUI.

  Hopefully, this approach will allow us to avoid using blacklists of
  systems that should not declare Windows 8 compatibility just to avoid
  backlight control problems in the future.

   - Change from Aaron Lu makes ACPICA export a variable which can be
     used by driver code to determine whether or not the BIOS believes
     that we are compatible with Windows 8.

   - Change from Matthew Garrett makes the ACPI video driver initialize
     the ACPI backlight even if it is not going to be used afterward
     (that is needed for backlight control to work on Thinkpads).

   - Fix from Rafael J Wysocki implements Windows 8 backlight support
     workaround making i915 take over bakclight control if the firmware
     thinks it's dealing with Windows 8.  Based on the work of multiple
     developers including Matthew Garrett, Chun-Yi Lee, Seth Forshee,
     and Aaron Lu.

   - Fix from Aaron Lu makes the kernel follow Windows 8 by informing
     the firmware through the _DOS method that it should not carry out
     automatic brightness changes, so that brightness can be controlled
     by GUI"

* tag 'acpi-video-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / video: no automatic brightness changes by win8-compatible firmware
  ACPI / video / i915: No ACPI backlight if firmware expects Windows 8
  ACPI / video: Always call acpi_video_init_brightness() on init
  ACPICA: expose OSI version
2013-07-21 10:11:04 -07:00
Linus Torvalds
90db76e829 Fix regression caused by commit af51a2ac36 which added ->tmpfile()
support (along with a similar fix for ext3)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJR60efAAoJENNvdpvBGATwLnEQAMABvjWRdJ/u8mSVm/j5K17j
 y8+PqdZslLBq7NEfcpQBfeajXC0ouQETVWs6HhE9LeZcfgSQdqkR7/2fPJp/QJ7h
 eY/DUykNavuLVfWgvAkK3DyJz6WK+m2H8sBlqTblqVDsrVPtGytPxpPiGmzbcRlU
 9aKoJcGkR7YtYWy6lY/tPDHUYByHE54LWER/fSdrLrOu0F5h7bLTGImZclLRM8wO
 kWmjXjDVsgpBNGpn/6eNSaqWGXnECbPcE+b+DPNKEYhU0WnzvgPY0Xw2fV2E9+IK
 0eEp7Z248u+mxW5Na9V8SZ8EpSGTQvm7hpqHnDPQGwGNn89j4w9n7qX8IoeUKUf0
 0Rk72FHBXeDh3juaQekytrzXlHescpiNwYNlDQNmfezsDotHvq5h+29GGoI+Cv8X
 Tt5JoBsexoUJzHjRReAegTjR0B+MTwqmGSx7KH5Rm02KKzWjKXzKcUsefztEktDw
 Z4E8cKMG+rUGBgCGckjiBX00OMjlmph8PomXligCY9aaI2f5F1ZjXgnaYpCDq6ez
 xU16lxvhI9eWzAnn7kIA1I/6q2DRzLxjaTKYsqFm0Jf9sLILumPpusGd2sgq7BBD
 NAo8nC5Nuhq5EB4Ik0KQ/F+BUAkKRI6UB5zkC1sfmZFB6AfheO4HQ2Qp5PWafCJ5
 P6h0IDbrVZnF3c+4hW/m
 =CV0e
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext[34] tmpfile bugfix from Ted Ts'o:
 "Fix regression caused by commit af51a2ac36 which added ->tmpfile()
  support (along with a similar fix for ext3)"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext3: fix a BUG when opening a file with O_TMPFILE flag
  ext4: fix a BUG when opening a file with O_TMPFILE flag
2013-07-20 20:11:42 -07:00
Zheng Liu
dda5690def ext3: fix a BUG when opening a file with O_TMPFILE flag
When we try to open a file with O_TMPFILE flag, we will trigger a bug.
The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and
this check always fails because we set ->i_nlink = 1 in
inode_init_always().  We can use the following program to trigger it:

int main(int argc, char *argv[])
{
	int fd;

	fd = open(argv[1], O_TMPFILE, 0666);
	if (fd < 0) {
		perror("open ");
		return -1;
	}
	close(fd);
	return 0;
}

The oops message looks like this:

kernel: kernel BUG at fs/ext3/namei.c:1992!
kernel: invalid opcode: 0000 [#1] SMP
kernel: Modules linked in: ext4 jbd2 crc16 cpufreq_ondemand ipv6 dm_mirror dm_region_hash dm_log dm_mod parport_pc parport serio_raw sg dcdbas pcspkr i2c_i801 ehci_pci ehci_hcd button acpi_cpufreq mperf e1000e ptp pps_core ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core ext3 jbd sd_mod ahci libahci libata scsi_mod uhci_hcd
kernel: CPU: 0 PID: 2882 Comm: tst_tmpfile Not tainted 3.11.0-rc1+ #4
kernel: Hardware name: Dell Inc. OptiPlex 780 /0V4W66, BIOS A05 08/11/2010
kernel: task: ffff880112d30050 ti: ffff8801124d4000 task.ti: ffff8801124d4000
kernel: RIP: 0010:[<ffffffffa00db5ae>] [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3]
kernel: RSP: 0018:ffff8801124d5cc8  EFLAGS: 00010202
kernel: RAX: 0000000000000000 RBX: ffff880111510128 RCX: ffff8801114683a0
kernel: RDX: 0000000000000000 RSI: ffff880111510128 RDI: ffff88010fcf65a8
kernel: RBP: ffff8801124d5d18 R08: 0080000000000000 R09: ffffffffa00d3b7f
kernel: R10: ffff8801114683a0 R11: ffff8801032a2558 R12: 0000000000000000
kernel: R13: ffff88010fcf6800 R14: ffff8801032a2558 R15: ffff8801115100d8
kernel: FS:  00007f5d172b5700(0000) GS:ffff880117c00000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
kernel: CR2: 00007f5d16df15d0 CR3: 0000000110b1d000 CR4: 00000000000407f0
kernel: Stack:
kernel: 000000000000000c ffff8801048a7dc8 ffff8801114685a8 ffffffffa00b80d7
kernel: ffff8801124d5e38 ffff8801032a2558 ffff88010ce24d68 0000000000000000
kernel: ffff88011146b300 ffff8801124d5d44 ffff8801124d5d78 ffffffffa00db7e1
kernel: Call Trace:
kernel: [<ffffffffa00b80d7>] ? journal_start+0x8c/0xbd [jbd]
kernel: [<ffffffffa00db7e1>] ext3_tmpfile+0xb2/0x13b [ext3]
kernel: [<ffffffff821076f8>] path_openat+0x11f/0x5e7
kernel: [<ffffffff821c86b4>] ? list_del+0x11/0x30
kernel: [<ffffffff82065fa2>] ?  __dequeue_entity+0x33/0x38
kernel: [<ffffffff82107cd5>] do_filp_open+0x3f/0x8d
kernel: [<ffffffff82112532>] ? __alloc_fd+0x50/0x102
kernel: [<ffffffff820f9296>] do_sys_open+0x13b/0x1cd
kernel: [<ffffffff820f935c>] SyS_open+0x1e/0x20
kernel: [<ffffffff82398c02>] system_call_fastpath+0x16/0x1b
kernel: Code: 39 c7 0f 85 67 01 00 00 0f b7 03 25 00 f0 00 00 3d 00 40 00 00 74 18 3d 00 80 00 00 74 11 3d 00 a0 00 00 74 0a 83 7b 48 00 74 04 <0f> 0b eb fe 49 8b 85 50 03 00 00 4c 89 f6 48 c7 c7 c0 99 0e a0
kernel: RIP  [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3]
kernel: RSP <ffff8801124d5cc8>

Here we couldn't call clear_nlink() directly because in d_tmpfile() we
will call inode_dec_link_count() to decrease ->i_nlink.  So this commit
tries to call d_tmpfile() before ext4_orphan_add() to fix this problem.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
2013-07-20 22:03:20 -04:00
Zheng Liu
e94bd3490f ext4: fix a BUG when opening a file with O_TMPFILE flag
When we try to open a file with O_TMPFILE flag, we will trigger a bug.
The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and
this check always fails because we set ->i_nlink = 1 in
inode_init_always().  We can use the following program to trigger it:

int main(int argc, char *argv[])
{
	int fd;

	fd = open(argv[1], O_TMPFILE, 0666);
	if (fd < 0) {
		perror("open ");
		return -1;
	}
	close(fd);
	return 0;
}

The oops message looks like this:

kernel BUG at fs/ext4/namei.c:2572!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: dlci bridge stp hidp cmtp kernelcapi l2tp_ppp l2tp_netlink l2tp_core sctp libcrc32c rfcomm tun fuse nfnetli
nk can_raw ipt_ULOG can_bcm x25 scsi_transport_iscsi ipx p8023 p8022 appletalk phonet psnap vmw_vsock_vmci_transport af_key vmw_vmci rose vsock atm can netrom ax25 af_rxrpc ir
da pppoe pppox ppp_generic slhc bluetooth nfc rfkill rds caif_socket caif crc_ccitt af_802154 llc2 llc snd_hda_codec_realtek snd_hda_intel snd_hda_codec serio_raw snd_pcm pcsp
kr edac_core snd_page_alloc snd_timer snd soundcore r8169 mii sr_mod cdrom pata_atiixp radeon backlight drm_kms_helper ttm
CPU: 1 PID: 1812571 Comm: trinity-child2 Not tainted 3.11.0-rc1+ #12
Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
task: ffff88007dfe69a0 ti: ffff88010f7b6000 task.ti: ffff88010f7b6000
RIP: 0010:[<ffffffff8125ce69>]  [<ffffffff8125ce69>] ext4_orphan_add+0x299/0x2b0
RSP: 0018:ffff88010f7b7cf8  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8800966d3020 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88007dfe70b8 RDI: 0000000000000001
RBP: ffff88010f7b7d40 R08: ffff880126a3c4e0 R09: ffff88010f7b7ca0
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801271fd668
R13: ffff8800966d2f78 R14: ffff88011d7089f0 R15: ffff88007dfe69a0
FS:  00007f70441a3740(0000) GS:ffff88012a800000(0000) knlGS:00000000f77c96c0
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000002834000 CR3: 0000000107964000 CR4: 00000000000007e0
DR0: 0000000000780000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Stack:
 0000000000002000 00000020810b6dde 0000000000000000 ffff88011d46db00
 ffff8800966d3020 ffff88011d7089f0 ffff88009c7f4c10 ffff88010f7b7f2c
 ffff88007dfe69a0 ffff88010f7b7da8 ffffffff8125cfac ffff880100000004
Call Trace:
 [<ffffffff8125cfac>] ext4_tmpfile+0x12c/0x180
 [<ffffffff811cba78>] path_openat+0x238/0x700
 [<ffffffff8100afc4>] ? native_sched_clock+0x24/0x80
 [<ffffffff811cc647>] do_filp_open+0x47/0xa0
 [<ffffffff811db73f>] ? __alloc_fd+0xaf/0x200
 [<ffffffff811ba2e4>] do_sys_open+0x124/0x210
 [<ffffffff81010725>] ? syscall_trace_enter+0x25/0x290
 [<ffffffff811ba3ee>] SyS_open+0x1e/0x20
 [<ffffffff816ca8d4>] tracesys+0xdd/0xe2
 [<ffffffff81001001>] ? start_thread_common.constprop.6+0x1/0xa0
Code: 04 00 00 00 89 04 24 31 c0 e8 c4 77 04 00 e9 43 fe ff ff 66 25 00 d0 66 3d 00 80 0f 84 0e fe ff ff 83 7b 48 00 0f 84 04 fe ff ff <0f> 0b 49 8b 8c 24 50 07 00 00 e9 88 fe ff ff 0f 1f 84 00 00 00

Here we couldn't call clear_nlink() directly because in d_tmpfile() we
will call inode_dec_link_count() to decrease ->i_nlink.  So this commit
tries to call d_tmpfile() before ext4_orphan_add() to fix this problem.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Tested-by: Darrick J. Wong <darrick.wong@oracle.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-20 21:58:38 -04:00
Linus Torvalds
f6a0d9d585 Staging tree fixes for 3.11-rc2
Here are a few iio driver fixes for 3.11-rc2.  They are still spread
 across drivers/iio and drivers/staging/iio so they are coming in through
 this tree.
 
 I've also removed the drivers/staging/csr/ driver as the developers who
 originally sent it to me have moved on to other companies, and CSR still
 will not send us the specs for the device, making the driver pretty much
 obsolete and impossible to fix up.  Deleting it now prevents people from
 sending in lots of tiny codingsyle fixes that will never go anywhere.
 
 It also helps to offset the large lustre filesystem merge that happened
 in 3.11-rc1 in the overall 3.11.0 diffstat. :)
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.20 (GNU/Linux)
 
 iEYEABECAAYFAlHq64AACgkQMUfUDdst+ynysQCgwUM8pbZ7iJGDJAW2TahKwVie
 f5MAnRR8DokyE7iWwXDEx5UYVPerMrpq
 =9ORA
 -----END PGP SIGNATURE-----

Merge tag 'staging-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging tree fixes from Greg KH:
 "Here are a few iio driver fixes for 3.11-rc2.  They are still spread
  across drivers/iio and drivers/staging/iio so they are coming in
  through this tree.

  I've also removed the drivers/staging/csr/ driver as the developers
  who originally sent it to me have moved on to other companies, and CSR
  still will not send us the specs for the device, making the driver
  pretty much obsolete and impossible to fix up.  Deleting it now
  prevents people from sending in lots of tiny codingsyle fixes that
  will never go anywhere.

  It also helps to offset the large lustre filesystem merge that
  happened in 3.11-rc1 in the overall 3.11.0 diffstat.  :)"

* tag 'staging-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: csr: remove driver
  iio: lps331ap: Fix wrong in_pressure_scale output value
  iio staging: fix lis3l02dq, read error handling
  staging:iio:ad7291: add missing .driver_module to struct iio_info
  iio: ti_am335x_adc: add missing .driver_module to struct iio_info
  iio: mxs-lradc: Remove useless check in read_raw
  iio: mxs-lradc: Fix misuse of iio->trig
  iio: inkern: fix iio_convert_raw_to_processed_unlocked
  iio: Fix iio_channel_has_info
  iio:trigger: device_unregister->device_del to avoid double free
  iio: dac: ad7303: fix error return code in ad7303_probe()
2013-07-20 15:42:38 -07:00
Linus Torvalds
36231d255b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "The sget() one is a long-standing bug and will need to go into -stable
  (in fact, it had been originally caught in RHEL6), the other two are
  3.11-only"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: constify dentry parameter in d_count()
  livelock avoidance in sget()
  allow O_TMPFILE to work with O_WRONLY
2013-07-20 10:50:01 -07:00
Linus Torvalds
19bf1c2c7b Fixes for 3.11-rc2, sent at 5pm, in the professoinal style. :-)
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJR6cmlAAoJENNvdpvBGATwZF0P/0a7ET511UJwQbgAIq5ftFlj
 86Bzvy28xo2T85t64L+Ib2XDehWHk0sZlQpB/gK8MLYn4rCRWCxkQAshKwoequsC
 AhuvQ7NtX9vJNCSR30+RrLhkvj6UKsMuM724adARLBUgMBoScABzZImR1e14ELah
 bN27a4Bk2aNUpNX68QYdQX3TGiHGZy//lNmh81JTxFS3Moqm6bIZAJbYpOslATsI
 Q5nti/TjQJKso2gF7Jx7NffXv0g5rGxaVQEZJPpfIv1Vs0b6vabK/sYp608ayM0K
 qKyjJABaHR1Pzb16V82ZqvSlsHm/ARhCF1nMM6gQ8nwl/plxcQ6Jvd/qJsNej3b/
 7Jfm86xLe+G0G5oeNEJXsoEFAsvxug6ZRMfyoRHaPlGIksmz+Jc9kzTtM3qzdzOB
 5OPJwlONlM4dRVA6rgb7KiuE3h/sRt4CctFejD0f6mUqKa+B+zyHq/a/8a+60IqQ
 /sDiTQrqrI6LWxECFasDNoGxtnvVtKC21jbg+MTzumZDvjgnJIFFe5NrinI6SB9x
 VQYVq/vVkE576VTwGAttTg3s4sRwQKd/iuQjuoP76iFFHvq/sNX6fBq0NW5gpsj2
 WAfH+fLQsMcVJ2MAcc3DwdBT1wQbLu+Y19hv4TDOZRmnKGhq9K08hzWR4tIUKdFJ
 UcjWk35Wuoz1IGpVlHJ5
 =ngfz
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 bugfixes from Ted Ts'o:
 "Fixes for 3.11-rc2, sent at 5pm, in the professoinal style.  :-)"

I'm not sure I like this new level of "professionalism".
9-5, people, 9-5.

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: call ext4_es_lru_add() after handling cache miss
  ext4: yield during large unlinks
  ext4: make the extent_status code more robust against ENOMEM failures
  ext4: simplify calculation of blocks to free on error
  ext4: fix error handling in ext4_ext_truncate()
2013-07-20 10:48:59 -07:00
Linus Torvalds
3be542d464 NFS client bugfixes for 3.11
- Fix a regression against NFSv4 FreeBSD servers when creating a new file
 - Fix another regression in rpc_client_register()
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJR6ZKMAAoJEGcL54qWCgDyQX8P/19LKLNKcL+y2zVGjLbXMTq0
 TpyWdBO0ux7QcqnPEDg+Jpvu62IowYiKTtaSOXtHb5BNjQMBo2RKw3B0eMBoCp/z
 6gHmQRD2hMgqwBxBwHceV+dNwueCUiZW7GqaaNh6/3bpGQefegdONnLEifuPogEu
 oZmEuiVrGDfITEF7D4k5+shXCQN4eNH0LFuIQo4XXdCqmK6PwvOsidZ7YwHVC3Mg
 /Jzda2YsCxHj8kPi1xb9skPPAn6g4kdfYfyr/xSY7IviPixrkg/nEEK1b8xHU81e
 a0dd0Yx5kq6fR8LsBvQCHdj2m7doHM15jf5Np5G7VnnaWEjB2y+QftkxWc9lCNU3
 t2fr9YVD7ZG/GGNSFePHAHmBY0OqDB1Htp4vcwEQfzX6CAR3Hel82WVvut62Z6m4
 G5qHjwdqUFhmRN//SWlDpEqSn+pbeCvPhQS60ayN0TLivRsscm/I4yA75odAnn9b
 4su1IcUpqeJGeV6yDyMUqbx4kYZFyCZg/DNkThXiTKOs47A7ogSS9ev2fTB/V+jd
 rroNHNd/U508ze9D6D4ai9vR78uUp4wKNSSBZMCkBtNh0uSApOTgyGVhertB1EKS
 vgAr4T1tc+9t+0qg1Sb+hbKyBM/KaS5zUrPn+APHPoBXPh5PSVBzeNJkpxHRw/V0
 ZxkEgSQKLZSXYb5ab770
 =XE+7
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 - Fix a regression against NFSv4 FreeBSD servers when creating a new
   file
 - Fix another regression in rpc_client_register()

* tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4: Fix a regression against the FreeBSD server
  SUNRPC: Fix another issue with rpc_client_register()
2013-07-20 10:48:24 -07:00
Linus Torvalds
90290c4ebe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next
Pull btrfs fixes from Josef Bacik:
 "I'm playing the role of Chris Mason this week while he's on vacation.
  There are a few critical fixes for btrfs here, all regressions and
  have been tested well"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next:
  Btrfs: fix wrong write offset when replacing a device
  Btrfs: re-add root to dead root list if we stop dropping it
  Btrfs: fix lock leak when resuming snapshot deletion
  Btrfs: update drop progress before stopping snapshot dropping
2013-07-20 10:47:38 -07:00
Peng Tao
24924a20da vfs: constify dentry parameter in d_count()
so that it can be used in places like d_compare/d_hash
without causing a compiler warning.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-20 05:06:27 +04:00
Al Viro
acfec9a5a8 livelock avoidance in sget()
Eric Sandeen has found a nasty livelock in sget() - take a mount(2) about
to fail.  The superblock is on ->fs_supers, ->s_umount is held exclusive,
->s_active is 1.  Along comes two more processes, trying to mount the same
thing; sget() in each is picking that superblock, bumping ->s_count and
trying to grab ->s_umount.  ->s_active is 3 now.  Original mount(2)
finally gets to deactivate_locked_super() on failure; ->s_active is 2,
superblock is still ->fs_supers because shutdown will *not* happen until
->s_active hits 0.  ->s_umount is dropped and now we have two processes
chasing each other:
s_active = 2, A acquired ->s_umount, B blocked
A sees that the damn thing is stillborn, does deactivate_locked_super()
s_active = 1, A drops ->s_umount, B gets it
A restarts the search and finds the same superblock.  And bumps it ->s_active.
s_active = 2, B holds ->s_umount, A blocked on trying to get it
... and we are in the earlier situation with A and B switched places.

The root cause, of course, is that ->s_active should not grow until we'd
got MS_BORN.  Then failing ->mount() will have deactivate_locked_super()
shut the damn thing down.  Fortunately, it's easy to do - the key point
is that grab_super() is called only for superblocks currently on ->fs_supers,
so it can bump ->s_count and grab ->s_umount first, then check MS_BORN and
bump ->s_active; we must never increment ->s_count for superblocks past
->kill_sb(), but grab_super() is never called for those.

The bug is pretty old; we would've caught it by now, if not for accidental
exclusion between sget() for block filesystems; the things like cgroup or
e.g. mtd-based filesystems don't have anything of that sort, so they get
bitten.  The right way to deal with that is obviously to fix sget()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-20 04:58:58 +04:00
Al Viro
ba57ea64cb allow O_TMPFILE to work with O_WRONLY
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-20 03:11:32 +04:00
Linus Torvalds
d471ce53b1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML fixes from Richard Weinberger:
 "Special thanks goes to Toralf Föster for continuously testing UML and
  reporting issues!"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: remove dead code
  um: siginfo cleanup
  uml: Fix which_tmpdir failure when /dev/shm is a symlink, and in other edge cases
  um: Fix wait_stub_done() error handling
  um: Mark stub pages mapping with VM_PFNMAP
  um: Fix return value of strnlen_user()
2013-07-19 15:11:09 -07:00
Linus Torvalds
1b05018045 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "MIPS fixes for 3.11.  Half of then is for Netlogic the remainder
  touches things across arch/mips.

  Nothing really dramatic and by rc1 standards MIPS will be in fairly
  good shape with this applied.  Tested by building all MIPS defconfigs
  of which with this pull request four platforms won't build.  And yes,
  it boots also on my favorite test systems"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: kvm: Kconfig: Drop HAVE_KVM dependency from VIRTUALIZATION
  MIPS: Octeon: Fix DT pruning bug with pip ports
  MIPS: KVM: Mark KVM_GUEST (T&E KVM) as BROKEN_ON_SMP
  MIPS: tlbex: fix broken build in v3.11-rc1
  MIPS: Netlogic: Add XLP PIC irqdomain
  MIPS: Netlogic: Fix USB block's coherent DMA mask
  MIPS: tlbex: Fix typo in r3000 tlb store handler
  MIPS: BMIPS: Fix thinko to release slave TP from reset
  MIPS: Delete dead invocation of exception_exit().
2013-07-19 15:10:01 -07:00
Linus Torvalds
89d0abe3d6 - Post -rc1 update to the common reboot infrastructure.
- Fixes (user cache maintenance fault handling, !COMPAT compilation, CPU
   online and interrupt hanlding).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.9 (GNU/Linux)
 
 iQIcBAABAgAGBQJR6WGUAAoJEGvWsS0AyF7xaMQP/3WSpcEUQ+k8wCbjzkbpnQJ3
 Z0Ufz0XeBUgmaZNwYFjnpVTm/R04F1gcsoA5qE//6iMkbcpbM2sWVN/uQvY3l+fQ
 haJmCWZ7PIXxm3vbKrcSBiJ+WKpvUkzlL+Q1upmQWFCxkP6noejCsvazM5lPgt1q
 1Vr5/9q87TOfxG8Udki3pPqRazd4YwQJx6JCZ46P+mlqQk9gxDoQ7RRXy9Viv+SV
 6Jq5Lt8Qj7iXmq8DDTYdF7DHE7rnIhdCUhJu7UN1G/O7U4s9nlUbv55fHMY76ZAn
 BddgNtIsMsvfIxlYZ6f0r2ccrgPHaDTLVW/q0VkwdP8jD2EQWAK7gI21OgQm0ebl
 OL+t29zdx9nidpI+cCTlwejh8i7vRsoxgYki5qfnYR3SHL5HhQSHrUTZvEFr+u4I
 ceXnDmTZ46HmPSCC6/5cFiXxsw1zbBxSB7rNFvXmF2Jr7F3TvAxCWvrIfmrmYdrC
 bw4UMBB15SaJud3maqVGhj6aVo4bEand4g++Dk0ytXbGq+5Ke5CktmPOmzNLVu9p
 R8tHDyp9szFP0eqqF1/XTZx2rsHvop7D1QUpQVgFC+mRWI3gjtFPtShW1GgbQQ+P
 t7gvHJQ2SF1BRTSf6KH/cqXDgHzpg3VuP2moM7YbR0/qvRf7Eh/M3Hrg/s6xCeKr
 ywmW7/+McIz3kDsk1gUM
 =M4kj
 -----END PGP SIGNATURE-----

Merge tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64

Pull arm64 fixes from Catalin Marinas:
 - Post -rc1 update to the common reboot infrastructure.
 - Fixes (user cache maintenance fault handling, !COMPAT compilation,
   CPU online and interrupt hanlding).

* tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
  arm64: use common reboot infrastructure
  arm64: mm: don't treat user cache maintenance faults as writes
  arm64: add '#ifdef CONFIG_COMPAT' for aarch32_break_handler()
  arm64: Only enable local interrupts after the CPU is marked online
2013-07-19 15:08:53 -07:00
Linus Torvalds
89a8c5940d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "An update for the BFP jit to the latest and greatest, two patches to
  get kdump working again, the random-abort ptrace extention for
  transactional execution, the z90crypt module alias for ap and a tiny
  cleanup"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/zcrypt: Alias for new zcrypt device driver base module
  s390/kdump: Allow copy_oldmem_page() copy to virtual memory
  s390/kdump: Disable mmap for s390
  s390/bpf,jit: add pkt_type support
  s390/bpf,jit: address randomize and write protect jit code
  s390/bpf,jit: use generic jit dumper
  s390/bpf,jit: call module_free() from any context
  s390/qdio: remove unused variable
  s390/ptrace: PTRACE_TE_ABORT_RAND
2013-07-19 15:08:12 -07:00
Stefan Behrens
115930cb2d Btrfs: fix wrong write offset when replacing a device
Miao Xie reported the following issue:

The filesystem was corrupted after we did a device replace.

Steps to reproduce:
 # mkfs.btrfs -f -m single -d raid10 <device0>..<device3>
 # mount <device0> <mnt>
 # btrfs replace start -rfB 1 <device4> <mnt>
 # umount <mnt>
 # btrfsck <device4>

The reason for the issue is that we changed the write offset by mistake,
introduced by commit 625f1c8dc.

We read the data from the source device at first, and then write the
data into the corresponding place of the new device. In order to
implement the "-r" option, the source location is remapped using
btrfs_map_block(). The read takes place on the mapped location, and
the write needs to take place on the unmapped location. Currently
the write is using the mapped location, and this commit changes it
back by undoing the change to the write address that the aforementioned
commit added by mistake.

Reported-by: Miao Xie <miaox@cn.fujitsu.com>
Cc: <stable@vger.kernel.org> # 3.10+
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-19 15:07:26 -04:00
Josef Bacik
d29a9f629e Btrfs: re-add root to dead root list if we stop dropping it
If we stop dropping a root for whatever reason we need to add it back to the
dead root list so that we will re-start the dropping next transaction commit.
The other case this happens is if we recover a drop because we will add a root
without adding it to the fs radix tree, so we can leak it's root and commit root
extent buffer, adding this to the dead root list makes this cleanup happen.
Thanks,

Cc: stable@vger.kernel.org
Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-19 15:07:19 -04:00
Josef Bacik
fec386ac14 Btrfs: fix lock leak when resuming snapshot deletion
We aren't setting path->locks[level] when we resume a snapshot deletion which
means we won't unlock the buffer when we free the path.  This causes deadlocks
if we happen to re-allocate the block before we've evicted the extent buffer
from cache.  Thanks,

Cc: stable@vger.kernel.org
Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-19 15:07:11 -04:00
Josef Bacik
3c8f242257 Btrfs: update drop progress before stopping snapshot dropping
Alex pointed out a problem and fix that exists in the drop one snapshot at a
time patch.  If we decide we need to exit for whatever reason (umount for
example) we will just exit the snapshot dropping without updating the drop
progress.  So the next time we go to resume we will BUG_ON() because we can't
find the extent we left off at because we never updated it.  This patch fixes
the problem.

Cc: stable@vger.kernel.org
Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-07-19 15:07:03 -04:00
Linus Torvalds
b8a33fc725 Fix for AMD processors.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJR6Wm5AAoJEBvWZb6bTYbykGQP/Rg4fF+p7v8eko9dNKJWAqfq
 /RQ9mSIRKpwD+kZogPtrBNH/LKwSqaJCtciSp57d+cD9w/9S1pN7eu1nDwSl2RA6
 EyOVJ/Pd4JCdaRj+yGA3cjs7tEN3SvDk1BHHAxq830ZS5ekuXRw2QC+gBC2hI/Zq
 g442m2ELckcdQXwFcj+FuK4usxhkOL7MT5L64lBF962bAY87L7VL0XkFq9RZCp1B
 mJJd5uAok7gNl4LHH/Gu5DbuQTvY33ijJiGl9lT0CXwP9bmZ0WTA9lR2poy8P7eM
 R6qBuXaHN4qBm7tdRqPBsQ+S1+yr+JHRTCzYU7UsiXWg/fxu1QX2L77Q8ZbjnhNW
 rnKOSVlc3j11b594ygX9uc5Elgr2cl9OBBFdoDDormBo9A3Kp9pGkjbJnl1tunPi
 oA4Otx8GwxNsnL7aBWXEoPm70peQFDOAKNK+5p/PBasXsJrGCSqCc2x2+RpaveqR
 HhE2z0eMaCpC8ayPofMsP0nIu/Cf53m2Fr6BHZqea2KBz57WAhXOuSe4vzXm3nXg
 xBHBpPoNtBzTYjmWJ7vdlLGgzsMkZTOsRmppQT7wwrwGEjFCOV4kwHWOBsxNS5Z8
 4eWyRFPLJP4dGRj2FoMZNfMM7/6XpQz5JttzxHXWwpua+AnQ4+ay4A/6FxiNMm0a
 hRB9ev+0X2/9Qj3+GFIU
 =+0Dk
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fix from Paolo Bonzini:
 "This single patch fixes a regression caused by one of the
  optimizations introduced in 3.11, which is generally visible only on
  AMD processors"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: MMU: avoid fast page fault fixing mmio page fault
2013-07-19 10:17:12 -07:00
Linus Torvalds
b7356abb9f Power management and ACPI fixes for 3.11-rc2
- Two cpufreq commits from the 3.10 cycle introduced regressions.
   The first of them was buggy (it did way much more than it needed
   to do) and the second one attempted to fix an issue introduced by
   the first one.  Fixes from Srivatsa S Bhat revert both.
 
 - If autosleep triggers during system shutdown and the shutdown
   callbacks of some device drivers have been called already, it may
   crash the system.  Fix from Liu Shuo prevents that from happening
   by making try_to_suspend() check system_state.
 
 - The ACPI memory hotplug driver doesn't clear its driver_data on
   errors which may cause a NULL poiter dereference to happen later.
   Fix from Toshi Kani.
 
 - The ACPI namespace scanning code should not try to attach scan
   handlers to device objects that have them already, which may confuse
   things quite a bit, and it should rescan the whole namespace branch
   starting at the given node after receiving a bus check notify event
   even if the device at that particular node has been discovered
   already.  Fixes from Rafael J Wysocki.
 
 - New ACPI video blacklist entry for a system whose initial backlight
   setting from the BIOS doesn't make sense.  From Lan Tianyu.
 
 - Garbage string output avoindance for ACPI PNP from Liu Shuo.
 
 - Two Kconfig fixes for issues introduced recently in the s3c24xx
   cpufreq driver (when moving the driver to drivers/cpufreq) from
   Paul Bolle.
 
 - Trivial comment fix in pm_wakeup.h from Chanwoo Choi.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJR6Sq+AAoJEKhOf7ml8uNsrGQP/0HRDW+QmTGM8znDTHXngbn9
 X3pqlpjEOiCtcmJaSJlD7GwLHMscwWcHKEezteaZ7KUI4mcnysJX6EY5YVbNriDC
 xlLcDQn9c6Xx1maCSfp+WMygvqItxZwuc8veRjrT3XtOfCNWS/FlX40Voh63BCAe
 GbfQ/HesmUg5CKplyD8/XypLWh5OFXmHzCe8IhrKGfhsZukXdSgSBjwQZMRrEMsQ
 kJjDCF8zUu0JisiWqL+xE6IFSKme9i6LBlHpzU0Y1g4RqAqkIbuS0Z3vezOYzoTD
 oZjBNa9XAgCS3x0l5g3G0ChgDAU+Mpji/imXA7JysrwbirGFbtPHtQYh2HzpAtnw
 Hkah/0ocBM7/w7VTsUQiRsFPdIJTCBLlm6J38x8yh7n84h4nJgOpK69dBLrMwCUZ
 f3kid6KIPVLBvnC3QSULrCAKUcUcVVWYtNho+sfXBMjP+cPwTmc3DvATnpru6twa
 0KjR5o585UOcciq7EWAoMrCFCfZYF5C4XGaZAxHI/SWooxeCQH84S8vfNLL2epVC
 ixmLYo4X2ANDsnfbUV+ewhB0/L2905Et6NhPUgPD/1rm15MEZbowbB2K0pzr0QL9
 /1hTL61InXx3jLxducJJFKN+HZ0zfDQdTkyafKrR9jb+GsdmnzYJ/vnfDG8MfPjp
 GZ281YBqVmUeYJh5CPU+
 =IUmn
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
 "These are fixes collected over the last week, most importnatly two
  cpufreq reverts fixing regressions introduced in 3.10, an autoseelp
  fix preventing systems using it from crashing during shutdown and two
  ACPI scan fixes related to hotplug.

  Specifics:

   - Two cpufreq commits from the 3.10 cycle introduced regressions.
     The first of them was buggy (it did way much more than it needed to
     do) and the second one attempted to fix an issue introduced by the
     first one.  Fixes from Srivatsa S Bhat revert both.

   - If autosleep triggers during system shutdown and the shutdown
     callbacks of some device drivers have been called already, it may
     crash the system.  Fix from Liu Shuo prevents that from happening
     by making try_to_suspend() check system_state.

   - The ACPI memory hotplug driver doesn't clear its driver_data on
     errors which may cause a NULL poiter dereference to happen later.
     Fix from Toshi Kani.

   - The ACPI namespace scanning code should not try to attach scan
     handlers to device objects that have them already, which may
     confuse things quite a bit, and it should rescan the whole
     namespace branch starting at the given node after receiving a bus
     check notify event even if the device at that particular node has
     been discovered already.  Fixes from Rafael J Wysocki.

   - New ACPI video blacklist entry for a system whose initial backlight
     setting from the BIOS doesn't make sense.  From Lan Tianyu.

   - Garbage string output avoindance for ACPI PNP from Liu Shuo.

   - Two Kconfig fixes for issues introduced recently in the s3c24xx
     cpufreq driver (when moving the driver to drivers/cpufreq) from
     Paul Bolle.

   - Trivial comment fix in pm_wakeup.h from Chanwoo Choi"

* tag 'pm+acpi-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / video: ignore BIOS initial backlight value for Fujitsu E753
  PNP / ACPI: avoid garbage in resource name
  cpufreq: Revert commit 2f7021a8 to fix CPU hotplug regression
  cpufreq: s3c24xx: fix "depends on ARM_S3C24XX" in Kconfig
  cpufreq: s3c24xx: rename CONFIG_CPU_FREQ_S3C24XX_DEBUGFS
  PM / Sleep: Fix comment typo in pm_wakeup.h
  PM / Sleep: avoid 'autosleep' in shutdown progress
  cpufreq: Revert commit a66b2e to fix suspend/resume regression
  ACPI / memhotplug: Fix a stale pointer in error path
  ACPI / scan: Always call acpi_bus_scan() for bus check notifications
  ACPI / scan: Do not try to attach scan handlers to devices having them
2013-07-19 09:59:06 -07:00