Fix the callback jffs2 passes to read_cache_page to actually have the
proper type expected. Casting around function pointers can easily hide
typing bugs, and defeats control flow protection.
Link: http://lkml.kernel.org/r/20190520055731.24538-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hi Linus,
This is my very first pull-request. I've been working full-time as
a kernel developer for more than two years now. During this time I've
been fixing bugs reported by Coverity all over the tree and, as part
of my work, I'm also contributing to the KSPP. My work in the kernel
community has been supervised by Greg KH and Kees Cook.
OK. So, after the quick introduction above, please, pull the following
patches that mark switch cases where we are expecting to fall through.
These patches are part of the ongoing efforts to enable -Wimplicit-fallthrough.
They have been ignored for a long time (most of them more than 3 months,
even after pinging multiple times), which is the reason why I've created
this tree. Most of them have been baking in linux-next for a whole development
cycle. And with Stephen Rothwell's help, we've had linux-next nag-emails
going out for newly introduced code that triggers -Wimplicit-fallthrough
to avoid gaining more of these cases while we work to remove the ones
that are already present.
I'm happy to let you know that we are getting close to completing this
work. Currently, there are only 32 of 2311 of these cases left to be
addressed in linux-next. I'm auditing every case; I take a look into
the code and analyze it in order to determine if I'm dealing with an
actual bug or a false positive, as explained here:
https://lore.kernel.org/lkml/c2fad584-1705-a5f2-d63c-824e9b96cf50@embeddedor.com/
While working on this, I've found and fixed the following missing
break/return bugs, some of them introduced more than 5 years ago:
84242b82d87850b51b6c5e420fe63509186e5034b5be8531817264235ee7cc5034a5d2479826cc865340f23df8df997abeeb2f10d82373307b00c5e65d25ff7a54a7ed5b3e7dc24bfa8f21ad0eaee6199ba8376ce1dc586a60a1a8e9b186f14e57562b4860747828eac5b974bee9cc44ba91162c930e3d0a
Once this work is finish, we'll be able to universally enable
"-Wimplicit-fallthrough" to avoid any of these kinds of bugs from
entering the kernel again.
Thanks
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAlzQR2IACgkQRwW0y0cG
2zEJbQ//X930OcBtT/9DRW4XL1Jeq0Mjssz/GLX2Vpup5CwwcTROG65no80Zezf/
yQRWnUjGX0OBv/fmUK32/nTxI/7k7NkmIXJHe0HiEF069GEENB7FT6tfDzIPjU8M
qQkB8NsSUWJs3IH6BVynb/9MGE1VpGBDbYk7CBZRtRJT1RMM+3kQPucgiZMgUBPo
Yd9zKwn4i/8tcOCli++EUdQ29ukMoY2R3qpK4LftdX9sXLKZBWNwQbiCwSkjnvJK
I6FDiA7RaWH2wWGlL7BpN5RrvAXp3z8QN/JZnivIGt4ijtAyxFUL/9KOEgQpBQN2
6TBRhfTQFM73NCyzLgGLNzvd8awem1rKGSBBUvevaPbgesgM+Of65wmmTQRhFNCt
A7+e286X1GiK3aNcjUKrByKWm7x590EWmDzmpmICxNPdt5DHQ6EkmvBdNjnxCMrO
aGA24l78tBN09qN45LR7wtHYuuyR0Jt9bCmeQZmz7+x3ICDHi/+Gw7XPN/eM9+T6
lZbbINiYUyZVxOqwzkYDCsdv9+kUvu3e4rPs20NERWRpV8FEvBIyMjXAg6NAMTue
K+ikkyMBxCvyw+NMimHJwtD7ho4FkLPcoeXb2ZGJTRHixiZAEtF1RaQ7dA05Q/SL
gbSc0DgLZeHlLBT+BSWC2Z8SDnoIhQFXW49OmuACwCUC68NHKps=
=k30z
-----END PGP SIGNATURE-----
Merge tag 'Wimplicit-fallthrough-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull Wimplicit-fallthrough updates from Gustavo A. R. Silva:
"Mark switch cases where we are expecting to fall through.
This is part of the ongoing efforts to enable -Wimplicit-fallthrough.
Most of them have been baking in linux-next for a whole development
cycle. And with Stephen Rothwell's help, we've had linux-next
nag-emails going out for newly introduced code that triggers
-Wimplicit-fallthrough to avoid gaining more of these cases while we
work to remove the ones that are already present.
We are getting close to completing this work. Currently, there are
only 32 of 2311 of these cases left to be addressed in linux-next. I'm
auditing every case; I take a look into the code and analyze it in
order to determine if I'm dealing with an actual bug or a false
positive, as explained here:
https://lore.kernel.org/lkml/c2fad584-1705-a5f2-d63c-824e9b96cf50@embeddedor.com/
While working on this, I've found and fixed the several missing
break/return bugs, some of them introduced more than 5 years ago.
Once this work is finished, we'll be able to universally enable
"-Wimplicit-fallthrough" to avoid any of these kinds of bugs from
entering the kernel again"
* tag 'Wimplicit-fallthrough-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits)
memstick: mark expected switch fall-throughs
drm/nouveau/nvkm: mark expected switch fall-throughs
NFC: st21nfca: Fix fall-through warnings
NFC: pn533: mark expected switch fall-throughs
block: Mark expected switch fall-throughs
ASN.1: mark expected switch fall-through
lib/cmdline.c: mark expected switch fall-throughs
lib: zstd: Mark expected switch fall-throughs
scsi: sym53c8xx_2: sym_nvram: Mark expected switch fall-through
scsi: sym53c8xx_2: sym_hipd: mark expected switch fall-throughs
scsi: ppa: mark expected switch fall-through
scsi: osst: mark expected switch fall-throughs
scsi: lpfc: lpfc_scsi: Mark expected switch fall-throughs
scsi: lpfc: lpfc_nvme: Mark expected switch fall-through
scsi: lpfc: lpfc_nportdisc: Mark expected switch fall-through
scsi: lpfc: lpfc_hbadisc: Mark expected switch fall-throughs
scsi: lpfc: lpfc_els: Mark expected switch fall-throughs
scsi: lpfc: lpfc_ct: Mark expected switch fall-throughs
scsi: imm: mark expected switch fall-throughs
scsi: csiostor: csio_wr: mark expected switch fall-through
...
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
This patch fixes the following warnings:
fs/affs/affs.h:124:38: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/configfs/dir.c:1692:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/configfs/dir.c:1694:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ceph/file.c:249:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext4/hash.c:233:15: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext4/hash.c:246:15: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext2/inode.c:1237:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext2/inode.c:1244:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext4/indirect.c:1182:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext4/indirect.c:1188:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext4/indirect.c:1432:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext4/indirect.c:1440:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/f2fs/node.c:618:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/f2fs/node.c:620:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/btrfs/ref-verify.c:522:15: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/gfs2/bmap.c:711:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/gfs2/bmap.c:722:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/jffs2/fs.c:339:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/nfsd/nfs4proc.c:429:12: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ufs/util.h:62:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ufs/util.h:43:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/fcntl.c:770:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/seq_file.c:319:10: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/libfs.c:148:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/libfs.c:150:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/signalfd.c:178:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/locks.c:1473:16: warning: this statement may fall through [-Wimplicit-fallthrough=]
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enabling
-Wimplicit-fallthrough.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
free the symlink body after the same RCU delay we have for freeing the
struct inode itself, so that traversal during RCU pathwalk wouldn't step
into freed memory.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
jffs2_sync_fs makes the assumption that if CONFIG_JFFS2_FS_WRITEBUFFER
is defined then a write buffer is available and has been initialized.
However, this does is not the case when the mtd device has no
out-of-band buffer:
int jffs2_nand_flash_setup(struct jffs2_sb_info *c)
{
if (!c->mtd->oobsize)
return 0;
...
The resulting call to cancel_delayed_work_sync passing a uninitialized
(but zeroed) delayed_work struct forces lockdep to become disabled.
[ 90.050639] overlayfs: upper fs does not support tmpfile.
[ 90.652264] INFO: trying to register non-static key.
[ 90.662171] the code is fine but needs lockdep annotation.
[ 90.673090] turning off the locking correctness validator.
[ 90.684021] CPU: 0 PID: 1762 Comm: mount_root Not tainted 4.14.63 #0
[ 90.696672] Stack : 00000000 00000000 80d8f6a2 00000038 805f0000 80444600 8fe364f4 805dfbe7
[ 90.713349] 80563a30 000006e2 8068370c 00000001 00000000 00000001 8e2fdc48 ffffffff
[ 90.730020] 00000000 00000000 80d90000 00000000 00000106 00000000 6465746e 312e3420
[ 90.746690] 6b636f6c 03bf0000 f8000000 20676e69 00000000 80000000 00000000 8e2c2a90
[ 90.763362] 80d90000 00000001 00000000 8e2c2a90 00000003 80260dc0 08052098 80680000
[ 90.780033] ...
[ 90.784902] Call Trace:
[ 90.789793] [<8000f0d8>] show_stack+0xb8/0x148
[ 90.798659] [<8005a000>] register_lock_class+0x270/0x55c
[ 90.809247] [<8005cb64>] __lock_acquire+0x13c/0xf7c
[ 90.818964] [<8005e314>] lock_acquire+0x194/0x1dc
[ 90.828345] [<8003f27c>] flush_work+0x200/0x24c
[ 90.837374] [<80041dfc>] __cancel_work_timer+0x158/0x210
[ 90.847958] [<801a8770>] jffs2_sync_fs+0x20/0x54
[ 90.857173] [<80125cf4>] iterate_supers+0xf4/0x120
[ 90.866729] [<80158fc4>] sys_sync+0x44/0x9c
[ 90.875067] [<80014424>] syscall_common+0x34/0x58
Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Pull siginfo updates from Eric Biederman:
"I have been slowly sorting out siginfo and this is the culmination of
that work.
The primary result is in several ways the signal infrastructure has
been made less error prone. The code has been updated so that manually
specifying SEND_SIG_FORCED is never necessary. The conversion to the
new siginfo sending functions is now complete, which makes it
difficult to send a signal without filling in the proper siginfo
fields.
At the tail end of the patchset comes the optimization of decreasing
the size of struct siginfo in the kernel from 128 bytes to about 48
bytes on 64bit. The fundamental observation that enables this is by
definition none of the known ways to use struct siginfo uses the extra
bytes.
This comes at the cost of a small user space observable difference.
For the rare case of siginfo being injected into the kernel only what
can be copied into kernel_siginfo is delivered to the destination, the
rest of the bytes are set to 0. For cases where the signal and the
si_code are known this is safe, because we know those bytes are not
used. For cases where the signal and si_code combination is unknown
the bits that won't fit into struct kernel_siginfo are tested to
verify they are zero, and the send fails if they are not.
I made an extensive search through userspace code and I could not find
anything that would break because of the above change. If it turns out
I did break something it will take just the revert of a single change
to restore kernel_siginfo to the same size as userspace siginfo.
Testing did reveal dependencies on preferring the signo passed to
sigqueueinfo over si->signo, so bit the bullet and added the
complexity necessary to handle that case.
Testing also revealed bad things can happen if a negative signal
number is passed into the system calls. Something no sane application
will do but something a malicious program or a fuzzer might do. So I
have fixed the code that performs the bounds checks to ensure negative
signal numbers are handled"
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (80 commits)
signal: Guard against negative signal numbers in copy_siginfo_from_user32
signal: Guard against negative signal numbers in copy_siginfo_from_user
signal: In sigqueueinfo prefer sig not si_signo
signal: Use a smaller struct siginfo in the kernel
signal: Distinguish between kernel_siginfo and siginfo
signal: Introduce copy_siginfo_from_user and use it's return value
signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE
signal: Fail sigqueueinfo if si_signo != sig
signal/sparc: Move EMT_TAGOVF into the generic siginfo.h
signal/unicore32: Use force_sig_fault where appropriate
signal/unicore32: Generate siginfo in ucs32_notify_die
signal/unicore32: Use send_sig_fault where appropriate
signal/arc: Use force_sig_fault where appropriate
signal/arc: Push siginfo generation into unhandled_exception
signal/ia64: Use force_sig_fault where appropriate
signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn
signal/ia64: Use the generic force_sigsegv in setup_frame
signal/arm/kvm: Use send_sig_mceerr
signal/arm: Use send_sig_fault where appropriate
signal/arm: Use force_sig_fault where appropriate
...
When an invalid mount option is passed to jffs2, jffs2_parse_options()
will fail and jffs2_sb_info will be freed, but then jffs2_sb_info will
be used (use-after-free) and freeed (double-free) in jffs2_kill_sb().
Fix it by removing the buggy invocation of kfree() when getting invalid
mount options.
Fixes: 92abc475d8 ("jffs2: implement mount option parsing and compression overriding")
Cc: stable@kernel.org
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
None of the callers use the it so remove it.
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Most users of jffs2 are 32-bit systems that traditionally only support
timestamps using a 32-bit signed time_t, in the range from years 1902 to
2038. On 64-bit systems, jffs2 however interpreted the same timestamps
as unsigned values, reading back negative times (before 1970) as times
between 2038 and 2106.
Now that Linux supports 64-bit inode timestamps even on 32-bit systems,
let's use the second interpretation everywhere to allow jffs2 to be
used on 32-bit systems beyond 2038 without a fundamental change to the
inode format.
This has a slight risk of regressions, when existing files with timestamps
before 1970 are present in file system images and are now interpreted
as future time stamps. I considered moving the wraparound point a bit,
e.g. to 1960, in order to deal with timestamps that ended up on Dec 31,
1969 due to incorrect timezone handling. However, this would complicate
the implementation unnecessarily, so I went with the simplest possible
method of extending the timestamps.
Writing files with timestamps before 1970 or after 2106 now results
in those times being clamped in the file system.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
The VFS now uses timespec64 timestamps consistently, but jffs2 still
converts them to 32-bit numbers on the storage medium. As the helper
functions for the conversion (get_seconds() and timespec_to_timespec64())
are now deprecated, let's change them over to the more modern
replacements.
This keeps the traditional interpretation of those values, where
the on-disk 32-bit numbers are taken to be negative numbers, i.e.
dates before 1970, on 32-bit machines, but future numbers past 2038
on 64-bit machines.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
This is a late set of changes from Deepa Dinamani doing an automated
treewide conversion of the inode and iattr structures from 'timespec'
to 'timespec64', to push the conversion from the VFS layer into the
individual file systems.
There were no conflicts between this and the contents of linux-next
until just before the merge window, when we saw multiple problems:
- A minor conflict with my own y2038 fixes, which I could address
by adding another patch on top here.
- One semantic conflict with late changes to the NFS tree. I addressed
this by merging Deepa's original branch on top of the changes that
now got merged into mainline and making sure the merge commit includes
the necessary changes as produced by coccinelle.
- A trivial conflict against the removal of staging/lustre.
- Multiple conflicts against the VFS changes in the overlayfs tree.
These are still part of linux-next, but apparently this is no longer
intended for 4.18 [1], so I am ignoring that part.
As Deepa writes:
The series aims to switch vfs timestamps to use struct timespec64.
Currently vfs uses struct timespec, which is not y2038 safe.
The series involves the following:
1. Add vfs helper functions for supporting struct timepec64 timestamps.
2. Cast prints of vfs timestamps to avoid warnings after the switch.
3. Simplify code using vfs timestamps so that the actual
replacement becomes easy.
4. Convert vfs timestamps to use struct timespec64 using a script.
This is a flag day patch.
Next steps:
1. Convert APIs that can handle timespec64, instead of converting
timestamps at the boundaries.
2. Update internal data structures to avoid timestamp conversions.
Thomas Gleixner adds:
I think there is no point to drag that out for the next merge window.
The whole thing needs to be done in one go for the core changes which
means that you're going to play that catchup game forever. Let's get
over with it towards the end of the merge window.
[1] https://www.spinics.net/lists/linux-fsdevel/msg128294.html
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJbInZAAAoJEGCrR//JCVInReoQAIlVIIMt5ZX6wmaKbrjy9Itf
MfgbFihQ/djLnuSPVQ3nztcxF0d66BKHZ9puVjz6+mIHqfDvJTRwZs9nU+sOF/T1
g78fRkM1cxq6ZCkGYAbzyjyo5aC4PnSMP/NQLmwqvi0MXqqrbDoq5ZdP9DHJw39h
L9lD8FM/P7T29Fgp9tq/pT5l9X8VU8+s5KQG1uhB5hii4VL6pD6JyLElDita7rg+
Z7/V7jkxIGEUWF7vGaiR1QTFzEtpUA/exDf9cnsf51OGtK/LJfQ0oiZPPuq3oA/E
LSbt8YQQObc+dvfnGxwgxEg1k5WP5ekj/Wdibv/+rQKgGyLOTz6Q4xK6r8F2ahxs
nyZQBdXqHhJYyKr1H1reUH3mrSgQbE5U5R1i3My0xV2dSn+vtK5vgF21v2Ku3A1G
wJratdtF/kVBzSEQUhsYTw14Un+xhBLRWzcq0cELonqxaKvRQK9r92KHLIWNE7/v
c0TmhFbkZA+zR8HdsaL3iYf1+0W/eYy8PcvepyldKNeW2pVk3CyvdTfY2Z87G2XK
tIkK+BUWbG3drEGG3hxZ3757Ln3a9qWyC5ruD3mBVkuug/wekbI8PykYJS7Mx4s/
WNXl0dAL0Eeu1M8uEJejRAe1Q3eXoMWZbvCYZc+wAm92pATfHVcKwPOh8P7NHlfy
A3HkjIBrKW5AgQDxfgvm
=CZX2
-----END PGP SIGNATURE-----
Merge tag 'vfs-timespec64' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground
Pull inode timestamps conversion to timespec64 from Arnd Bergmann:
"This is a late set of changes from Deepa Dinamani doing an automated
treewide conversion of the inode and iattr structures from 'timespec'
to 'timespec64', to push the conversion from the VFS layer into the
individual file systems.
As Deepa writes:
'The series aims to switch vfs timestamps to use struct timespec64.
Currently vfs uses struct timespec, which is not y2038 safe.
The series involves the following:
1. Add vfs helper functions for supporting struct timepec64
timestamps.
2. Cast prints of vfs timestamps to avoid warnings after the switch.
3. Simplify code using vfs timestamps so that the actual replacement
becomes easy.
4. Convert vfs timestamps to use struct timespec64 using a script.
This is a flag day patch.
Next steps:
1. Convert APIs that can handle timespec64, instead of converting
timestamps at the boundaries.
2. Update internal data structures to avoid timestamp conversions'
Thomas Gleixner adds:
'I think there is no point to drag that out for the next merge
window. The whole thing needs to be done in one go for the core
changes which means that you're going to play that catchup game
forever. Let's get over with it towards the end of the merge window'"
* tag 'vfs-timespec64' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground:
pstore: Remove bogus format string definition
vfs: change inode times to use struct timespec64
pstore: Convert internal records to timespec64
udf: Simplify calls to udf_disk_stamp_to_time
fs: nfs: get rid of memcpys for inode times
ceph: make inode time prints to be long long
lustre: Use long long type to print inode time
fs: add timespec64_truncate()
Pull the timespec64 conversion from Deepa Dinamani:
"The series aims to switch vfs timestamps to use
struct timespec64. Currently vfs uses struct timespec,
which is not y2038 safe.
The flag patch applies cleanly. I've not seen the timestamps
update logic change often. The series applies cleanly on 4.17-rc6
and linux-next tip (top commit: next-20180517).
I'm not sure how to merge this kind of a series with a flag patch.
We are targeting 4.18 for this.
Let me know if you have other suggestions.
The series involves the following:
1. Add vfs helper functions for supporting struct timepec64 timestamps.
2. Cast prints of vfs timestamps to avoid warnings after the switch.
3. Simplify code using vfs timestamps so that the actual
replacement becomes easy.
4. Convert vfs timestamps to use struct timespec64 using a script.
This is a flag day patch.
I've tried to keep the conversions with the script simple, to
aid in the reviews. I've kept all the internal filesystem data
structures and function signatures the same.
Next steps:
1. Convert APIs that can handle timespec64, instead of converting
timestamps at the boundaries.
2. Update internal data structures to avoid timestamp conversions."
I've pulled it into a branch based on top of the NFS changes that
are now in mainline, so I could resolve the non-obvious conflict
between the two while merging.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Need to tell the compiler that the acl entries follow the acl header.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
For anything NFS-exported we do _not_ want to unlock new inode
before it has grown an alias; original set of fixes got the
ordering right, but missed the nasty complication in case of
lockdep being enabled - unlock_new_inode() does
lockdep_annotate_inode_mutex_key(inode)
which can only be done before anyone gets a chance to touch
->i_mutex. Unfortunately, flipping the order and doing
unlock_new_inode() before d_instantiate() opens a window when
mkdir can race with open-by-fhandle on a guessed fhandle, leading
to multiple aliases for a directory inode and all the breakage
that follows from that.
Correct solution: a new primitive (d_instantiate_new())
combining these two in the right order - lockdep annotate, then
d_instantiate(), then the rest of unlock_new_inode(). All
combinations of d_instantiate() with unlock_new_inode() should
be converted to that.
Cc: stable@kernel.org # 2.6.29 and later
Tested-by: Mike Marshall <hubcap@omnibond.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
->fail_addr and ->addr can be updated no matter the result of
parent->_erase(), we just need to remove the code doing the same thing
in mtd_erase_callback() to avoid adjusting those fields twice.
Note that this can be done because all MTD users have been converted to
not pass an erase_info->callback() and are thus only taking the
->addr_fail and ->addr fields into account after part_erase() has
returned.
While we're at it, get rid of the erase_info->mtd field which was only
needed to let mtd_erase_callback() get the partition device back.
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
None of the mtd->_erase() implementations work in an asynchronous manner,
so let's simplify MTD users that call mtd_erase(). All they need to do
is check the value returned by mtd_erase() and assume that != 0 means
failure.
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
documentation, errseq documentation, kernel-doc support for nested
structure definitions, the removal of lots of crufty kernel-doc support for
unused formats, SPDX tag documentation, the beginnings of a manual for
subsystem maintainers, and lots of fixes and updates.
As usual, some of the changesets reach outside of Documentation/ to effect
kerneldoc comment fixes. It also adds the new LICENSES directory, of which
Thomas promises I do not need to be the maintainer.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJab11TAAoJEI3ONVYwIuV6i1UP/1LgGPHW9Ygq5qaLFbReZd/u
Mx/orrhHX0PdkbCCE+CbL8Vm1m4UKFDTBdlpk3s542zxeeG0ZBXuTnvq4Kyk+cTN
p4/vsIEzk/Ih13/glGE5MlV+EjiEK+8hK69TIUj7bAyuHmpzofjRz9/1M6RLDGDC
HY6UI58AXG0yOQWMWCGRMYpQAFUGij2equ7Doe1ugXRq14dx7V4RsOhI140iRk7t
bquAq1rS2fXniiuPFmLBUe4dWW28isVa/Vl/aXcaWQDKMyT0OLhjOMW36wWKqtPi
WdVCpHv1NLZNyZZr9S3kvfOwW+BUqpEzfVwssyBLW4h0tsnIx0U0HVhSTY8/TvFZ
QD9yCSana4LB/e5CHXIX5lBHbjHxf+rETXqVV4MgwDaMvM3mCo4X6WUTJDmZADo6
vQISEKeb4su5uWAbc9T9xwRSLhZnFVdJ/QuYdNQ5+EpFJYLhzQ9eBvEz6JstSIXL
p9ASBiPNY3ulpVZ8q0JOHJRBhq5mHJH6Dy8achzbILy2l/ZI4b8lJ53mw9II04cp
puF96E6HpvuZ8Tgjjrg9U3ZdxXNrUgc/tjk2ZDkyTglk1XF2jKSq2tiNSZ3oLrJm
XqJPnpCeyJM5UDvwkIBzgC41WEHwe8uvoNbUnc4X7UJSZegFzcSLQXf5qaprHS5k
XeQ7sbd+S+jzVVjFi0W5
=Z15Z
-----END PGP SIGNATURE-----
Merge tag 'docs-4.16' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"Documentation updates for 4.16.
New stuff includes refcount_t documentation, errseq documentation,
kernel-doc support for nested structure definitions, the removal of
lots of crufty kernel-doc support for unused formats, SPDX tag
documentation, the beginnings of a manual for subsystem maintainers,
and lots of fixes and updates.
As usual, some of the changesets reach outside of Documentation/ to
effect kerneldoc comment fixes. It also adds the new LICENSES
directory, of which Thomas promises I do not need to be the
maintainer"
* tag 'docs-4.16' of git://git.lwn.net/linux: (65 commits)
linux-next: docs-rst: Fix typos in kfigure.py
linux-next: DOC: HWPOISON: Fix path to debugfs in hwpoison.txt
Documentation: Fix misconversion of #if
docs: add index entry for networking/msg_zerocopy
Documentation: security/credentials.rst: explain need to sort group_list
LICENSES: Add MPL-1.1 license
LICENSES: Add the GPL 1.0 license
LICENSES: Add Linux syscall note exception
LICENSES: Add the MIT license
LICENSES: Add the BSD-3-clause "Clear" license
LICENSES: Add the BSD 3-clause "New" or "Revised" License
LICENSES: Add the BSD 2-clause "Simplified" license
LICENSES: Add the LGPL-2.1 license
LICENSES: Add the LGPL 2.0 license
LICENSES: Add the GPL 2.0 license
Documentation: Add license-rules.rst to describe how to properly identify file licenses
scripts: kernel_doc: better handle show warnings logic
fs/*/Kconfig: drop links to 404-compliant http://acl.bestbits.at
doc: md: Fix a file name to md-fault.c in fault-injection.txt
errseq: Add to documentation tree
...
Pull misc vfs updates from Al Viro:
"All kinds of misc stuff, without any unifying topic, from various
people.
Neil's d_anon patch, several bugfixes, introduction of kvmalloc
analogue of kmemdup_user(), extending bitfield.h to deal with
fixed-endians, assorted cleanups all over the place..."
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
alpha: osf_sys.c: use timespec64 where appropriate
alpha: osf_sys.c: fix put_tv32 regression
jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path
dcache: delete unused d_hash_mask
dcache: subtract d_hash_shift from 32 in advance
fs/buffer.c: fold init_buffer() into init_page_buffers()
fs: fold __inode_permission() into inode_permission()
fs: add RWF_APPEND
sctp: use vmemdup_user() rather than badly open-coding memdup_user()
snd_ctl_elem_init_enum_names(): switch to vmemdup_user()
replace_user_tlv(): switch to vmemdup_user()
new primitive: vmemdup_user()
memdup_user(): switch to GFP_USER
eventfd: fold eventfd_ctx_get() into eventfd_ctx_fileget()
eventfd: fold eventfd_ctx_read() into eventfd_read()
eventfd: convert to use anon_inode_getfd()
nfs4file: get rid of pointless include of btrfs.h
uvc_v4l2: clean copyin/copyout up
vme_user: don't use __copy_..._user()
usx2y: don't bother with memdup_user() for 16-byte structure
...
This link is replicated in most filesystems' config stanzas. Referring
to an archived version of that site is pointless as it mostly deals with
patches; user documentation is available elsewhere.
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Acked-by: David Sterba <dsterba@suse.com>
Acked-by: "Yan, Zheng" <zyan@redhat.com>
Acked-by: Chao Yu <yuchao0@huawei.com>
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
Acked-by: Steve French <smfrench@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
This is a pure automated search-and-replace of the internal kernel
superblock flags.
The s_flags are now called SB_*, with the names and the values for the
moment mirroring the MS_* flags that they're equivalent to.
Note how the MS_xyz flags are the ones passed to the mount system call,
while the SB_xyz flags are what we then use in sb->s_flags.
The script to do this was:
# places to look in; re security/*: it generally should *not* be
# touched (that stuff parses mount(2) arguments directly), but
# there are two places where we really deal with superblock flags.
FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
include/linux/fs.h include/uapi/linux/bfs_fs.h \
security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
# the list of MS_... constants
SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
ACTIVE NOUSER"
SED_PROG=
for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
# we want files that contain at least one of MS_...,
# with fs/namespace.c and fs/pnode.c excluded.
L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
for f in $L; do sed -i $f $SED_PROG; done
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull mount flag updates from Al Viro:
"Another chunk of fmount preparations from dhowells; only trivial
conflicts for that part. It separates MS_... bits (very grotty
mount(2) ABI) from the struct super_block ->s_flags (kernel-internal,
only a small subset of MS_... stuff).
This does *not* convert the filesystems to new constants; only the
infrastructure is done here. The next step in that series is where the
conflicts would be; that's the conversion of filesystems. It's purely
mechanical and it's better done after the merge, so if you could run
something like
list=$(for i in MS_RDONLY MS_NOSUID MS_NODEV MS_NOEXEC MS_SYNCHRONOUS MS_MANDLOCK MS_DIRSYNC MS_NOATIME MS_NODIRATIME MS_SILENT MS_POSIXACL MS_KERNMOUNT MS_I_VERSION MS_LAZYTIME; do git grep -l $i fs drivers/staging/lustre drivers/mtd ipc mm include/linux; done|sort|uniq|grep -v '^fs/namespace.c$')
sed -i -e 's/\<MS_RDONLY\>/SB_RDONLY/g' \
-e 's/\<MS_NOSUID\>/SB_NOSUID/g' \
-e 's/\<MS_NODEV\>/SB_NODEV/g' \
-e 's/\<MS_NOEXEC\>/SB_NOEXEC/g' \
-e 's/\<MS_SYNCHRONOUS\>/SB_SYNCHRONOUS/g' \
-e 's/\<MS_MANDLOCK\>/SB_MANDLOCK/g' \
-e 's/\<MS_DIRSYNC\>/SB_DIRSYNC/g' \
-e 's/\<MS_NOATIME\>/SB_NOATIME/g' \
-e 's/\<MS_NODIRATIME\>/SB_NODIRATIME/g' \
-e 's/\<MS_SILENT\>/SB_SILENT/g' \
-e 's/\<MS_POSIXACL\>/SB_POSIXACL/g' \
-e 's/\<MS_KERNMOUNT\>/SB_KERNMOUNT/g' \
-e 's/\<MS_I_VERSION\>/SB_I_VERSION/g' \
-e 's/\<MS_LAZYTIME\>/SB_LAZYTIME/g' \
$list
and commit it with something along the lines of 'convert filesystems
away from use of MS_... constants' as commit message, it would save a
quite a bit of headache next cycle"
* 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
VFS: Differentiate mount flags (MS_*) from internal superblock flags
VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)
vfs: Add sb_rdonly(sb) to query the MS_RDONLY flag on s_flags
General updates:
* Constify pci_device_id in various drivers
* Constify device_type
* Remove pad control code from the Gemini driver
* Use %pOF to print OF node full_name
* Various fixes in the physmap_of driver
* Remove unused vars in mtdswap
* Check devm_kzalloc() return value in the spear_smi driver
* Check clk_prepare_enable() return code in the st_spi_fsm driver
* Create per MTD device debugfs enties
NAND updates, from Boris Brezillon:
* Fix memory leaks in the core
* Remove unused NAND locking support
* Rename nand.h into rawnand.h (preparing support for spi NANDs)
* Use NAND_MAX_ID_LEN where appropriate
* Fix support for 20nm Hynix chips
* Fix support for Samsung and Hynix SLC NANDs
* Various cleanup, improvements and fixes in the qcom driver
* Fixes for bugs detected by various static code analysis tools
* Fix mxc ooblayout definition
* Add a new part_parsers to tmio and sharpsl platform data in order to
define a custom list of partition parsers
* Request the reset line in exclusive mode in the sunxi driver
* Fix a build error in the orion-nand driver when compiled for ARMv4
* Allow 64-bit mvebu platforms to select the PXA3XX driver
SPI NOR updates, from Cyrille Pitchen and Marek Vasut:
* add support to the JEDEC JESD216B specification (SFDP tables).
* add support to the Intel Denverton SPI flash controller.
* fix error recovery for Spansion/Cypress SPI NOR memories.
* fix 4-byte address management for the Aspeed SPI controller.
* add support to some Microchip SST26 memory parts
* remove unneeded pinctrl header Write a message for tag:
-----BEGIN PGP SIGNATURE-----
iQJABAABCAAqBQJZrav6Ixxib3Jpcy5icmV6aWxsb25AZnJlZS1lbGVjdHJvbnMu
Y29tAAoJEGXtNgF+CLcABwkP/joDrq09RIC9n5gP+ubJe6O1jKvNWDd6bIVXD3Ke
73R0a0ANwwWlNYWTChTdrb8UeewVS1bzutyy5O2Sbdb6Jc6s7xkfQDTsbET2HWOK
S7Lt/zjlC6/6cow59B6h43PGS6wmIFaZD3K+70sGhvFnV8epVUzS2Aa783xS8LXm
so2djZOdUYnW+yE0eho24VQR6nS4YP4Vc+7Mm9skjU0ifjB9mJiWRkzoQnqIgORO
M+Iab+qjDs9KR/edWh6mZtnvjps0VSW4I40YsClpcgIn550w1DSXe4u6/8Nk+2Bp
gfrALls91gob0ocxmEdIyLID+M0410HcN/Lvh36nw+tkkGTaXf0D6mkqzdKNrZ3w
yz+UV9uf19kr1c6zFGcCvUlD0btn9KT+F2legnhgURtwUyDFQcaYQlkpDIeEzUMV
ZrtzKbSE2v9810YKXjtCnseewdP+Eph/ewN6ODX5yg/fs8K0fyQYTRtYYM50U69X
md8zznBBDPhJVu5T2Of7my9V1SxvCP8a7LrKjAXuFHpZ/CHiPe+QOWBgG2L+zXXT
e10/rTg7T2pcyKpBvL/3/mCYeJ+Iup3lKT1EHGCXcKnLGecVgOsbvdG+JnvQMI2J
FLmu1exvrzi0Gcrs/05hqwyUvkHZ5FB1a+heNOtmQ+h1U0ElXqILyu7brzghupRe
3phO
=UgCd
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20170904' of git://git.infradead.org/linux-mtd
Pull MTD updates from Boris Brezillon:
"General updates:
- Constify pci_device_id in various drivers
- Constify device_type
- Remove pad control code from the Gemini driver
- Use %pOF to print OF node full_name
- Various fixes in the physmap_of driver
- Remove unused vars in mtdswap
- Check devm_kzalloc() return value in the spear_smi driver
- Check clk_prepare_enable() return code in the st_spi_fsm driver
- Create per MTD device debugfs enties
NAND updates, from Boris Brezillon:
- Fix memory leaks in the core
- Remove unused NAND locking support
- Rename nand.h into rawnand.h (preparing support for spi NANDs)
- Use NAND_MAX_ID_LEN where appropriate
- Fix support for 20nm Hynix chips
- Fix support for Samsung and Hynix SLC NANDs
- Various cleanup, improvements and fixes in the qcom driver
- Fixes for bugs detected by various static code analysis tools
- Fix mxc ooblayout definition
- Add a new part_parsers to tmio and sharpsl platform data in order
to define a custom list of partition parsers
- Request the reset line in exclusive mode in the sunxi driver
- Fix a build error in the orion-nand driver when compiled for ARMv4
- Allow 64-bit mvebu platforms to select the PXA3XX driver
SPI NOR updates, from Cyrille Pitchen and Marek Vasut:
- add support to the JEDEC JESD216B specification (SFDP tables).
- add support to the Intel Denverton SPI flash controller.
- fix error recovery for Spansion/Cypress SPI NOR memories.
- fix 4-byte address management for the Aspeed SPI controller.
- add support to some Microchip SST26 memory parts
- remove unneeded pinctrl header Write a message for tag:"
* tag 'for-linus-20170904' of git://git.infradead.org/linux-mtd: (74 commits)
mtd: nand: complain loudly when chip->bits_per_cell is not correctly initialized
mtd: nand: make Samsung SLC NAND usable again
mtd: nand: tmio: Register partitions using the parsers
mfd: tmio: Add partition parsers platform data
mtd: nand: sharpsl: Register partitions using the parsers
mtd: nand: sharpsl: Add partition parsers platform data
mtd: nand: qcom: Support for IPQ8074 QPIC NAND controller
mtd: nand: qcom: support for IPQ4019 QPIC NAND controller
dt-bindings: qcom_nandc: IPQ8074 QPIC NAND documentation
dt-bindings: qcom_nandc: IPQ4019 QPIC NAND documentation
dt-bindings: qcom_nandc: fix the ipq806x device tree example
mtd: nand: qcom: support for different DEV_CMD register offsets
mtd: nand: qcom: QPIC data descriptors handling
mtd: nand: qcom: enable BAM or ADM mode
mtd: nand: qcom: erased codeword detection configuration
mtd: nand: qcom: support for read location registers
mtd: nand: qcom: support for passing flags in DMA helper functions
mtd: nand: qcom: add BAM DMA descriptor handling
mtd: nand: qcom: allocate BAM transaction
mtd: nand: qcom: DMA mapping support for register read buffer
...
We are planning to share more code between different NAND based
devices (SPI NAND, OneNAND and raw NANDs), but before doing that
we need to move the existing include/linux/mtd/nand.h file into
include/linux/mtd/rawnand.h so we can later create a nand.h header
containing all common structure and function prototypes.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Peter Pan <peterpandong@micron.com>
Acked-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Acked-by: Wenyou Yang <wenyou.yang@microchip.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Han Xu <han.xu@nxp.com>
Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-By: Harvey Hunt <harveyhuntnexus@gmail.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Krzysztof Halasa <khalasa@piap.pl>
This patch converts most of the in-kernel filesystems that do writeback
out of the pagecache to report errors using the errseq_t-based
infrastructure that was recently added. This allows them to report
errors once for each open file description.
Most filesystems have a fairly straightforward fsync operation. They
call filemap_write_and_wait_range to write back all of the data and
wait on it, and then (sometimes) sync out the metadata.
For those filesystems this is a straightforward conversion from calling
filemap_write_and_wait_range in their fsync operation to calling
file_write_and_wait_range.
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Firstly by applying the following with coccinelle's spatch:
@@ expression SB; @@
-SB->s_flags & MS_RDONLY
+sb_rdonly(SB)
to effect the conversion to sb_rdonly(sb), then by applying:
@@ expression A, SB; @@
(
-(!sb_rdonly(SB)) && A
+!sb_rdonly(SB) && A
|
-A != (sb_rdonly(SB))
+A != sb_rdonly(SB)
|
-A == (sb_rdonly(SB))
+A == sb_rdonly(SB)
|
-!(sb_rdonly(SB))
+!sb_rdonly(SB)
|
-A && (sb_rdonly(SB))
+A && sb_rdonly(SB)
|
-A || (sb_rdonly(SB))
+A || sb_rdonly(SB)
|
-(sb_rdonly(SB)) != A
+sb_rdonly(SB) != A
|
-(sb_rdonly(SB)) == A
+sb_rdonly(SB) == A
|
-(sb_rdonly(SB)) && A
+sb_rdonly(SB) && A
|
-(sb_rdonly(SB)) || A
+sb_rdonly(SB) || A
)
@@ expression A, B, SB; @@
(
-(sb_rdonly(SB)) ? 1 : 0
+sb_rdonly(SB)
|
-(sb_rdonly(SB)) ? A : B
+sb_rdonly(SB) ? A : B
)
to remove left over excess bracketage and finally by applying:
@@ expression A, SB; @@
(
-(A & MS_RDONLY) != sb_rdonly(SB)
+(bool)(A & MS_RDONLY) != sb_rdonly(SB)
|
-(A & MS_RDONLY) == sb_rdonly(SB)
+(bool)(A & MS_RDONLY) == sb_rdonly(SB)
)
to make comparisons against the result of sb_rdonly() (which is a bool)
work correctly.
Signed-off-by: David Howells <dhowells@redhat.com>
trivial fix to spelling mistake in JFFS2_ERROR message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
[Brian: also fix 'an' -> 'a']
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Fix up affected files that include this signal functionality via sched.h.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add #include <linux/cred.h> dependencies to all .c files rely on sched.h
doing that for them.
Note that even if the count where we need to add extra headers seems high,
it's still a net win, because <linux/sched.h> is included in over
2,200 files ...
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If .readlink == NULL implies generic_readlink().
Generated by:
to_del="\.readlink.*=.*generic_readlink"
for i in `git grep -l $to_del`; do sed -i "/$to_del"/d $i; done
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Pull more vfs updates from Al Viro:
">rename2() work from Miklos + current_time() from Deepa"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: Replace current_fs_time() with current_time()
fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
fs: Replace CURRENT_TIME with current_time() for inode timestamps
fs: proc: Delete inode time initializations in proc_alloc_inode()
vfs: Add current_time() api
vfs: add note about i_op->rename changes to porting
fs: rename "rename2" i_op to "rename"
vfs: remove unused i_op->rename
fs: make remaining filesystems use .rename2
libfs: support RENAME_NOREPLACE in simple_rename()
fs: support RENAME_NOREPLACE for local filesystems
ncpfs: fix unused variable warning
Pull vfs xattr updates from Al Viro:
"xattr stuff from Andreas
This completes the switch to xattr_handler ->get()/->set() from
->getxattr/->setxattr/->removexattr"
* 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: Remove {get,set,remove}xattr inode operations
xattr: Stop calling {get,set,remove}xattr inode operations
vfs: Check for the IOP_XATTR flag in listxattr
xattr: Add __vfs_{get,set,remove}xattr helpers
libfs: Use IOP_XATTR flag for empty directory handling
vfs: Use IOP_XATTR flag for bad-inode handling
vfs: Add IOP_XATTR inode operations flag
vfs: Move xattr_resolve_name to the front of fs/xattr.c
ecryptfs: Switch to generic xattr handlers
sockfs: Get rid of getxattr iop
sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
kernfs: Switch to generic xattr handlers
hfs: Switch to generic xattr handlers
jffs2: Remove jffs2_{get,set,remove}xattr macros
xattr: Remove unnecessary NULL attribute name check
These inode operations are no longer used; remove them.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
When CONFIG_JFFS2_FS_XATTR is off, jffs2_xattr_handlers is defined as
NULL. With sb->s_xattr == NULL, the generic_{get,set,remove}xattr
functions produce the same result as setting the {get,set,remove}xattr
inode operations to NULL, so there is no need for these macros.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
CURRENT_TIME_SEC is not y2038 safe. current_time() will
be transitioned to use 64 bit time along with vfs in a
separate patch.
There is no plan to transistion CURRENT_TIME_SEC to use
y2038 safe time interfaces.
current_time() will also be extended to use superblock
range checking parameters when range checking is introduced.
This works because alloc_super() fills in the the s_time_gran
in super block to NSEC_PER_SEC.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
inode_change_ok() will be resposible for clearing capabilities and IMA
extended attributes and as such will need dentry. Give it as an argument
to inode_change_ok() instead of an inode. Also rename inode_change_ok()
to setattr_prepare() to better relect that it does also some
modifications in addition to checks.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
When file permissions are modified via chmod(2) and the user is not in
the owning group or capable of CAP_FSETID, the setgid bit is cleared in
inode_change_ok(). Setting a POSIX ACL via setxattr(2) sets the file
permissions as well as the new ACL, but doesn't clear the setgid bit in
a similar way; this allows to bypass the check in chmod(2). Fix that.
References: CVE-2016-7097
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
We always mixed in the parent pointer into the dentry name hash, but we
did it late at lookup time. It turns out that we can simplify that
lookup-time action by salting the hash with the parent pointer early
instead of late.
A few other users of our string hashes also wanted to mix in their own
pointers into the hash, and those are updated to use the same mechanism.
Hash users that don't have any particular initial salt can just use the
NULL pointer as a no-salt.
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When get_acl() is called for an inode whose ACL is not cached yet, the
get_acl inode operation is called to fetch the ACL from the filesystem.
The inode operation is responsible for updating the cached acl with
set_cached_acl(). This is done without locking at the VFS level, so
another task can call set_cached_acl() or forget_cached_acl() before the
get_acl inode operation gets to calling set_cached_acl(), and then
get_acl's call to set_cached_acl() results in caching an outdate ACL.
Prevent this from happening by setting the cached ACL pointer to a
task-specific sentinel value before calling the get_acl inode operation.
Move the responsibility for updating the cached ACL from the get_acl
inode operations to get_acl(). There, only set the cached ACL if the
sentinel value hasn't changed.
The sentinel values are chosen to have odd values. Likewise, the value
of ACL_NOT_CACHED is odd. In contrast, ACL object pointers always have
an even value (ACLs are aligned in memory). This allows to distinguish
uncached ACLs values from ACL objects.
In addition, switch from guarding inode->i_acl and inode->i_default_acl
upates by the inode->i_lock spinlock to using xchg() and cmpxchg().
Filesystems that do not want ACLs returned from their get_acl inode
operations to be cached must call forget_cached_acl() to prevent the VFS
from doing so.
(Patch written by Al Viro and Andreas Gruenbacher.)
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
NAND:
* Add sunxi_nand randomizer support
* begin refactoring NAND ecclayout structs
* fix pxa3xx_nand dmaengine usage
* brcmnand: fix support for v7.1 controller
* add Qualcomm NAND controller driver
SPI NOR:
* add new ls1021a, ls2080a support to Freescale QuadSPI
* add new flash ID entries
* support bottom-block protection for Winbond flash
* support Status Register Write Protect
* remove broken QPI support for Micron SPI flash
JFFS2:
* improve post-mount CRC scan efficiency
General:
* refactor bcm63xxpart parser, to later extend for NAND
* add writebuf size parameter to mtdram
Other minor code quality improvements
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW9CzVAAoJEFySrpd9RFgtQFwQAJdH0wnsZTYfeqToIaD8yMM4
rtakV/oIMSvMSWuqK+Mx0k6OjGwswgnGZ+tfQLRAYIhb33P8UD0F8Dv5D0x/+zRo
EgiDlnss/lliXpbh2u4fsANSpFF/JUPXFqU6NanjqQ1rtvR60LUeKOFEz1NRciuV
Ib6oDLFeXQFxwG0J+EBDo5MrT8aiPODtx4TS8VVo0o0y/WLkEujQPP5592TnCPha
zX0n9azi26pARo7VLqWjVD8GigY5PadqJAWOZcQr0dGMQv5URtWcCCdThiNsCEzY
SW9cYSr4CBdy1FIeoJ47yoBg8aFzhyeeuF1efb1U0MoYVL0rdIbznop3Kwilj48L
Rnh4hvKkrTH16rO6RfKm1lIJaJQYKMErXyEceYMIjV91fEL3qhfbU9W6+Q5HT4hY
oJmlH+4e/I1Jtf+vW4xFGMYclmYwCO6GJ4HHqnNpby/iH/nZ07hNX3lbxrlqHMwh
MrSIidqLTsseXcyHBFc+42AsWs8unaYWVB0N3VFkEgl0BFyPObAtvwnHA6zywMvp
EqJijXFG8VPcztE3eTIMbd0WOkxTjpMT6YHzpZqli/ENxCgu79OWELYrJ0/vC5Uj
HK0qxgvIzUyJgmikkySDvd/Hc6HWItYonlcAht0VErNfTTfkMwWgRz1W4ZRB6bOJ
7M83aytLyRYaPGEbwaoR
=xOlP
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20160324' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
"NAND:
- Add sunxi_nand randomizer support
- begin refactoring NAND ecclayout structs
- fix pxa3xx_nand dmaengine usage
- brcmnand: fix support for v7.1 controller
- add Qualcomm NAND controller driver
SPI NOR:
- add new ls1021a, ls2080a support to Freescale QuadSPI
- add new flash ID entries
- support bottom-block protection for Winbond flash
- support Status Register Write Protect
- remove broken QPI support for Micron SPI flash
JFFS2:
- improve post-mount CRC scan efficiency
General:
- refactor bcm63xxpart parser, to later extend for NAND
- add writebuf size parameter to mtdram
Other minor code quality improvements"
* tag 'for-linus-20160324' of git://git.infradead.org/linux-mtd: (72 commits)
mtd: nand: remove kerneldoc for removed function parameter
mtd: nand: Qualcomm NAND controller driver
dt/bindings: qcom_nandc: Add DT bindings
mtd: nand: don't select chip in nand_chip's block_bad op
mtd: spi-nor: support lock/unlock for a few Winbond chips
mtd: spi-nor: add TB (Top/Bottom) protect support
mtd: spi-nor: add SPI_NOR_HAS_LOCK flag
mtd: spi-nor: use BIT() for flash_info flags
mtd: spi-nor: disallow further writes to SR if WP# is low
mtd: spi-nor: make lock/unlock bounds checks more obvious and robust
mtd: spi-nor: silently drop lock/unlock for already locked/unlocked region
mtd: spi-nor: wait for SR_WIP to clear on initial unlock
mtd: nand: simplify nand_bch_init() usage
mtd: mtdswap: remove useless if (!mtd->ecclayout) test
mtd: create an mtd_oobavail() helper and make use of it
mtd: kill the ecclayout->oobavail field
mtd: nand: check status before reporting timeout
mtd: bcm63xxpart: give width specifier an 'int', not 'size_t'
mtd: mtdram: Add parameter for setting writebuf size
mtd: nand: pxa3xx_nand: kill unused field 'drcmr_cmd'
...
Pull vfs fixes from Al Viro:
"A couple of fixes: Fix for my dumb braino in ncpfs and a long-standing
breakage on recovery from failed rename() in jffs2"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
jffs2: reduce the breakage on recovery from halfway failed rename()
ncpfs: fix a braino in OOM handling in ncp_fill_cache()
d_instantiate(new_dentry, old_inode) is absolutely wrong thing to
do - it will oops if new_dentry used to be positive, for starters.
What we need is d_invalidate() the target and be done with that.
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ecclayout->oobavail is just redundant with the mtd->oobavail field.
Moreover, it prevents static const definition of ecc layouts since the
NAND framework is calculating this value based on the ecclayout->oobfree
field.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
We need to finish doing the CRC checks before we can allow writes to
happen, and we currently process the inodes in order. This means a call
to jffs2_get_ino_cache() for each possible inode# up to c->highest_ino.
There may be a lot of lookups which fail, if the inode# space is used
sparsely. And the inode# space is *often* used sparsely, if a file
system contains a lot of stuff that was put there in the original
image, followed by lots of creation and deletion of new files.
Instead of processing them numerically with a lookup each time, just
walk the hash buckets instead.
[fix locking typo reported by Dan Carpenter]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
When a directory is deleted, we don't take too much care about killing off
all the dirents that belong to it — on the basis that on remount, the scan
will conclude that the directory is dead anyway.
This doesn't work though, when the deleted directory contained a child
directory which was moved *out*. In the early stages of the fs build
we can then end up with an apparent hard link, with the child directory
appearing both in its true location, and as a child of the original
directory which are this stage of the mount process we don't *yet* know
is defunct.
To resolve this, take out the early special-casing of the "directories
shall not have hard links" rule in jffs2_build_inode_pass1(), and let the
normal nlink processing happen for directories as well as other inodes.
Then later in the build process we can set ic->pino_nlink to the parent
inode#, as is required for directories during normal operaton, instead
of the nlink. And complain only *then* about hard links which are still
in evidence even after killing off all the unreachable paths.
Reported-by: Liu Song <liu.song11@zte.com.cn>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org
With this fix, all code paths should now be obtaining the page lock before
f->sem.
Reported-by: Szabó Tamás <sztomi89@gmail.com>
Tested-by: Thomas Betker <thomas.betker@rohde-schwarz.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org
This reverts commit 5ffd3412ae
("jffs2: Fix lock acquisition order bug in jffs2_write_begin").
The commit modified jffs2_write_begin() to remove a deadlock with
jffs2_garbage_collect_live(), but this introduced new deadlocks found
by multiple users. page_lock() actually has to be called before
mutex_lock(&c->alloc_sem) or mutex_lock(&f->sem) because
jffs2_write_end() and jffs2_readpage() are called with the page locked,
and they acquire c->alloc_sem and f->sem, resp.
In other words, the lock order in jffs2_write_begin() was correct, and
it is the jffs2_garbage_collect_live() path that has to be changed.
Revert the commit to get rid of the new deadlocks, and to clear the way
for a better fix of the original deadlock.
Reported-by: Deng Chao <deng.chao1@zte.com.cn>
Reported-by: Ming Liu <liu.ming50@gmail.com>
Reported-by: wangzaiwei <wangzaiwei@top-vision.cn>
Signed-off-by: Thomas Betker <thomas.betker@rohde-schwarz.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org
Pull final vfs updates from Al Viro:
- The ->i_mutex wrappers (with small prereq in lustre)
- a fix for too early freeing of symlink bodies on shmem (they need to
be RCU-delayed) (-stable fodder)
- followup to dedupe stuff merged this cycle
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: abort dedupe loop if fatal signals are pending
make sure that freeing shmem fast symlinks is RCU-delayed
wrappers for ->i_mutex access
lustre: remove unused declaration
There are many locations that do
if (memory_was_allocated_by_vmalloc)
vfree(ptr);
else
kfree(ptr);
but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory
using is_vmalloc_addr(). Unless callers have special reasons, we can
replace this branch with kvfree(). Please check and reply if you found
problems.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jan Kara <jack@suse.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Acked-by: David Rientjes <rientjes@google.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Boris Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).
Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg. For the list, see below:
- threadinfo
- task_struct
- task_delay_info
- pid
- cred
- mm_struct
- vm_area_struct and vm_region (nommu)
- anon_vma and anon_vma_chain
- signal_struct
- sighand_struct
- fs_struct
- files_struct
- fdtable and fdtable->full_fds_bits
- dentry and external_name
- inode for all filesystems. This is the most tedious part, because
most filesystems overwrite the alloc_inode method.
The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds. Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Generic MTD
* populate the MTD device 'of_node' field (and get a proper 'of_node' symlink
in sysfs)
- This yielded some new helper functions, and changes across a variety of
drivers
* partitioning cleanups, to prepare for better device-tree based partitioning
in the future
- Eliminate a lot of boilerplate for drivers that want to use OF-based
partition parsing
- The DT bindings for this didn't settle yet, so most non-cleanup portions
are deferred for a future release
NAND
* embed a struct mtd_info inside struct nand_chip
- This is really long overdue; too many drivers have to do the same silly
boilerplate to allocate and link up two "independent" structs, when in
fact, everyone is assuming there is an exact 1:1 relationship between a
NAND chips struct and its underlying MTD. This aids improved helpers and
should make certain abstractions easier in the future.
- Also causes a lot of churn, helped along by some automated code
transformations
* add more core support for detecting (and "correcting") bitflips in erased
pages; requires opt-in by drivers, but at least we kill a few bad
implementations and hopefully stave off future ones
* pxa3xx_nand: cleanups, a few fixes, and PM improvements
* new JZ4780 NAND driver
SPI NOR
* provide default erase function, for controllers that just want to send the
SECTOR_ERASE command directly
* fix some module auto-loading issues with device tree ("jedec,spi-nor")
* error handling fixes
* new Mediatek QSPI flash driver
Other
* cfi: force valid geometry Kconfig (finally!)
- this one used to trip up randconfigs occasionally, since bots aren't
deterred by big scary "advanced configuration" menus
More? Probably. See the commit logs.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWlZSyAAoJEFySrpd9RFgtVkEP/096SsFs+71jPgc6ARHIKt6j
b8vnGeMpjYErIjRZXjeLJvD4whYcooM1IDXiy7HPFEF/DrmN+dBjF6gYNHhC9n+X
bpIVS+TJknBKqcAClPVArZI77Fugzn1jBuo8t8rH5bpHNvTfucOt0VmQAu/NRxXe
qUHNXDdQ+nb7qUy47Us66ygDiUh6tXYD2gp3duE8W49FnoSxcXNTq467RGyVBREl
U+5mxsD4y49mktG3xatYdwmvz9UTWOYa+u4NTG282kP6GrqTPTw0ESaxfViP9nyd
WSexhylV/PFJjxyMRazy1EZJ3kiQCQvza/VMmZ6fDSzwZ4JephuCDGLd3MhE3/E7
C275DNHC6U0s79gSlihrYHKEkMN8uNUDwNfeZhY05jXL56kFxUlUvvq14bt4gDDK
xrPLNKLOpcoAMiRFOM5NdBeyQx9REq1G09I8MBjBN0kW254lIjGm6naCOl0L/Fu1
Y4c7hJCLFUTrki9DeNcP6dtpSCVdOap4C+920FZyrBnUzpt5G/rUGdSQu0iY5i4j
BHe/Pr6ybkQ+b7XZbJlbFOy/u6TL/3T9Z4L52RK4j6F3r2EkNiH0+892L4v4qih6
Bt2wsFw8gGHC3dnnh5yfyst/QaG2Mu7V2JJjrrlERjWQJepWqYnUjlH/PgwKj7t0
/pnD40iDmUG2aYe+Jmcv
=wFQI
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20160112' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
"Generic MTD:
- populate the MTD device 'of_node' field (and get a proper 'of_node'
symlink in sysfs)
This yielded some new helper functions, and changes across a
variety of drivers
- partitioning cleanups, to prepare for better device-tree based
partitioning in the future
Eliminate a lot of boilerplate for drivers that want to use
OF-based partition parsing
The DT bindings for this didn't settle yet, so most non-cleanup
portions are deferred for a future release
NAND:
- embed a struct mtd_info inside struct nand_chip
This is really long overdue; too many drivers have to do the same
silly boilerplate to allocate and link up two "independent"
structs, when in fact, everyone is assuming there is an exact 1:1
relationship between a NAND chips struct and its underlying MTD.
This aids improved helpers and should make certain abstractions
easier in the future.
Also causes a lot of churn, helped along by some automated code
transformations
- add more core support for detecting (and "correcting") bitflips in
erased pages; requires opt-in by drivers, but at least we kill a
few bad implementations and hopefully stave off future ones
- pxa3xx_nand: cleanups, a few fixes, and PM improvements
- new JZ4780 NAND driver
SPI NOR:
- provide default erase function, for controllers that just want to
send the SECTOR_ERASE command directly
- fix some module auto-loading issues with device tree
("jedec,spi-nor")
- error handling fixes
- new Mediatek QSPI flash driver
Other:
- cfi: force valid geometry Kconfig (finally!)
This one used to trip up randconfigs occasionally, since bots
aren't deterred by big scary "advanced configuration" menus
More? Probably. See the commit logs"
* tag 'for-linus-20160112' of git://git.infradead.org/linux-mtd: (168 commits)
mtd: jz4780_nand: replace if/else blocks with switch/case
mtd: nand: jz4780: Update ecc correction error codes
mtd: nandsim: use nand_get_controller_data()
mtd: jz4780_nand: remove useless mtd->priv = chip assignment
staging: mt29f_spinand: make use of nand_set/get_controller_data() helpers
mtd: nand: make use of nand_set/get_controller_data() helpers
ARM: make use of nand_set/get_controller_data() helpers
mtd: nand: add helpers to access ->priv
mtd: nand: jz4780: driver for NAND devices on JZ4780 SoCs
mtd: nand: jz4740: remove custom 'erased check' implementation
mtd: nand: diskonchip: remove custom 'erased check' implementation
mtd: nand: davinci: remove custom 'erased check' implementation
mtd: nand: use nand_check_erased_ecc_chunk in default ECC read functions
mtd: nand: return consistent error codes in ecc.correct() implementations
doc: dt: mtd: new binding for jz4780-{nand,bch}
mtd: cfi_cmdset_0001: fixing memory leak and handling failed kmalloc
mtd: spi-nor: wait until lock/unlock operations are ready
mtd: tests: consolidate kmalloc/memset 0 call to kzalloc
jffs2: use to_delayed_work
mtd: nand: assign reasonable default name for NAND drivers
...
Pull vfs xattr updates from Al Viro:
"Andreas' xattr cleanup series.
It's a followup to his xattr work that went in last cycle; -0.5KLoC"
* 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
xattr handlers: Simplify list operation
ocfs2: Replace list xattr handler operations
nfs: Move call to security_inode_listsecurity into nfs_listxattr
xfs: Change how listxattr generates synthetic attributes
tmpfs: listxattr should include POSIX ACL xattrs
tmpfs: Use xattr handler infrastructure
btrfs: Use xattr handler infrastructure
vfs: Distinguish between full xattr names and proper prefixes
posix acls: Remove duplicate xattr name definitions
gfs2: Remove gfs2_xattr_acl_chmod
vfs: Remove vfs_xattr_cmp
Use to_delayed_work() instead of open-coding it.
Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Change the list operation to only return whether or not an attribute
should be listed. Copying the attribute names into the buffer is moved
to the callers.
Since the result only depends on the dentry and not on the attribute
name, we do not pass the attribute name to list operations.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
new method: ->get_link(); replacement of ->follow_link(). The differences
are:
* inode and dentry are passed separately
* might be called both in RCU and non-RCU mode;
the former is indicated by passing it a NULL dentry.
* when called that way it isn't allowed to block
and should return ERR_PTR(-ECHILD) if it needs to be called
in non-RCU mode.
It's a flagday change - the old method is gone, all in-tree instances
converted. Conversion isn't hard; said that, so far very few instances
do not immediately bail out when called in RCU mode. That'll change
in the next commits.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add an additional "name" field to struct xattr_handler. When the name
is set, the handler matches attributes with exactly that name. When the
prefix is set instead, the handler matches attributes with the given
prefix and with a non-empty suffix.
This patch should avoid bugs like the one fixed in commit c361016a in
the future.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The xattr_handler operations are currently all passed a file system
specific flags value which the operations can use to disambiguate between
different handlers; some file systems use that to distinguish the xattr
namespace, for example. In some oprations, it would be useful to also have
access to the handler prefix. To allow that, pass a pointer to the handler
to operations instead of the flags value alone.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The vfs checks if a task has the appropriate access for get and set
operations, but it cannot do that for the list operation; the file system
must check for that itself.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: linux-mtd@lists.infradead.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Merge second patch-bomb from Andrew Morton:
- most of the rest of MM
- procfs
- lib/ updates
- printk updates
- bitops infrastructure tweaks
- checkpatch updates
- nilfs2 update
- signals
- various other misc bits: coredump, seqfile, kexec, pidns, zlib, ipc,
dma-debug, dma-mapping, ...
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (102 commits)
ipc,msg: drop dst nil validation in copy_msg
include/linux/zutil.h: fix usage example of zlib_adler32()
panic: release stale console lock to always get the logbuf printed out
dma-debug: check nents in dma_sync_sg*
dma-mapping: tidy up dma_parms default handling
pidns: fix set/getpriority and ioprio_set/get in PRIO_USER mode
kexec: use file name as the output message prefix
fs, seqfile: always allow oom killer
seq_file: reuse string_escape_str()
fs/seq_file: use seq_* helpers in seq_hex_dump()
coredump: change zap_threads() and zap_process() to use for_each_thread()
coredump: ensure all coredumping tasks have SIGNAL_GROUP_COREDUMP
signal: remove jffs2_garbage_collect_thread()->allow_signal(SIGCONT)
signal: introduce kernel_signal_stop() to fix jffs2_garbage_collect_thread()
signal: turn dequeue_signal_lock() into kernel_dequeue_signal()
signals: kill block_all_signals() and unblock_all_signals()
nilfs2: fix gcc uninitialized-variable warnings in powerpc build
nilfs2: fix gcc unused-but-set-variable warnings
MAINTAINERS: nilfs2: add header file for tracing
nilfs2: add tracepoints for analyzing reading and writing metadata files
...
jffs2_garbage_collect_thread() does allow_signal(SIGCONT) for no reason,
SIGCONT will wake a stopped task up even if it is ignored.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Markus Pargmann <mpa@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
jffs2_garbage_collect_thread() can race with SIGCONT and sleep in
TASK_STOPPED state after it was already sent. Add the new helper,
kernel_signal_stop(), which does this correctly.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Markus Pargmann <mpa@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1. Rename dequeue_signal_lock() to kernel_dequeue_signal(). This
matches another "for kthreads only" kernel_sigaction() helper.
2. Remove the "tsk" and "mask" arguments, they are always current
and current->blocked. And it is simply wrong if tsk != current.
3. We could also remove the 3rd "siginfo_t *info" arg but it looks
potentially useful. However we can simplify the callers if we
change kernel_dequeue_signal() to accept info => NULL.
4. Remove _irqsave, it is never called from atomic context.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Markus Pargmann <mpa@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Need to free the memory allocated for 'fd' if failed to read all
of the remainder name.
Signed-off-by: Wei Fang <fangwei1@huawei.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
c->oobbuf hasn't been kmalloced in jffs2_dataflash_setup, so
there is no need to free it.
Signed-off-by: Wei Fang <fangwei1@huawei.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
As new_valid_dev always returns 1, so !new_valid_dev check is not
needed, remove it.
Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Remove unneeded NULL test.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@ expression x; @@
-if (x != NULL)
\(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Pull more vfs updates from Al Viro:
"Assorted VFS fixes and related cleanups (IMO the most interesting in
that part are f_path-related things and Eric's descriptor-related
stuff). UFS regression fixes (it got broken last cycle). 9P fixes.
fs-cache series, DAX patches, Jan's file_remove_suid() work"
[ I'd say this is much more than "fixes and related cleanups". The
file_table locking rule change by Eric Dumazet is a rather big and
fundamental update even if the patch isn't huge. - Linus ]
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
9p: cope with bogus responses from server in p9_client_{read,write}
p9_client_write(): avoid double p9_free_req()
9p: forgetting to cancel request on interrupted zero-copy RPC
dax: bdev_direct_access() may sleep
block: Add support for DAX reads/writes to block devices
dax: Use copy_from_iter_nocache
dax: Add block size note to documentation
fs/file.c: __fget() and dup2() atomicity rules
fs/file.c: don't acquire files->file_lock in fd_install()
fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
vfs: avoid creation of inode number 0 in get_next_ino
namei: make set_root_rcu() return void
make simple_positive() public
ufs: use dir_pages instead of ufs_dir_pages()
pagemap.h: move dir_pages() over there
remove the pointless include of lglock.h
fs: cleanup slight list_entry abuse
xfs: Correctly lock inode when removing suid and file capabilities
fs: Call security_ops->inode_killpriv on truncate
fs: Provide function telling whether file_remove_privs() will do anything
...
JFFS2
* fix a theoretical unbalanced locking issue; the lock handling was a bit
unclean, but AFAICT, it didn't actually lead to real deadlocks
NAND
* brcmnand driver: new driver supporting NAND controller found originally on
Broadcom STB SoCs (BCM7xxx), but now also found on BCM63xxx, iProc (e.g.,
Cygnus, BCM5301x), BCM3xxx, and more
* Begin factoring out BBT code so it can be shared between traditional
(parallel) NAND drivers and upcoming SPI NAND drivers (WIP)
* Add common DT-based init support, so nand_base can pick up some flash
properties automatically, using established common NAND DT properties
* mxc_nand: support 8-bit ECC
* pxa3xx_nand:
- fix build for ARM64
- use a jiffies-based timeout
SPI NOR
* Add a few new IDs
* Clear out some unnecessary entries
* Make sure SECT_4K flags are correct for all (?) entries
Core
* Fix mtd->usecount race conditions (BUG_ON())
* Switch to modern PM ops
Other
* CFI: save code space by de-inlining large functions
* Clean up some partition parser selection code across several drivers
* Various miscellaneous changes, mostly minor
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJViZdKAAoJEFySrpd9RFgtaqoQAINkYUdxa8mOlmXQSIhWz19K
5FJqJN+/lgrYmIGI1SzVy/TTB/V4hWH+9h1snU98ToaHFACqbKKeo6FLs6GRd+WJ
adR9H4PUjtAZOt8trKzzygJnqvRPDWnn6H+0urxFv7kpyPTEorifiiI6+o3AM995
vh+YsLJNFYHucIzi4TVkgwssLYi8I9TOb35pDnW2uT/ikDi+4Uu+Ocq6u7SuMPXV
Zch6DcmBhtTQ9ghBwF/dMMAhyyMLyY4q0+CEzmJGxNU+g2o1njOTUDBv+lLqxxMB
fUXMAPXIT3ndhWjMmzklG7AjOorbg44aG2UlqJWXx00VYJKNf+pljpgdHOXKiuWo
xYkqOYnEM8gZ4qYNInKbbv34Q2+EjCxg+aAWxh+yRCJrfnF3p5/QSMVL2P2JNrc3
Kg8W7pW42xOeGxTYpydBtvpq+41qRtdWMgNp47PHvKF0mhpeSCCvRf/YfU4cNcfv
WfQzBLy/d7RMVFyvACdvzerF/fh9YX3yxV0B0LstytCSZO13617F6HHpm1pYfP0v
d9GpLpdiFktoG/AXpPzlSl992ALf0SQaedamuxApfvLVymDkG9xoO0P5p6SbOiJ0
QnBvEMkswEf7ExPzr5SYufSE93wikAEsAfjsLOHo+FQQ57eaRgpCOZHHgfuwcTra
xUr6u2Rq9iDenhVYDqJU
=JN5N
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20150623' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
"JFFS2:
- fix a theoretical unbalanced locking issue; the lock handling was a
bit unclean, but AFAICT, it didn't actually lead to real deadlocks
NAND:
- brcmnand driver: new driver supporting NAND controller found
originally on Broadcom STB SoCs (BCM7xxx), but now also found on
BCM63xxx, iProc (e.g., Cygnus, BCM5301x), BCM3xxx, and more
- begin factoring out BBT code so it can be shared between
traditional (parallel) NAND drivers and upcoming SPI NAND drivers
(WIP)
- add common DT-based init support, so nand_base can pick up some
flash properties automatically, using established common NAND DT
properties
- mxc_nand: support 8-bit ECC
- pxa3xx_nand:
* fix build for ARM64
* use a jiffies-based timeout
SPI NOR:
- add a few new IDs
- clear out some unnecessary entries
- make sure SECT_4K flags are correct for all (?) entries
Core:
- fix mtd->usecount race conditions (BUG_ON())
- switch to modern PM ops
Other:
- CFI: save code space by de-inlining large functions
- clean up some partition parser selection code across several
drivers
- various miscellaneous changes, mostly minor"
* tag 'for-linus-20150623' of git://git.infradead.org/linux-mtd: (57 commits)
mtd: docg3: Fix kasprintf() usage
mtd: docg3: Don't leak docg3->bbt in error path
mtd: nandsim: Fix kasprintf() usage
mtd: cs553x_nand: Fix kasprintf() usage
mtd: r852: Fix device_create_file() usage
mtd: brcmnand: drop unnecessary initialization
mtd: propagate error codes from add_mtd_device()
mtd: diskonchip: remove two-phase partitioning / registration
mtd: dc21285: use raw spinlock functions for nw_gpio_lock
mtd: chips: fixup dependencies, to prevent build error
mtd: cfi_cmdset_0002: Initialize datum before calling map_word_load_partial
mtd: cfi: deinline large functions
mtd: lantiq-flash: use default partition parsers
mtd: plat_nand: use default partition probe
mtd: nand: correct indentation within conditional
mtd: remove incorrect file name
mtd: blktrans: use better error code for unimplemented ioctl()
mtd: maps: Spelling s/reseved/reserved/
mtd: blktrans: change blktrans_getgeo return value
mtd: mxc_nand: generate nand_ecclayout for 8 bit ECC
...
list_entry is just a wrapper for container_of, but it is arguably
wrong (and slightly confusing) to use it when the pointed-to struct
member is not a struct list_head. Use container_of directly instead.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Li Zefan reported an unbalanced locking issue, found by his
internal debugging feature on runtime. The particular case he was
looking at doesn't lead to a deadlock, as the structure that this lock
is embedded in is freed on error. But we should straighten out the error
handling.
Because several callers of jffs2_do_read_inode_internal() /
jffs2_do_read_inode() already handle the locking/unlocking and inode
clearing at their own level, let's just push any unlocks/clearing down
to the caller. This consistency is much easier to verify.
Reported-by: Li Zefan <lizefan@huawei.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Pull fourth vfs update from Al Viro:
"d_inode() annotations from David Howells (sat in for-next since before
the beginning of merge window) + four assorted fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
RCU pathwalk breakage when running into a symlink overmounting something
fix I_DIO_WAKEUP definition
direct-io: only inc/dec inode->i_dio_count for file systems
fs/9p: fix readdir()
VFS: assorted d_backing_inode() annotations
VFS: fs/inode.c helpers: d_inode() annotations
VFS: fs/cachefiles: d_backing_inode() annotations
VFS: fs library helpers: d_inode() annotations
VFS: assorted weird filesystems: d_inode() annotations
VFS: normal filesystems (and lustre): d_inode() annotations
VFS: security/: d_inode() annotations
VFS: security/: d_backing_inode() annotations
VFS: net/: d_inode() annotations
VFS: net/unix: d_backing_inode() annotations
VFS: kernel/: d_inode() annotations
VFS: audit: d_backing_inode() annotations
VFS: Fix up some ->d_inode accesses in the chelsio driver
VFS: Cachefiles should perform fs modifications on the top layer only
VFS: AF_UNIX sockets should call mknod on the top layer only
* Add Kconfig option for keeping both the 'master' and 'partition' MTDs
registered as devices. This would really make a better default if we could
do it over, as it allows a lot more flexibility in (1) determining the flash
topology of the system from user-space and (2) adding temporary partitions
at runtime (ioctl(BLKPG)). Unfortunately, this would possibly cause
user-space breakage, as it will cause renumbering of the /dev/mtdX devices.
We'll see if we can change this in the future, as there have already been a
few people looking for this feature, and I know others have just been
working around our current limitations instead of fixing them this way.
* Along with the previous change, add some additional information to sysfs, so
user-space can read the offset of each partition within its master device
SPI NOR:
* add new device tree compatible binding to represent the mostly-compatible
class of SPI NOR flash which can be detected by their extended JEDEC ID
bytes, cutting down the duplication of our ID tables
* misc. new IDs
Various other miscellaneous fixes and changes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVN9ypAAoJEFySrpd9RFgtaUQQAKmlCVMrxAKtF6U5jpzf07hA
7ZrcMdUTSwS++dBIAgDl6JSuSGT5KRLrS1FOp60p+VAjbD9VFcRLUUQxahXW1tAh
Dr8a3Akwd+lgIp77bZhWBY35dXmjIJ1GSzo7jdbJMDwAeDd3gBeSFTDoePsrCt6K
0/NPOsQzCFDDr1lwuQh1LzkLLQfVAC3ImNCBm5smvyEfhxXqzC02HOLf8Z9VMGnY
OxM9i0T6Ik3xeaaP/vH91sApmdn598gP5DB5cNr61YrZeVZmEoI4EWlOmagcYVC2
Tef9Ng4YmHGXo65k7XcKRykAVWECYAGr4HKCDZ8tsbvpfdbQMS5wHEgxMsAdvb01
aChcBNxf4w/Mh49fzjZppTlPN25FERRMnXt7CkUqQkqet9uDkD/5RNPl65ermeC7
EKx2MoxnpXrfZ0EkSxqrfdzP0oQx0AqAkbCyLIN42Vbxl7ckFMN3WAPQ2NR2Aaoh
SUiKwwaFFiK+C9qEytj0s+cmKPzsTzeQVYgp9NX64EfVQumqpsfbu6XIPV+FGy2i
DvHvmTEvm4SpqMPSnhkmZ6DFSjuzvQdqzKtDyZmRppxHKgWUsXYdftGPMG0+ZbaG
t4zysWfJG897TMVYLKY9pGqvouMuAVJ4kX1+iZbJc8dr4bwIzXIYuEGPLVv58gUO
KjjlYk91/jFNmBW5anxC
=aIsV
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20150422' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
"Common MTD:
- Add Kconfig option for keeping both the 'master' and 'partition'
MTDs registered as devices. This would really make a better
default if we could do it over, as it allows a lot more flexibility
in (1) determining the flash topology of the system from user-space
and (2) adding temporary partitions at runtime (ioctl(BLKPG)).
Unfortunately, this would possibly cause user-space breakage, as it
will cause renumbering of the /dev/mtdX devices. We'll see if we
can change this in the future, as there have already been a few
people looking for this feature, and I know others have just been
working around our current limitations instead of fixing them this
way.
- Along with the previous change, add some additional information to
sysfs, so user-space can read the offset of each partition within
its master device
SPI NOR:
- add new device tree compatible binding to represent the
mostly-compatible class of SPI NOR flash which can be detected by
their extended JEDEC ID bytes, cutting down the duplication of our
ID tables
- misc. new IDs
Various other miscellaneous fixes and changes"
* tag 'for-linus-20150422' of git://git.infradead.org/linux-mtd: (53 commits)
mtd: spi-nor: Add support for Macronix mx25u6435f serial flash
mtd: spi-nor: Add support for Winbond w25q64dw serial flash
mtd: spi-nor: add support for the Winbond W25X05 flash
mtd: spi-nor: support en25s64 device
mtd: m25p80: bind to "nor-jedec" ID, for auto-detection
Documentation: devicetree: m25p80: add "nor-jedec" binding
mtd: Make MTD tests cancelable
mtd: mtd_oobtest: Fix bitflip_limit usage in test case 3
mtd: docg3: remove invalid __exit annotations
mtd: fsl_ifc_nand: use msecs_to_jiffies for time conversion
mtd: atmel_nand: don't map the ROM table if no pmecc table offset in DT
mtd: atmel_nand: add a definition for the oob reserved bytes
mtd: part: Remove partition overlap checks
mtd: part: Add sysfs variable for offset of partition
mtd: part: Create the master device node when partitioned
mtd: ts5500_flash: Fix typo in MODULE_DESCRIPTION in ts5500_flash.c
mtd: denali: Disable sub-page writes in Denali NAND driver
mtd: pxa3xx_nand: cleanup wait_for_completion handling
mtd: nand: gpmi: Check for scan_bbt() error
mtd: nand: gpmi: fixup return type of wait_for_completion_timeout
...
Pull second vfs update from Al Viro:
"Now that net-next went in... Here's the next big chunk - killing
->aio_read() and ->aio_write().
There'll be one more pile today (direct_IO changes and
generic_write_checks() cleanups/fixes), but I'd prefer to keep that
one separate"
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
->aio_read and ->aio_write removed
pcm: another weird API abuse
infinibad: weird APIs switched to ->write_iter()
kill do_sync_read/do_sync_write
fuse: use iov_iter_get_pages() for non-splice path
fuse: switch to ->read_iter/->write_iter
switch drivers/char/mem.c to ->read_iter/->write_iter
make new_sync_{read,write}() static
coredump: accept any write method
switch /dev/loop to vfs_iter_write()
serial2002: switch to __vfs_read/__vfs_write
ashmem: use __vfs_read()
export __vfs_read()
autofs: switch to __vfs_write()
new helper: __vfs_write()
switch hugetlbfs to ->read_iter()
coda: switch to ->read_iter/->write_iter
ncpfs: switch to ->read_iter/->write_iter
net/9p: remove (now-)unused helpers
p9_client_attach(): set fid->uid correctly
...
that's the bulk of filesystem drivers dealing with inodes of their own
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull trivial tree from Jiri Kosina:
"Usual trivial tree updates. Nothing outstanding -- mostly printk()
and comment fixes and unused identifier removals"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
goldfish: goldfish_tty_probe() is not using 'i' any more
powerpc: Fix comment in smu.h
qla2xxx: Fix printks in ql_log message
lib: correct link to the original source for div64_u64
si2168, tda10071, m88ds3103: Fix firmware wording
usb: storage: Fix printk in isd200_log_config()
qla2xxx: Fix printk in qla25xx_setup_mode
init/main: fix reset_device comment
ipwireless: missing assignment
goldfish: remove unreachable line of code
coredump: Fix do_coredump() comment
stacktrace.h: remove duplicate declaration task_struct
smpboot.h: Remove unused function prototype
treewide: Fix typo in printk messages
treewide: Fix typo in printk messages
mod_devicetable: fix comment for match_flags
All places outside of core VFS that checked ->read and ->write for being NULL or
called the methods directly are gone now, so NULL {read,write} with non-NULL
{read,write}_iter will do the right thing in all cases.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We know "rc" is set so there is no need to check again.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Pull more vfs updates from Al Viro:
"Assorted stuff from this cycle. The big ones here are multilayer
overlayfs from Miklos and beginning of sorting ->d_inode accesses out
from David"
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (51 commits)
autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation
procfs: fix race between symlink removals and traversals
debugfs: leave freeing a symlink body until inode eviction
Documentation/filesystems/Locking: ->get_sb() is long gone
trylock_super(): replacement for grab_super_passive()
fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
SELinux: Use d_is_positive() rather than testing dentry->d_inode
Smack: Use d_is_positive() rather than testing dentry->d_inode
TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()
Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode
Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb
VFS: Split DCACHE_FILE_TYPE into regular and special types
VFS: Add a fallthrough flag for marking virtual dentries
VFS: Add a whiteout dentry type
VFS: Introduce inode-getting helpers for layered/unioned fs environments
Infiniband: Fix potential NULL d_inode dereference
posix_acl: fix reference leaks in posix_acl_create
autofs4: Wrong format for printing dentry
...
Convert the following where appropriate:
(1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).
(2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).
(3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
complicated than it appears as some calls should be converted to
d_can_lookup() instead. The difference is whether the directory in
question is a real dir with a ->lookup op or whether it's a fake dir with
a ->d_automount op.
In some circumstances, we can subsume checks for dentry->d_inode not being
NULL into this, provided we the code isn't in a filesystem that expects
d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
use d_inode() rather than d_backing_inode() to get the inode pointer).
Note that the dentry type field may be set to something other than
DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
manages the fall-through from a negative dentry to a lower layer. In such a
case, the dentry type of the negative union dentry is set to the same as the
type of the lower dentry.
However, if you know d_inode is not NULL at the call site, then you can use
the d_is_xxx() functions even in a filesystem.
There is one further complication: a 0,0 chardev dentry may be labelled
DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
intended for special directory entry types that don't have attached inodes.
The following perl+coccinelle script was used:
use strict;
my @callers;
open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
die "Can't grep for S_ISDIR and co. callers";
@callers = <$fd>;
close($fd);
unless (@callers) {
print "No matches\n";
exit(0);
}
my @cocci = (
'@@',
'expression E;',
'@@',
'',
'- S_ISLNK(E->d_inode->i_mode)',
'+ d_is_symlink(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISDIR(E->d_inode->i_mode)',
'+ d_is_dir(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISREG(E->d_inode->i_mode)',
'+ d_is_reg(E)' );
my $coccifile = "tmp.sp.cocci";
open($fd, ">$coccifile") || die $coccifile;
print($fd "$_\n") || die $coccifile foreach (@cocci);
close($fd);
foreach my $file (@callers) {
chomp $file;
print "Processing ", $file, "\n";
system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
die "spatch failed";
}
[AV: overlayfs parts skipped]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Remove the function pulledbits() that is not used anywhere.
This was partially found by using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
schedule_delayed_work() happening when the work is already pending is
a cheap no-op. Don't bother with ->wbuf_queued logics - it's both
broken (cancelling ->wbuf_dwork leaves it set, as spotted by Jeff Harris)
and pointless. It's cheaper to let schedule_delayed_work() handle that
case.
Reported-by: Jeff Harris <jefftharris@gmail.com>
Tested-by: Jeff Harris <jefftharris@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
AMD-compatible CFI driver:
- Support OTP programming for Micron M29EW family
- Increase buffer write timeout, according to detected flash parameter info
NAND
- Add helpers for retrieving ONFI timing modes
- GPMI: provide option to disable bad block marker swapping (required for
Ka-On electronics platforms)
SPI NOR
- EON EN25QH128 support
- Support new Flag Status Register (FSR) on a few Micron flash
Common
- New sysfs entries for bad block and ECC stats
And a few miscellaneous refactorings, cleanups, and driver improvements
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJT5WCXAAoJEFySrpd9RFgt0rwP/1anAulAcve/QzVF9LDFPec8
jSvK8WWFcHLVb9EvTHtUjRz2RSRNhe0eeyEld3WpdKZZ73VVVaHnGdJv8J8Ys8jn
kfNBDfgDLFrVzycYCqjQ2gdvidCyrjQgtPP0E/Q/RN6FBur0/rp2WKoJ2FvuT6SS
kOz5f3TOe+iNtxQoJwkFvs/IjfFMThGs+YMJ8Z9s4LcJHD65T6hF+zDwl8xF2xfG
b104PsG3I58kJdYjKhRQ2/ol+YCPoVhQorhhuaqeouZum/Hb/2g3rKHVZpAv2n6m
JWnTpbdJDqGoPFVPyJr5Vm/UYOwxEBSWimuNp+2WN7EsXux1x9JZZl5+ZNUMmb4q
vxYhIDul2+Sg1lN+ruBe+xi6d8DI8Y5WIc9xJgn3YHLC8YSkiZ11bhQyyeHA9i5h
jZYKSkN/ERl8iAA4ULD6tsZv4ds8LVI/XOxrcSM7myQ4p8oY5QBxEWEuGPgyH6A6
qCVkc0TAriSPfcCBvs4o8s2uNUocq7x6ZT1xfdlJ0KVstCmeEJBJnBYwWIXR3tu7
3+I/bI41Q29ZGV8x5PEGDSgLVDNAJZnfGPuCwdMWuVD7UQuEEOJkCvy8o7C2Fold
hRXh3SlUICCGDzd0JdNmdt5hPuB0tzsG0YkRoRj2sS30TlHi77nN/m3HYi/JQ4UW
gA21laizfJ+z7g/5Kabb
=Wkmz
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20140808' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
"AMD-compatible CFI driver:
- Support OTP programming for Micron M29EW family
- Increase buffer write timeout, according to detected flash
parameter info
NAND
- Add helpers for retrieving ONFI timing modes
- GPMI: provide option to disable bad block marker swapping (required
for Ka-On electronics platforms)
SPI NOR
- EON EN25QH128 support
- Support new Flag Status Register (FSR) on a few Micron flash
Common
- New sysfs entries for bad block and ECC stats
And a few miscellaneous refactorings, cleanups, and driver
improvements"
* tag 'for-linus-20140808' of git://git.infradead.org/linux-mtd: (31 commits)
mtd: gpmi: make blockmark swapping optional
mtd: gpmi: remove line breaks from error messages and improve wording
mtd: gpmi: remove useless (void *) type casts and spaces between type casts and variables
mtd: atmel_nand: NFC: support multiple interrupt handling
mtd: atmel_nand: implement the nfc_device_ready() by checking the R/B bit
mtd: atmel_nand: add NFC status error check
mtd: atmel_nand: make ecc parameters same as definition
mtd: nand: add ONFI timing mode to nand_timings converter
mtd: nand: define struct nand_timings
mtd: cfi_cmdset_0002: fix do_write_buffer() timeout error
mtd: denali: use 8 bytes for READID command
mtd/ftl: fix the double free of the buffers allocated in build_maps()
mtd: phram: Fix whitespace issues
mtd: spi-nor: add support for EON EN25QH128
mtd: cfi_cmdset_0002: Add support for locking OTP memory
mtd: cfi_cmdset_0002: Add support for writing OTP memory
mtd: cfi_cmdset_0002: Invalidate cache after entering/exiting OTP memory
mtd: cfi_cmdset_0002: Add support for reading OTP
mtd: spi-nor: add support for flag status register on Micron chips
mtd: Account for BBT blocks when a partition is being allocated
...
Now with 64bit bzImage and kexec tools, we support ramdisk that size is
bigger than 2g, as we could put it above 4G.
Found compressed initramfs image could not be decompressed properly. It
turns out that image length is int during decompress detection, and it
will become < 0 when length is more than 2G. Furthermore, during
decompressing len as int is used for inbuf count, that has problem too.
Change len to long, that should be ok as on 32 bit platform long is
32bits.
Tested with following compressed initramfs image as root with kexec.
gzip, bzip2, xz, lzma, lzop, lz4.
run time for populate_rootfs():
size name Nehalem-EX Westmere-EX Ivybridge-EX
9034400256 root_img : 26s 24s 30s
3561095057 root_img.lz4 : 28s 27s 27s
3459554629 root_img.lzo : 29s 29s 28s
3219399480 root_img.gz : 64s 62s 49s
2251594592 root_img.xz : 262s 260s 183s
2226366598 root_img.lzma: 386s 376s 277s
2901482513 root_img.bz2 : 635s 599s
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rashika Kheria <rashika.kheria@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Cc: P J P <ppandit@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Daniel M. Weeks" <dan@danweeks.net>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix checkpatch warning:
WARNING: kfree(NULL) is safe this check is probably not required
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: linux-mtd@lists.infradead.org
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Fix checkpatch warning:
WARNING: kfree(NULL) is safe this check is probably not required
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: linux-mtd@lists.infradead.org
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Pull vfs updates from Al Viro:
"This the bunch that sat in -next + lock_parent() fix. This is the
minimal set; there's more pending stuff.
In particular, I really hope to get acct.c fixes merged this cycle -
we need that to deal sanely with delayed-mntput stuff. In the next
pile, hopefully - that series is fairly short and localized
(kernel/acct.c, fs/super.c and fs/namespace.c). In this pile: more
iov_iter work. Most of prereqs for ->splice_write with sane locking
order are there and Kent's dio rewrite would also fit nicely on top of
this pile"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (70 commits)
lock_parent: don't step on stale ->d_parent of all-but-freed one
kill generic_file_splice_write()
ceph: switch to iter_file_splice_write()
shmem: switch to iter_file_splice_write()
nfs: switch to iter_splice_write_file()
fs/splice.c: remove unneeded exports
ocfs2: switch to iter_file_splice_write()
->splice_write() via ->write_iter()
bio_vec-backed iov_iter
optimize copy_page_{to,from}_iter()
bury generic_file_aio_{read,write}
lustre: get rid of messing with iovecs
ceph: switch to ->write_iter()
ceph_sync_direct_write: stop poking into iov_iter guts
ceph_sync_read: stop poking into iov_iter guts
new helper: copy_page_from_iter()
fuse: switch to ->write_iter()
btrfs: switch to ->write_iter()
ocfs2: switch to ->write_iter()
xfs: switch to ->write_iter()
...
jffs2_garbage_collect_thread() does disallow_signal(SIGHUP) around
jffs2_garbage_collect_pass() and the comment says "We don't want SIGHUP
to interrupt us".
But disallow_signal() can't ensure that jffs2_garbage_collect_pass()
won't be interrupted by SIGHUP, the problem is that SIGHUP can be
already pending when disallow_signal() is called, and in this case any
interruptible sleep won't block.
Note: this is in fact because disallow_signal() is buggy and should be
fixed, see the next changes.
But there is another reason why disallow_signal() is wrong: SIG_IGN set
by disallow_signal() silently discards any SIGHUP which can be sent
before the next allow_signal(SIGHUP).
Change this code to use sigprocmask(SIG_UNBLOCK/SIG_BLOCK, SIGHUP).
This even matches the old (and wrong) semantics allow/disallow had when
this logic was written.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- A few SPI NOR ID definitions
- Kill the NAND "max pagesize" restriction
- Fix some x16 bus-width NAND support
- Add NAND JEDEC parameter page support
- DT bindings for NAND ECC
- GPMI NAND updates (subpage reads)
- More OMAP NAND refactoring
- New STMicro SPI NOR driver (now in 40 patches!)
- A few other random bugfixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJTP6x+AAoJEFySrpd9RFgtit0P/jLWsjMK8G2ldPC4bMZsXmDF
n3c71GcCRlUq4Qzb4rtZx9DANLAh+JyRrMOKCPg6dAMegFdmUqDOpZpNp0vF57KG
myFjqTk+n5y0tfSkWLMUFt0tQ8ArDp3IBkQCUWkD5LgG50EWmjveIQGH0kFnkE39
Kytqvw17RV7f81tIs+WvKt8++YWD2X1VTpTi0S4fx2bJ99bJDBf/GgdoQpj2oirt
igXmloUFEsob9JHZ3qumcUm9vaHwv2TiouZTvRyGdJCCoPdpJEZO4Ka6e4uAVarT
6kMKXBk3lj2GsilOSFFCNetXfy5Bf0TkJkv4rDjh3R1Y4J/hSgraVCbWXdKhb6tj
RmwesdFMjsyS4f/Rhk5PXwJgGL9uK2mi6bk/SmXU0AMgCDSa5zjshY8Wq6C6uXwk
LqlnK8l3h8Txotbc/XJIL+QGMbMkYQI8gxWTHFaqzDtkMe36mnGs9Zec3oso/s2d
CNRpq5+dMZ6qF0z3zpOQHmFbaOekivMy7kCKMXer6ONsrQBNwTdmkwy+SdqnWsLF
YdJttwV/RRcE0SRvK6GrhvzkGlV83Z8RPny6hC1kbrgQ0ffoy2CmIqyWNObK1RXf
sYqoF8TCtR6Y8rHHi5dzZ1lria+nm8pb4+UfQLRI0mK8og7YW+fIfHXxqRrZZIrt
No8NCPBKWzyew2UE0AiQ
=CKih
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20140405' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
- A few SPI NOR ID definitions
- Kill the NAND "max pagesize" restriction
- Fix some x16 bus-width NAND support
- Add NAND JEDEC parameter page support
- DT bindings for NAND ECC
- GPMI NAND updates (subpage reads)
- More OMAP NAND refactoring
- New STMicro SPI NOR driver (now in 40 patches!)
- A few other random bugfixes
* tag 'for-linus-20140405' of git://git.infradead.org/linux-mtd: (120 commits)
Fix index regression in nand_read_subpage
mtd: diskonchip: mem resource name is not optional
mtd: nand: fix mention to CONFIG_MTD_NAND_ECC_BCH
mtd: nand: fix GET/SET_FEATURES address on 16-bit devices
mtd: omap2: Use devm_ioremap_resource()
mtd: denali_dt: Use devm_ioremap_resource()
mtd: devices: elm: update DRIVER_NAME as "omap-elm"
mtd: devices: elm: configure parallel channels based on ecc_steps
mtd: devices: elm: clean elm_load_syndrome
mtd: devices: elm: check for hardware engine's design constraints
mtd: st_spi_fsm: Succinctly reorganise .remove()
mtd: st_spi_fsm: Allow loop to run at least once before giving up CPU
mtd: st_spi_fsm: Correct vendor name spelling issue - missing "M"
mtd: st_spi_fsm: Avoid duplicating MTD core code
mtd: st_spi_fsm: Remove useless consts from function arguments
mtd: st_spi_fsm: Convert ST SPI FSM (NOR) Flash driver to new DT partitions
mtd: st_spi_fsm: Move runtime configurable msg sequences into device's struct
mtd: st_spi_fsm: Supply the W25Qxxx chip specific configuration call-back
mtd: st_spi_fsm: Supply the S25FLxxx chip specific configuration call-back
mtd: st_spi_fsm: Supply the MX25xxx chip specific configuration call-back
...
and COLLAPSE_RANGE fallocate operations, and scalability improvements
in the jbd2 layer and in xattr handling when the extended attributes
spill over into an external block.
Other than that, the usual clean ups and minor bug fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJTPbD2AAoJENNvdpvBGATwDmUQANSfGYIQazB8XKKgtNTMiG/Y
Ky7n1JzN9lTX/6nMsqQnbfCweLRmxqpWUBuyKDRHUi8IG0/voXSTFsAOOgz0R15A
ERRRWkVvHixLpohuL/iBdEMFHwNZYPGr3jkm0EIgzhtXNgk5DNmiuMwvHmCY27kI
kdNZIw9fip/WRNoFLDBGnLGC37aanoHhCIbVlySy5o9LN1pkC8BgXAYV0Rk19SVd
bWCudSJEirFEqWS5H8vsBAEm/ioxTjwnNL8tX8qms6orZ6h8yMLFkHoIGWPw3Q15
a0TSUoMyav50Yr59QaDeWx9uaPQVeK41wiYFI2rZOnyG2ts0u0YXs/nLwJqTovgs
rzvbdl6cd3Nj++rPi97MTA7iXK96WQPjsDJoeeEgnB0d/qPyTk6mLKgftzLTNgSa
ZmWjrB19kr6CMbebMC4L6eqJ8Fr66pCT8c/iue8wc4MUHi7FwHKH64fqWvzp2YT/
+165dqqo2JnUv7tIp6sUi1geun+bmDHLZFXgFa7fNYFtcU3I+uY1mRr3eMVAJndA
2d6ASe/KhQbpVnjKJdQ8/b833ZS3p+zkgVPrd68bBr3t7gUmX91wk+p1ct6rUPLr
700F+q/pQWL8ap0pU9Ht/h3gEJIfmRzTwxlOeYyOwDseqKuS87PSB3BzV3dDunSU
DrPKlXwIgva7zq5/S0Vr
=4s1Z
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"Major changes for 3.14 include support for the newly added ZERO_RANGE
and COLLAPSE_RANGE fallocate operations, and scalability improvements
in the jbd2 layer and in xattr handling when the extended attributes
spill over into an external block.
Other than that, the usual clean ups and minor bug fixes"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (42 commits)
ext4: fix premature freeing of partial clusters split across leaf blocks
ext4: remove unneeded test of ret variable
ext4: fix comment typo
ext4: make ext4_block_zero_page_range static
ext4: atomically set inode->i_flags in ext4_set_inode_flags()
ext4: optimize Hurd tests when reading/writing inodes
ext4: kill i_version support for Hurd-castrated file systems
ext4: each filesystem creates and uses its own mb_cache
fs/mbcache.c: doucple the locking of local from global data
fs/mbcache.c: change block and index hash chain to hlist_bl_node
ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate
ext4: refactor ext4_fallocate code
ext4: Update inode i_size after the preallocation
ext4: fix partial cluster handling for bigalloc file systems
ext4: delete path dealloc code in ext4_ext_handle_uninitialized_extents
ext4: only call sync_filesystm() when remounting read-only
fs: push sync_filesystem() down to the file system's remount_fs()
jbd2: improve error messages for inconsistent journal heads
jbd2: minimize region locked by j_list_lock in jbd2_journal_forget()
jbd2: minimize region locked by j_list_lock in journal_get_create_access()
...
This patch removes read_cache_page_async() which wasn't really needed
anywhere and simplifies the code around it a bit.
read_cache_page_async() is useful when we want to read a page into the
cache without waiting for it to complete. This happens when the
appropriate callback 'filler' doesn't complete its read operation and
releases the page lock immediately, and instead queues a different
completion routine to do that. This never actually happened anywhere in
the code.
read_cache_page_async() had 3 different callers:
- read_cache_page() which is the sync version, it would just wait for
the requested read to complete using wait_on_page_read().
- JFFS2 would call it from jffs2_gc_fetch_page(), but the filler
function it supplied doesn't do any async reads, and would complete
before the filler function returns - making it actually a sync read.
- CRAMFS would call it using the read_mapping_page_async() wrapper, with
a similar story to JFFS2 - the filler function doesn't do anything that
reminds async reads and would always complete before the filler function
returns.
To sum it up, the code in mm/filemap.c never took advantage of having
read_cache_page_async(). While there are filler callbacks that do async
reads (such as the block one), we always called it with the
read_cache_page().
This patch adds a mandatory wait for read to complete when adding a new
page to the cache, and removes read_cache_page_async() and its wrappers.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reclaim will be leaving shadow entries in the page cache radix tree upon
evicting the real page. As those pages are found from the LRU, an
iput() can lead to the inode being freed concurrently. At this point,
reclaim must no longer install shadow pages because the inode freeing
code needs to ensure the page tree is really empty.
Add an address_space flag, AS_EXITING, that the inode freeing code sets
under the tree lock before doing the final truncate. Reclaim will check
for this flag before installing shadow pages.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Metin Doslu <metin@citusdata.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ozgun Erdogan <ozgun@citusdata.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If jffs2_new_inode() succeeds, it returns with f->sem held, and the caller
is responsible for releasing the lock. If it fails, it still returns with
the lock held, but the caller won't release the lock, which will lead to
deadlock.
Fix it by releasing the lock in jffs2_new_inode() on error.
Signed-off-by: Wang Guoli <andy.wangguoli@huawei.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Wang Guoli <andy.wangguoli@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[Brian: not marked for stable; no one observed deadlock, and I don't
think it can happen here]
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
@wait is a local variable, so if we don't remove it from the wait queue
list, later wake_up() may end up accessing invalid memory.
This was spotted by eyes.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
- Add me (Brian Norris) as an additional MTD maintainer (it'd be nice to get
David's "ack" for this; I'm sure he approves, but he's been pretty silent
lately)
- Add Ezequiel Garcie as maintainer for the pxa3xx NAND driver
- Last (?) round of pxa3xx improvements for supporting Armada 370/XP
- Typical churn in driver boilerplate (OOM messages, printk()'s, devm_*, etc.)
- Quad read mode support for SPI NOR driver (m25p80)
- Update Davinci NAND driver to prepare for use on new platforms
- Begin to kill off NAND_MAX_{PAGE,OOB}SIZE macros; more work is pending
- Miscellaneous NAND device support (new IDs)
- Add READ RETRY support for Micron MLC NAND
- Support new GPMI NAND ECC layout device-tree binding
- Avoid mapping stack/vmalloc() memory for GPMI NAND DMA
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJS50wMAAoJEFySrpd9RFgt1I0P/3fxzmHMmTRPjNRmtVqE5buW
L3CjI1sPypA5PgCjjwABQdiw8WztBt6KVlMv5Gs6gaoi/WYAyAWz40PfwuBWFklD
L8+cuKuuslsFfqh+gdkrBkNKnrFfMJpTDx57E5+dciInmWWB1w14sHMAYk5A/ZHK
MBfXCcgfIECeQp1zSKmBtQxkhqn20hOS232gsCRWu2NWWu4UCzEcRLHGCBP1qqi3
6joQhrICiQJwH09D+I9YfyS/oisM+75Df+79xephcjMaJYAAE3tLVrKM84CBso+R
R/wjOvX+xFVyvf3n+CftxbmbFVXtRAWqCEEHu2yKffiF9oipp9xOm6gqJb9SYX/7
n54fexx5cM1AE8OeDCxgbVCNfgoqDS6hk03d5oVMRmASShr0Ye4irG77Dy/KTMGC
UdB2gb5HoKPC2N2IWWRlpwRoyLR9Xhf/+iCHjac0kB7oYQ0IdMyC0d4h7Prp5r/4
zaz/pBNNr1lL8DOFrIog02U1DL7jOBtMmlaCUnfrPkXOiwG04Inkfy7Gfug0uTIh
o1JhwCWmrY/CLYmpMqKFK9V4gr72CZTGIjMeNpb2NUAM0XlRbf3AQ8x7P7Q9UjK6
ARZC/3lPrkSE8BDgSP8cT/oXB5zsIAU0jJbu3yIixD/v9WwMyKjoS4E8crl7qVB4
KDAzQNR3rjwmE9lOjtsx
=3nT9
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20140127' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
- Add me (Brian Norris) as an additional MTD maintainer (it'd be nice to get
David's "ack" for this; I'm sure he approves, but he's been pretty silent
lately)
- Add Ezequiel Garcie as maintainer for the pxa3xx NAND driver
- Last (?) round of pxa3xx improvements for supporting Armada 370/XP
- Typical churn in driver boilerplate (OOM messages, printk()'s, devm_*, etc.)
- Quad read mode support for SPI NOR driver (m25p80)
- Update Davinci NAND driver to prepare for use on new platforms
- Begin to kill off NAND_MAX_{PAGE,OOB}SIZE macros; more work is pending
- Miscellaneous NAND device support (new IDs)
- Add READ RETRY support for Micron MLC NAND
- Support new GPMI NAND ECC layout device-tree binding
- Avoid mapping stack/vmalloc() memory for GPMI NAND DMA
* tag 'for-linus-20140127' of git://git.infradead.org/linux-mtd: (151 commits)
mtd: gpmi: add sanity check when mapping DMA for read_buf/write_buf
mtd: gpmi: allocate a proper buffer for non ECC read/write
mtd: m25p80: Set rx_nbits for Quad SPI transfers
mtd: m25p80: Enable Quad SPI read transfers for s25fl512s
mtd: s3c2410: Merge plat/regs-nand.h into s3c2410.c
mtd: mtdram: add missing 'const'
mtd: m25p80: assign default read command
mtd: nuc900_nand: remove redundant return value check of platform_get_resource()
mtd: plat_nand: remove redundant return value check of platform_get_resource()
mtd: nand: add Intel manufacturer ID
mtd: nand: add SanDisk manufacturer ID
mtd: nand: add support for Samsung K9LCG08U0B
mtd: nand: pxa3xx: Add support for 2048 bytes page size devices
mtd: m25p80: Use OPCODE_QUAD_READ_4B for 4-byte addressing
mtd: nand: don't use {read,write}_buf for 8-bit transfers
mtd: nand: use __packed shorthand
mtd: nand: support Micron READ RETRY
mtd: nand: add generic READ RETRY support
mtd: nand: add ONFI vendor block for Micron
mtd: nand: localize ECC failures per page
...
Pull vfs updates from Al Viro:
"Assorted stuff; the biggest pile here is Christoph's ACL series. Plus
assorted cleanups and fixes all over the place...
There will be another pile later this week"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (43 commits)
__dentry_path() fixes
vfs: Remove second variable named error in __dentry_path
vfs: Is mounted should be testing mnt_ns for NULL or error.
Fix race when checking i_size on direct i/o read
hfsplus: remove can_set_xattr
nfsd: use get_acl and ->set_acl
fs: remove generic_acl
nfs: use generic posix ACL infrastructure for v3 Posix ACLs
gfs2: use generic posix ACL infrastructure
jfs: use generic posix ACL infrastructure
xfs: use generic posix ACL infrastructure
reiserfs: use generic posix ACL infrastructure
ocfs2: use generic posix ACL infrastructure
jffs2: use generic posix ACL infrastructure
hfsplus: use generic posix ACL infrastructure
f2fs: use generic posix ACL infrastructure
ext2/3/4: use generic posix ACL infrastructure
btrfs: use generic posix ACL infrastructure
fs: make posix_acl_create more useful
fs: make posix_acl_chmod more useful
...
Also don't bother to set up a .get_acl method for symlinks as we do not
support access control (ACLs or even mode bits) for symlinks in Linux.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Rename the current posix_acl_created to __posix_acl_create and add
a fully featured helper to set up the ACLs on file creation that
uses get_acl().
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Rename the current posix_acl_chmod to __posix_acl_chmod and add
a fully featured ACL chmod helper that uses the ->set_acl inode
operation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead
of opencoding an alternate postorder iteration that modifies the tree
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NULL return of kmem_cache_zalloc should be handled in jffs2_alloc_xattr_datum
and jff2_alloc_xattr_ref.
Signed-off-by: Zhouyi Zhou <yizhouzhou@ict.ac.cn>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
We should not support the MLC nand for jffs2. So if the nand type is
MLC, we quit immediatly.
Signed-off-by: Huang Shijie <b32955@freescale.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Modify the request_module to prefix the file system type with "fs-"
and add aliases to all of the filesystems that can be built as modules
to match.
A common practice is to build all of the kernel code and leave code
that is not commonly needed as modules, with the result that many
users are exposed to any bug anywhere in the kernel.
Looking for filesystems with a fs- prefix limits the pool of possible
modules that can be loaded by mount to just filesystems trivially
making things safer with no real cost.
Using aliases means user space can control the policy of which
filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
with blacklist and alias directives. Allowing simple, safe,
well understood work-arounds to known problematic software.
This also addresses a rare but unfortunate problem where the filesystem
name is not the same as it's module name and module auto-loading
would not work. While writing this patch I saw a handful of such
cases. The most significant being autofs that lives in the module
autofs4.
This is relevant to user namespaces because we can reach the request
module in get_fs_type() without having any special permissions, and
people get uncomfortable when a user specified string (in this case
the filesystem type) goes all of the way to request_module.
After having looked at this issue I don't think there is any
particular reason to perform any filtering or permission checks beyond
making it clear in the module request that we want a filesystem
module. The common pattern in the kernel is to call request_module()
without regards to the users permissions. In general all a filesystem
module does once loaded is call register_filesystem() and go to sleep.
Which means there is not much attack surface exposed by loading a
filesytem module unless the filesystem is mounted. In a user
namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
which most filesystems do not set today.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reported-by: Kees Cook <keescook@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.
The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.
Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.
PS: the next vfs pile will be xattr stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...
The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.
CC: David Woodhouse <dwmw2@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Users of jffs2_do_reserve_space() expect they still held
erase_completion_lock after call to it. But there is a path
where jffs2_do_reserve_space() leaves erase_completion_lock unlocked.
The patch fixes it.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: stable@vger.kernel.org
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
jffs2_write_begin() first acquires the page lock, then f->sem. This
causes an AB-BA deadlock with jffs2_garbage_collect_live(), which first
acquires f->sem, then the page lock:
jffs2_garbage_collect_live
mutex_lock(&f->sem) (A)
jffs2_garbage_collect_dnode
jffs2_gc_fetch_page
read_cache_page_async
do_read_cache_page
lock_page(page) (B)
jffs2_write_begin
grab_cache_page_write_begin
find_lock_page
lock_page(page) (B)
mutex_lock(&f->sem) (A)
We fix this by restructuring jffs2_write_begin() to take f->sem before
the page lock. However, we make sure that f->sem is not held when
calling jffs2_reserve_space(), as this is not permitted by the locking
rules.
The deadlock above was observed multiple times on an SoC with a dual
ARMv7 (Cortex-A9), running the long-term 3.4.11 kernel; it occurred
when using scp to copy files from a host system to the ARM target
system. The fix was heavily tested on the same target system.
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Betker <thomas.betker@rohde-schwarz.com>
Acked-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
rbtree users must use the documented APIs to manipulate the tree
structure. Low-level helpers to manipulate node colors and parenthood are
not part of that API, so move them to lib/rbtree.c
[dwmw2@infradead.org: fix jffs2 build issue due to renamed __rb_parent_color field]
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull vfs update from Al Viro:
- big one - consolidation of descriptor-related logics; almost all of
that is moved to fs/file.c
(BTW, I'm seriously tempted to rename the result to fd.c. As it is,
we have a situation when file_table.c is about handling of struct
file and file.c is about handling of descriptor tables; the reasons
are historical - file_table.c used to be about a static array of
struct file we used to have way back).
A lot of stray ends got cleaned up and converted to saner primitives,
disgusting mess in android/binder.c is still disgusting, but at least
doesn't poke so much in descriptor table guts anymore. A bunch of
relatively minor races got fixed in process, plus an ext4 struct file
leak.
- related thing - fget_light() partially unuglified; see fdget() in
there (and yes, it generates the code as good as we used to have).
- also related - bits of Cyrill's procfs stuff that got entangled into
that work; _not_ all of it, just the initial move to fs/proc/fd.c and
switch of fdinfo to seq_file.
- Alex's fs/coredump.c spiltoff - the same story, had been easier to
take that commit than mess with conflicts. The rest is a separate
pile, this was just a mechanical code movement.
- a few misc patches all over the place. Not all for this cycle,
there'll be more (and quite a few currently sit in akpm's tree)."
Fix up trivial conflicts in the android binder driver, and some fairly
simple conflicts due to two different changes to the sock_alloc_file()
interface ("take descriptor handling from sock_alloc_file() to callers"
vs "net: Providing protocol type via system.sockprotoname xattr of
/proc/PID/fd entries" adding a dentry name to the socket)
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
MAX_LFS_FILESIZE should be a loff_t
compat: fs: Generic compat_sys_sendfile implementation
fs: push rcu_barrier() from deactivate_locked_super() to filesystems
btrfs: reada_extent doesn't need kref for refcount
coredump: move core dump functionality into its own file
coredump: prevent double-free on an error path in core dumper
usb/gadget: fix misannotations
fcntl: fix misannotations
ceph: don't abuse d_delete() on failure exits
hypfs: ->d_parent is never NULL or negative
vfs: delete surplus inode NULL check
switch simple cases of fget_light to fdget
new helpers: fdget()/fdput()
switch o2hb_region_dev_write() to fget_light()
proc_map_files_readdir(): don't bother with grabbing files
make get_file() return its argument
vhost_set_vring(): turn pollstart/pollstop into bool
switch prctl_set_mm_exe_file() to fget_light()
switch xfs_find_handle() to fget_light()
switch xfs_swapext() to fget_light()
...
There's no reason to call rcu_barrier() on every
deactivate_locked_super(). We only need to make sure that all delayed rcu
free inodes are flushed before we destroy related cache.
Removing rcu_barrier() from deactivate_locked_super() affects some fast
paths. E.g. on my machine exit_group() of a last process in IPC
namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
JFFS2 was designed without thought for OOB bitflips, it seems, but they
can occur and will be reported to JFFS2 via mtd_read_oob()[1]. We don't
want to fail on these transactions, since the data was corrected.
[1] Few drivers report bitflips for OOB-only transactions. With such
drivers, this patch should have no effect.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This patch fixes regression introduced by
"8bdc81c jffs2: get rid of jffs2_sync_super". We submit a delayed work in order
to make sure the write-buffer is synchronized at some point. But we do not
flush it when we unmount, which causes an oops when we unmount the file-system
and then the delayed work is executed.
This patch fixes the issue by adding a "cancel_delayed_work_sync()" infocation
in the '->sync_fs()' handler. This will make sure the delayed work is canceled
on sync, unmount and re-mount. And because VFS always callse 'sync_fs()' before
unmounting or remounting, this fixes the issue.
Reported-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: stable@vger.kernel.org [3.5+]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Tested-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
- General routine uid/gid conversion work
- When storing posix acls treat ACL_USER and ACL_GROUP separately
so I can call from_kuid or from_kgid as appropriate.
- When reading posix acls treat ACL_USER and ACL_GROUP separately
so I can call make_kuid or make_kgid as appropriate.
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
- Pass the user namespace the uid and gid values in the xattr are stored
in into posix_acl_from_xattr.
- Pass the user namespace kuid and kgid values should be converted into
when storing uid and gid values in an xattr in posix_acl_to_xattr.
- Modify all callers of posix_acl_from_xattr and posix_acl_to_xattr to
pass in &init_user_ns.
In the short term this change is not strictly needed but it makes the
code clearer. In the longer term this change is necessary to be able to
mount filesystems outside of the initial user namespace that natively
store posix acls in the linux xattr format.
Cc: Theodore Tso <tytso@mit.edu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
boolean "does it have to be exclusive?" flag is passed instead;
Local filesystem should just ignore it - the object is guaranteed
not to be there yet.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Just the flags; only NFS cares even about that, but there are
legitimate uses for such argument. And getting rid of that
completely would require splitting ->lookup() into a couple
of methods (at least), so let's leave that alone for now...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>