Commit Graph

3179 Commits

Author SHA1 Message Date
Eric Dumazet
5beb5c90c1 rhashtable: use cond_resched()
If a hash table has 128 slots and 16384 elems, expand to 256 slots
takes more than one second. For larger sets, a soft lockup is detected.

Holding cpu for that long, even in a work queue is a show stopper
for non preemptable kernels.

cond_resched() at strategic points to allow process scheduler
to reschedule us.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 17:55:14 -05:00
Daniel Borkmann
4c4b52d9b2 rhashtable: remove indirection for grow/shrink decision functions
Currently, all real users of rhashtable default their grow and shrink
decision functions to rht_grow_above_75() and rht_shrink_below_30(),
so that there's currently no need to have this explicitly selectable.

It can/should be generic and private inside rhashtable until a real
use case pops up. Since we can make this private, we'll save us this
additional indirection layer and can improve insertion/deletion time
as well.

Reference: http://patchwork.ozlabs.org/patch/443040/
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 16:06:02 -05:00
Daniel Borkmann
8331de75cb rhashtable: unconditionally grow when max_shift is not specified
While commit c0c09bfdc4 ("rhashtable: avoid unnecessary wakeup for
worker queue") rightfully moved part of the decision making of
whether we should expand or shrink from the expand/shrink functions
themselves into insert/delete functions in order to avoid unnecessary
worker wake-ups, it however introduced a regression by doing so.

Before that change, if no max_shift was specified (= 0) on rhashtable
initialization, rhashtable_expand() would just grow unconditionally
and lets the available memory be the limiting factor. After that
change, if no max_shift was specified, there would be _no_ expansion
step at all.

Given that netlink and tipc have a max_shift specified, it was not
visible there, but Josh Hunt reported that if nft that starts out
with a default element hint of 3 if not otherwise provided, would
slow i.e. inserts down trememdously as it cannot grow larger to
relax table occupancy.

Given that the test case verifies shrinks/expands manually, we also
must remove pointer to the helper functions to explicitly avoid
parallel resizing on insertions/deletions. test_bucket_stats() and
test_rht_lookup() could also be wrapped around rhashtable mutex to
explicitly synchronize a walk from resizing, but I think that defeats
the actual test case which intended to have explicit test steps,
i.e. 1) inserts, 2) expands, 3) shrinks, 4) deletions, with object
verification after each stage.

Reported-by: Josh Hunt <johunt@akamai.com>
Fixes: c0c09bfdc4 ("rhashtable: avoid unnecessary wakeup for worker queue")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Josh Hunt <johunt@akamai.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 16:06:02 -05:00
Sasha Levin
71bb0012c3 rhashtable: initialize all rhashtable walker members
Commit f2dba9c6ff ("rhashtable: Introduce rhashtable_walk_*") forgot to
initialize the members of struct rhashtable_walker after allocating it, which
caused an undefined value for 'resize' which is used later on.

Fixes: f2dba9c6ff ("rhashtable: Introduce rhashtable_walk_*")
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23 15:23:19 -05:00
Daniel Borkmann
6dd0c1655b rhashtable: allow to unload test module
There's no good reason why to disallow unloading of the rhashtable
test case module.

Commit 9d6dbe1bba moved the code from a boot test into a stand-alone
module, but only converted the subsys_initcall() handler into a
module_init() function without a related exit handler, and thus
preventing the test module from unloading.

Fixes: 9d6dbe1bba ("rhashtable: Make selftest modular")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 17:38:10 -05:00
Daniel Borkmann
eb6d1abf1b rhashtable: better high order allocation attempts
When trying to allocate future tables via bucket_table_alloc(), it seems
overkill on large table shifts that we probe for kzalloc() unconditionally
first, as it's likely to fail.

Only probe with kzalloc() for more reasonable table sizes and use vzalloc()
either as a fallback on failure or directly in case of large table sizes.

Fixes: 7e1e77636e ("lib: Resizable, Scalable, Concurrent Hash Table")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 17:38:09 -05:00
Daniel Borkmann
342100d937 rhashtable: don't test for shrink on insert, expansion on delete
Restore pre 54c5b7d311 behaviour and only probe for expansions on inserts
and shrinks on deletes. Currently, it will happen that on initial inserts
into a sparse hash table, we may i.e. shrink it first simply because it's
not fully populated yet, only to later realize that we need to grow again.

This however is counter intuitive, e.g. an initial default size of 64
elements is already small enough, and in case an elements size hint is given
to the hash table by a user, we should avoid unnecessary expansion steps,
so a shrink is clearly unintended here.

Fixes: 54c5b7d311 ("rhashtable: introduce rhashtable_wakeup_worker helper function")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ying Xue <ying.xue@windriver.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 17:38:09 -05:00
Daniel Borkmann
b7f5e5c7f8 rhashtable: don't allocate ht structure on stack in test_rht_init
With object runtime debugging enabled, the rhashtable test suite
will rightfully throw a warning "ODEBUG: object is on stack, but
not annotated" from rhashtable_init().

This is because run_work is (correctly) being initialized via
INIT_WORK(), and not annotated by INIT_WORK_ONSTACK(). Meaning,
rhashtable_init() is okay as is, we just need to move ht e.g.,
into global scope.

It never triggered anything, since test_rhashtable is rather a
controlled environment and effectively runs to completion, so
that stack memory is not vanishing underneath us, we shouldn't
confuse any testers with it though.

Fixes: 7e1e77636e ("lib: Resizable, Scalable, Concurrent Hash Table")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 16:33:30 -05:00
Linus Torvalds
b11a278397 Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kconfig updates from Michal Marek:
 "Yann E Morin was supposed to take over kconfig maintainership, but
  this hasn't happened.  So I'm sending a few kconfig patches that I
  collected:

   - Fix for missing va_end in kconfig
   - merge_config.sh displays used if given too few arguments
   - s/boolean/bool/ in Kconfig files for consistency, with the plan to
     only support bool in the future"

* 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  kconfig: use va_end to match corresponding va_start
  merge_config.sh: Display usage if given too few arguments
  kconfig: use bool instead of boolean for type definition attributes
2015-02-19 10:36:45 -08:00
Linus Torvalds
53861af9a1 OK, this has the big virtio 1.0 implementation, as specified by OASIS.
On top of tht is the major rework of lguest, to use PCI and virtio 1.0, to
 double-check the implementation.
 
 Then comes the inevitable fixes and cleanups from that work.
 
 Thanks,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJU5B9cAAoJENkgDmzRrbjxPacP/jajliXX353JJ/g/hkZ6oDN5
 o7FhELBKiUMr7enVZYwj2BBYk5OM36nB9pQkiqHMSbjJGoS5IK70enxb4YRxSHBn
 YCLblZMNqutGS0kclZ9DDysztjAhxH7CvLM6pMZ7eHP0f3+FM/QhbxHfbG9DTBUH
 2U/nybvd3M/+YBe7ptwQdrH8aOCAD6RTIsXellfm99dNMK6K/5lqnWQ98WSXmNXq
 vyvdaAQsqqUkmxtajjcBumaCH4/SehOJJjUqojCMsR3aBkgOBWDZJURMek+KA5Dt
 X996fBsTAlvTtCUKRrmLTb2ScDH7fu+jwbWRqMYDk8zpEr3XqiLTTPV4/TiHGmi7
 Wiw3g1wIY1YbETlZyongB5MIoVyUfmDAd+bT8nBsj3KIITD84gOUQFDMl6d63c0I
 z6A9Pu/UzpJGsXZT3WoFLi6TO67QyhOseqZnhS4wBgLabjxffNM7yov9RVKUVH/n
 JHunnpUk2iTtSgscBarOBz5867dstuurnaUIspZthVBo6y6N0z+GrU+agJ8Y4DXx
 mvwzeYLhQH2208PjxPFiah/kA/gHNm1m678TbpS+CUsgmpQiJ4gTwtazDSi4TwZY
 Hs9T9GulkzpZIzEyKL3qG2TsfyDhW5Avn+GvKInAT9+Fkig4BnP3DUONBxcwGZ78
 eI3FDUWsE36NqE5ECWmz
 =ivCe
 -----END PGP SIGNATURE-----

Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull virtio updates from Rusty Russell:
 "OK, this has the big virtio 1.0 implementation, as specified by OASIS.

  On top of tht is the major rework of lguest, to use PCI and virtio
  1.0, to double-check the implementation.

  Then comes the inevitable fixes and cleanups from that work"

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (80 commits)
  virtio: don't set VIRTIO_CONFIG_S_DRIVER_OK twice.
  virtio_net: unconditionally define struct virtio_net_hdr_v1.
  tools/lguest: don't use legacy definitions for net device in example launcher.
  virtio: Don't expose legacy net features when VIRTIO_NET_NO_LEGACY defined.
  tools/lguest: use common error macros in the example launcher.
  tools/lguest: give virtqueues names for better error messages
  tools/lguest: more documentation and checking of virtio 1.0 compliance.
  lguest: don't look in console features to find emerg_wr.
  tools/lguest: don't start devices until DRIVER_OK status set.
  tools/lguest: handle indirect partway through chain.
  tools/lguest: insert driver references from the 1.0 spec (4.1 Virtio Over PCI)
  tools/lguest: insert device references from the 1.0 spec (4.1 Virtio Over PCI)
  tools/lguest: rename virtio_pci_cfg_cap field to match spec.
  tools/lguest: fix features_accepted logic in example launcher.
  tools/lguest: handle device reset correctly in example launcher.
  virtual: Documentation: simplify and generalize paravirt_ops.txt
  lguest: remove NOTIFY call and eventfd facility.
  lguest: remove NOTIFY facility from demonstration launcher.
  lguest: use the PCI console device's emerg_wr for early boot messages.
  lguest: always put console in PCI slot #1.
  ...
2015-02-18 09:24:01 -08:00
Al Viro
d879cb8341 move iov_iter.c from mm/ to lib/
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-17 22:22:17 -05:00
Linus Torvalds
50652963ea Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc VFS updates from Al Viro:
 "This cycle a lot of stuff sits on topical branches, so I'll be sending
  more or less one pull request per branch.

  This is the first pile; more to follow in a few.  In this one are
  several misc commits from early in the cycle (before I went for
  separate branches), plus the rework of mntput/dput ordering on umount,
  switching to use of fs_pin instead of convoluted games in
  namespace_unlock()"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  switch the IO-triggering parts of umount to fs_pin
  new fs_pin killing logics
  allow attaching fs_pin to a group not associated with some superblock
  get rid of the second argument of acct_kill()
  take count and rcu_head out of fs_pin
  dcache: let the dentry count go down to zero without taking d_lock
  pull bumping refcount into ->kill()
  kill pin_put()
  mode_t whack-a-mole: chelsio
  file->f_path.dentry is pinned down for as long as the file is open...
  get rid of lustre_dump_dentry()
  gut proc_register() a bit
  kill d_validate()
  ncpfs: get rid of d_validate() nonsense
  selinuxfs: don't open-code d_genocide()
2015-02-17 14:56:45 -08:00
Jan Kiszka
3ee7b3fa2c scripts/gdb: add infrastructure
This provides the basic infrastructure to load kernel-specific python
helper scripts when debugging the kernel in gdb.

The loading mechanism is based on gdb loading for <objfile>-gdb.py when
opening <objfile>.  Therefore, this places a corresponding link to the
main helper script into the output directory that contains vmlinux.

The main scripts will pull in submodules containing Linux specific gdb
commands and functions.  To avoid polluting the source directory with
compiled python modules, we link to them from the object directory.

Due to gdb.parse_and_eval and string redirection for gdb.execute, we
depend on gdb >= 7.2.

This feature is enabled via CONFIG_GDB_SCRIPTS.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Michal Marek <mmarek@suse.cz>		[kbuild stuff]
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-17 14:34:53 -08:00
Christoph Jaeger
841c009007 lib/Kconfig: use bool instead of boolean
Keyword 'boolean' for type definition attributes is considered
deprecated and, therefore, should not be used anymore.

See http://lkml.kernel.org/r/cover.1418003065.git.cj@linux.com
See http://lkml.kernel.org/r/1419108071-11607-1-git-send-email-cj@linux.com

Signed-off-by: Christoph Jaeger <cj@linux.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-16 17:56:05 -08:00
Linus Torvalds
fee5429e02 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 "Here is the crypto update for 3.20:

   - Added 192/256-bit key support to aesni GCM.
   - Added MIPS OCTEON MD5 support.
   - Fixed hwrng starvation and race conditions.
   - Added note that memzero_explicit is not a subsitute for memset.
   - Added user-space interface for crypto_rng.
   - Misc fixes"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (71 commits)
  crypto: tcrypt - do not allocate iv on stack for aead speed tests
  crypto: testmgr - limit IV copy length in aead tests
  crypto: tcrypt - fix buflen reminder calculation
  crypto: testmgr - mark rfc4106(gcm(aes)) as fips_allowed
  crypto: caam - fix resource clean-up on error path for caam_jr_init
  crypto: caam - pair irq map and dispose in the same function
  crypto: ccp - terminate ccp_support array with empty element
  crypto: caam - remove unused local variable
  crypto: caam - remove dead code
  crypto: caam - don't emit ICV check failures to dmesg
  hwrng: virtio - drop extra empty line
  crypto: replace scatterwalk_sg_next with sg_next
  crypto: atmel - Free memory in error path
  crypto: doc - remove colons in comments
  crypto: seqiv - Ensure that IV size is at least 8 bytes
  crypto: cts - Weed out non-CBC algorithms
  MAINTAINERS: add linux-crypto to hw random
  crypto: cts - Remove bogus use of seqiv
  crypto: qat - don't need qat_auth_state struct
  crypto: algif_rng - fix sparse non static symbol warning
  ...
2015-02-14 09:47:01 -08:00
Andrey Ryabinin
bebf56a1b1 kasan: enable instrumentation of global variables
This feature let us to detect accesses out of bounds of global variables.
This will work as for globals in kernel image, so for globals in modules.
Currently this won't work for symbols in user-specified sections (e.g.
__init, __read_mostly, ...)

The idea of this is simple.  Compiler increases each global variable by
redzone size and add constructors invoking __asan_register_globals()
function.  Information about global variable (address, size, size with
redzone ...) passed to __asan_register_globals() so we could poison
variable's redzone.

This patch also forces module_alloc() to return 8*PAGE_SIZE aligned
address making shadow memory handling (
kasan_module_alloc()/kasan_module_free() ) more simple.  Such alignment
guarantees that each shadow page backing modules address space correspond
to only one module_alloc() allocation.

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:42 -08:00
Andrey Ryabinin
3f15801cdc lib: add kasan test module
This is a test module doing various nasty things like out of bounds
accesses, use after free.  It is useful for testing kernel debugging
features like kernel address sanitizer.

It mostly concentrates on testing of slab allocator, but we might want to
add more different stuff here in future (like stack/global variables out
of bounds accesses and so on).

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:41 -08:00
Andrey Ryabinin
0316bec22e mm: slub: add kernel address sanitizer support for slub allocator
With this patch kasan will be able to catch bugs in memory allocated by
slub.  Initially all objects in newly allocated slab page, marked as
redzone.  Later, when allocation of slub object happens, requested by
caller number of bytes marked as accessible, and the rest of the object
(including slub's metadata) marked as redzone (inaccessible).

We also mark object as accessible if ksize was called for this object.
There is some places in kernel where ksize function is called to inquire
size of really allocated area.  Such callers could validly access whole
allocated memory, so it should be marked as accessible.

Code in slub.c and slab_common.c files could validly access to object's
metadata, so instrumentation for this files are disabled.

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Signed-off-by: Dmitry Chernenkov <dmitryc@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:41 -08:00
Andrey Ryabinin
ef7f0d6a6c x86_64: add KASan support
This patch adds arch specific code for kernel address sanitizer.

16TB of virtual addressed used for shadow memory.  It's located in range
[ffffec0000000000 - fffffc0000000000] between vmemmap and %esp fixup
stacks.

At early stage we map whole shadow region with zero page.  Latter, after
pages mapped to direct mapping address range we unmap zero pages from
corresponding shadow (see kasan_map_shadow()) and allocate and map a real
shadow memory reusing vmemmap_populate() function.

Also replace __pa with __pa_nodebug before shadow initialized.  __pa with
CONFIG_DEBUG_VIRTUAL=y make external function call (__phys_addr)
__phys_addr is instrumented, so __asan_load could be called before shadow
area initialized.

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:41 -08:00
Andrey Ryabinin
0b24becc81 kasan: add kernel address sanitizer infrastructure
Kernel Address sanitizer (KASan) is a dynamic memory error detector.  It
provides fast and comprehensive solution for finding use-after-free and
out-of-bounds bugs.

KASAN uses compile-time instrumentation for checking every memory access,
therefore GCC > v4.9.2 required.  v4.9.2 almost works, but has issues with
putting symbol aliases into the wrong section, which breaks kasan
instrumentation of globals.

This patch only adds infrastructure for kernel address sanitizer.  It's
not available for use yet.  The idea and some code was borrowed from [1].

Basic idea:

The main idea of KASAN is to use shadow memory to record whether each byte
of memory is safe to access or not, and use compiler's instrumentation to
check the shadow memory on each memory access.

Address sanitizer uses 1/8 of the memory addressable in kernel for shadow
memory and uses direct mapping with a scale and offset to translate a
memory address to its corresponding shadow address.

Here is function to translate address to corresponding shadow address:

     unsigned long kasan_mem_to_shadow(unsigned long addr)
     {
                return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
     }

where KASAN_SHADOW_SCALE_SHIFT = 3.

So for every 8 bytes there is one corresponding byte of shadow memory.
The following encoding used for each shadow byte: 0 means that all 8 bytes
of the corresponding memory region are valid for access; k (1 <= k <= 7)
means that the first k bytes are valid for access, and other (8 - k) bytes
are not; Any negative value indicates that the entire 8-bytes are
inaccessible.  Different negative values used to distinguish between
different kinds of inaccessible memory (redzones, freed memory) (see
mm/kasan/kasan.h).

To be able to detect accesses to bad memory we need a special compiler.
Such compiler inserts a specific function calls (__asan_load*(addr),
__asan_store*(addr)) before each memory access of size 1, 2, 4, 8 or 16.

These functions check whether memory region is valid to access or not by
checking corresponding shadow memory.  If access is not valid an error
printed.

Historical background of the address sanitizer from Dmitry Vyukov:

	"We've developed the set of tools, AddressSanitizer (Asan),
	ThreadSanitizer and MemorySanitizer, for user space. We actively use
	them for testing inside of Google (continuous testing, fuzzing,
	running prod services). To date the tools have found more than 10'000
	scary bugs in Chromium, Google internal codebase and various
	open-source projects (Firefox, OpenSSL, gcc, clang, ffmpeg, MySQL and
	lots of others): [2] [3] [4].
	The tools are part of both gcc and clang compilers.

	We have not yet done massive testing under the Kernel AddressSanitizer
	(it's kind of chicken and egg problem, you need it to be upstream to
	start applying it extensively). To date it has found about 50 bugs.
	Bugs that we've found in upstream kernel are listed in [5].
	We've also found ~20 bugs in out internal version of the kernel. Also
	people from Samsung and Oracle have found some.

	[...]

	As others noted, the main feature of AddressSanitizer is its
	performance due to inline compiler instrumentation and simple linear
	shadow memory. User-space Asan has ~2x slowdown on computational
	programs and ~2x memory consumption increase. Taking into account that
	kernel usually consumes only small fraction of CPU and memory when
	running real user-space programs, I would expect that kernel Asan will
	have ~10-30% slowdown and similar memory consumption increase (when we
	finish all tuning).

	I agree that Asan can well replace kmemcheck. We have plans to start
	working on Kernel MemorySanitizer that finds uses of unitialized
	memory. Asan+Msan will provide feature-parity with kmemcheck. As
	others noted, Asan will unlikely replace debug slab and pagealloc that
	can be enabled at runtime. Asan uses compiler instrumentation, so even
	if it is disabled, it still incurs visible overheads.

	Asan technology is easily portable to other architectures. Compiler
	instrumentation is fully portable. Runtime has some arch-dependent
	parts like shadow mapping and atomic operation interception. They are
	relatively easy to port."

Comparison with other debugging features:
========================================

KMEMCHECK:

  - KASan can do almost everything that kmemcheck can.  KASan uses
    compile-time instrumentation, which makes it significantly faster than
    kmemcheck.  The only advantage of kmemcheck over KASan is detection of
    uninitialized memory reads.

    Some brief performance testing showed that kasan could be
    x500-x600 times faster than kmemcheck:

$ netperf -l 30
		MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost (127.0.0.1) port 0 AF_INET
		Recv   Send    Send
		Socket Socket  Message  Elapsed
		Size   Size    Size     Time     Throughput
		bytes  bytes   bytes    secs.    10^6bits/sec

no debug:	87380  16384  16384    30.00    41624.72

kasan inline:	87380  16384  16384    30.00    12870.54

kasan outline:	87380  16384  16384    30.00    10586.39

kmemcheck: 	87380  16384  16384    30.03      20.23

  - Also kmemcheck couldn't work on several CPUs.  It always sets
    number of CPUs to 1.  KASan doesn't have such limitation.

DEBUG_PAGEALLOC:
	- KASan is slower than DEBUG_PAGEALLOC, but KASan works on sub-page
	  granularity level, so it able to find more bugs.

SLUB_DEBUG (poisoning, redzones):
	- SLUB_DEBUG has lower overhead than KASan.

	- SLUB_DEBUG in most cases are not able to detect bad reads,
	  KASan able to detect both reads and writes.

	- In some cases (e.g. redzone overwritten) SLUB_DEBUG detect
	  bugs only on allocation/freeing of object. KASan catch
	  bugs right before it will happen, so we always know exact
	  place of first bad read/write.

[1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel
[2] https://code.google.com/p/address-sanitizer/wiki/FoundBugs
[3] https://code.google.com/p/thread-sanitizer/wiki/FoundBugs
[4] https://code.google.com/p/memory-sanitizer/wiki/FoundBugs
[5] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel#Trophies

Based on work by Andrey Konovalov.

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Acked-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:40 -08:00
Tejun Heo
46385326cc bitmap, cpumask, nodemask: remove dedicated formatting functions
Now that all bitmap formatting usages have been converted to
'%*pb[l]', the separate formatting functions are unnecessary.  The
following functions are removed.

* bitmap_scn[list]printf()
* cpumask_scnprintf(), cpulist_scnprintf()
* [__]nodemask_scnprintf(), [__]nodelist_scnprintf()
* seq_bitmap[_list](), seq_cpumask[_list](), seq_nodemask[_list]()
* seq_buf_bitmask()

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:39 -08:00
Tejun Heo
4a0792b0e7 bitmap: use %*pb[l] to print bitmaps including cpumasks and nodemasks
printk and friends can now format bitmaps using '%*pb[l]'.  cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:36 -08:00
Tejun Heo
dbc760bcc1 lib/vsprintf: implement bitmap printing through '%*pb[l]'
bitmap and its derivatives such as cpumask and nodemask currently only
provide formatting functions which put the output string into the
provided buffer; however, how long this buffer should be isn't defined
anywhere and given that some of these bitmaps can be too large to be
formatted into an on-stack buffer it users sometimes are unnecessarily
forced to come up with creative solutions and compromises for the
buffer just to printk these bitmaps.

There have been a couple different attempts at making this easier.

1. Way back, PeterZ tried printk '%pb' extension with the precision
   for bit width - '%.*pb'.  This was intuitive and made sense but
   unfortunately triggered a compile warning about using precision
   for a pointer.

   http://lkml.kernel.org/g/1336577562.2527.58.camel@twins

2. I implemented bitmap_pr_cont[_list]() and its wrappers for cpumask
   and nodemask.  This works but PeterZ pointed out that pr_cont's
   tendency to produce broken lines when multiple CPUs are printing is
   bothering considering the usages.

   http://lkml.kernel.org/g/1418226774-30215-3-git-send-email-tj@kernel.org

So, this patch is another attempt at teaching printk and friends how
to print bitmaps.  It's almost identical to what PeterZ tried with
precision but it uses the field width for the number of bits instead
of precision.  The format used is '%*pb[l]', with the optional
trailing 'l' specifying list format instead of hex masks.

This is a valid format string and doesn't trigger compiler warnings;
however, it does make it impossible to specify output field width when
printing bitmaps.  I think this is an acceptable trade-off given how
much easier it makes printing bitmaps and that we don't have any
in-kernel user which is using the field width specification.  If any
future user wants to use field width with a bitmap, it'd have to
format the bitmap into a string buffer and then print that buffer with
width spec, which isn't different from how it should be done now.

This patch implements bitmap[_list]_string() which are called from the
vsprintf pointer() formatting function.  The implementation is mostly
identical to bitmap_scn[list]printf() except that the output is
performed in the vsprintf way.  These functions handle formatting into
too small buffers and sprintf() family of functions report the correct
overrun output length.

bitmap_scn[list]printf() are now thin wrappers around scnprintf().

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:36 -08:00
Jan Kara
310ee9e8f3 lib/genalloc.c: check result of devres_alloc()
devm_gen_pool_create() calls devres_alloc() and dereferences its result
without checking whether devres_alloc() succeeded.  Check for error and
bail out if it happened.

Coverity-id 1016493.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:36 -08:00
Rasmus Villemoes
8da53d4595 lib/string.c: improve strrchr()
Instead of potentially passing over the string twice in case c is not
found, just keep track of the last occurrence.  According to
bloat-o-meter, this also cuts the generated code by a third (54 vs 36
bytes).  Oh, and we get rid of those 7-space indented lines.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:36 -08:00
Daniel Borkmann
f5e38b9284 lib: crc32: constify crc32 lookup table
Commit 8f243af42a ("sections: fix const sections for crc32 table")
removed the compile-time generated crc32 tables from the RO sections,
because it conflicts with the definition of __cacheline_aligned which
puts all such aligned data into .data..cacheline_aligned section
optimized for wasting less space, and can cause alignment issues when
used in combination with const with some gcc versions like 4.7.0 due to
a gcc bug [1].

Given that most gcc versions should have the fix by now, we can just use
____cacheline_aligned, which only aligns the data but doesn't move it
into specific sections as opposed to __cacheline_aligned.  In case of
gcc versions having the mentioned bug, the alignment attribute will have
no effect, but the data will still be made RO.

After patch tables are in RO:

  $ nm -v lib/crc32.o | grep -1 -E "crc32c?table"
  0000000000000000 t arch_local_irq_enable
  0000000000000000 r crc32ctable_le
  0000000000000000 t crc32_exit
  --
  0000000000000960 t test_buf
  0000000000002000 r crc32table_be
  0000000000004000 r crc32table_le
  000000001d1056e5 A __crc_crc32_be

  [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52181

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:35 -08:00
Rasmus Villemoes
7f59065793 lib: bitmap: remove redundant code from __bitmap_shift_left
The first of these conditionals is completely redundant: If k == lim-1, we
must have off==0, so the second conditional will also trigger and then it
wouldn't matter if upper had some high bits set.  But the second
conditional is in fact also redundant, since it only serves to clear out
some high-order "don't care" bits of dst, about which no guarantee is
made.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:35 -08:00
Rasmus Villemoes
6d874eca65 lib: bitmap: eliminate branch in __bitmap_shift_left
We can shift the bits from lower and upper into place before assembling
dst[k + off]; moving the shift of lower into the branch where we already
know that rem is non-zero allows us to remove a conditional.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:35 -08:00
Rasmus Villemoes
dba94c2553 lib: bitmap: change bitmap_shift_left to take unsigned parameters
gcc can generate slightly better code for stuff like "nbits %
BITS_PER_LONG" when it knows nbits is not negative.  Since negative size
bitmaps or shift amounts don't make sense, change these parameters of
bitmap_shift_right to unsigned.

If off >= lim (which requires shift >= nbits), k is initialized with a
large positive value, but since I've let k continue to be signed, the loop
will never run and dst will be zeroed as expected.  Inside the loop, k is
guaranteed to be non-negative, so the fact that it is promoted to unsigned
in the various expressions it appears in is harmless.

Also use "shift" and "nbits" consistently for the parameter names.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:35 -08:00
Rasmus Villemoes
cfac1d080a lib: bitmap: yet another simplification in __bitmap_shift_right
If left is 0, we can just let mask be ~0UL, so that anding with it is a
no-op.  Conveniently, BITMAP_LAST_WORD_MASK provides precisely what we
need, and we can eliminate left.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:35 -08:00
Rasmus Villemoes
97fb8e940b lib: bitmap: remove redundant code from __bitmap_shift_right
If the condition k==lim-1 is true, we must have off == 0 (otherwise, k
could never become that big).  But in that case we have upper == 0 and
hence dst[k] == (src[k] & mask) >> rem.  Since mask consists of a
consecutive range of bits starting from the LSB, anding dst[k] with mask
is a no-op.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:35 -08:00
Rasmus Villemoes
9d8a6b2a02 lib: bitmap: eliminate branch in __bitmap_shift_right
We can shift the bits from lower and upper into place before assembling
dst[k]; moving the shift of upper into the branch where we already know
that rem is non-zero allows us to remove a conditional.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:35 -08:00
Rasmus Villemoes
2fbad29917 lib: bitmap: change bitmap_shift_right to take unsigned parameters
I've previously changed the nbits parameter of most bitmap_* functions to
unsigned; now it is bitmap_shift_{left,right}'s turn.  This alone saves
some .text, but while at it I found that there were a few other things one
could do.  The end result of these seven patches is

  $ scripts/bloat-o-meter /tmp/bitmap.o.{old,new}
  add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-328 (-328)
  function                                     old     new   delta
  __bitmap_shift_right                         384     226    -158
  __bitmap_shift_left                          306     136    -170

and less importantly also a smaller stack footprint

  $ stack-o-meter.pl master bitmap
  file                 function                       old  new  delta
  lib/bitmap.o         __bitmap_shift_right             24    8  -16
  lib/bitmap.o         __bitmap_shift_left              24    0  -24

For each pair of 0 <= shift <= nbits <= 256 I've tested the end result
with a few randomly filled src buffers (including garbage beyond nbits),
in each case verifying that the shift {left,right}-most bits of dst are
zero and the remaining nbits-shift bits correspond to src, so I'm fairly
confident I didn't screw up.  That hasn't stopped me from being wrong
before, though.

This patch (of 7):

gcc can generate slightly better code for stuff like "nbits %
BITS_PER_LONG" when it knows nbits is not negative.  Since negative size
bitmaps or shift amounts don't make sense, change these parameters of
bitmap_shift_right to unsigned.

The expressions involving "lim - 1" are still ok, since if lim is 0 the
loop is never executed.

Also use "shift" and "nbits" consistently for the parameter names.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:35 -08:00
Rasmus Villemoes
e8f2427832 lib/bitmap.c: elide bitmap_copy_le on little-endian
On little-endian, there's no reason to have an extra, presumably less
efficient, way of copying a bitmap.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:35 -08:00
Rasmus Villemoes
9b6c2d2e2b lib/bitmap.c: change prototype of bitmap_copy_le
Make the prototype of bitmap_copy_le the same as bitmap_copy's.  All other
bitmap_* functions take unsigned long* parameters; there's no reason this
should be special.

The only current user is the static inline uwb_mas_bm_copy_le, which
already does the void* laundering, so the end users can pass their u8 or
__le32 buffers without a cast.

Furthermore, this allows us to simply let bitmap_copy_le be an alias for
bitmap_copy on little-endian; see next patch.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:34 -08:00
Linus Torvalds
818099574b Merge branch 'akpm' (patches from Andrew)
Merge third set of updates from Andrew Morton:

 - the rest of MM

   [ This includes getting rid of the numa hinting bits, in favor of
     just generic protnone logic.  Yay.     - Linus ]

 - core kernel

 - procfs

 - some of lib/ (lots of lib/ material this time)

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (104 commits)
  lib/lcm.c: replace include
  lib/percpu_ida.c: remove redundant includes
  lib/strncpy_from_user.c: replace module.h include
  lib/stmp_device.c: replace module.h include
  lib/sort.c: move include inside #if 0
  lib/show_mem.c: remove redundant include
  lib/radix-tree.c: change to simpler include
  lib/plist.c: remove redundant include
  lib/nlattr.c: remove redundant include
  lib/kobject_uevent.c: remove redundant include
  lib/llist.c: remove redundant include
  lib/md5.c: simplify include
  lib/list_sort.c: rearrange includes
  lib/genalloc.c: remove redundant include
  lib/idr.c: remove redundant include
  lib/halfmd4.c: simplify includes
  lib/dynamic_queue_limits.c: simplify includes
  lib/sort.c: use simpler includes
  lib/interval_tree.c: simplify includes
  hexdump: make it return number of bytes placed in buffer
  ...
2015-02-12 18:54:28 -08:00
Rasmus Villemoes
6016daed58 lib/lcm.c: replace include
We don't need all the stuff kernel.h pulls in; just compiler.h since
export.h doesn't do necessary #includes.  This removes more than 100
dependencies.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:16 -08:00
Rasmus Villemoes
6918584aad lib/percpu_ida.c: remove redundant includes
These three #includes seem to be completely redundant: Removing them
yields identical objdump -d output for each of {allyes,allno,def}config,
and neither included file end up in the generated dependency file through
some recursive include.  In total, about 50 lines are eliminated from
.percpu.o.cmd.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:16 -08:00
Rasmus Villemoes
bf3c2d6d2f lib/strncpy_from_user.c: replace module.h include
strncpy_from_user.c only needs EXPORT_SYMBOL, so just include compiler.h
and export.h instead of the whole module.h machinery.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:16 -08:00
Rasmus Villemoes
b6d4f3221d lib/stmp_device.c: replace module.h include
stmp_device.c only needs EXPORT_SYMBOL, so just include compiler.h and
export.h instead of the whole module.h machinery.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:16 -08:00
Rasmus Villemoes
2ddae683bf lib/sort.c: move include inside #if 0
The sort function and its helpers don't do memory allocation, so the
slab.h include is redundant.  Move it inside the #if 0 protecting the
self-test, similar to how it is done in lib/list_sort.c.  This removes
over 450 lines from the generated dependency file.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:16 -08:00
Rasmus Villemoes
b8b6db1793 lib/show_mem.c: remove redundant include
show_mem.c doesn't use anything from nmi.h.  Removing it yields identical
objdump -d output for each of {allyes,allno,def}config and eliminates more
than 100 lines in the dependency file.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:16 -08:00
Rasmus Villemoes
886d3dfa85 lib/radix-tree.c: change to simpler include
The comment helpfully explains why hardirq.h is included, but since
commit 2d4b84739f ("hardirq: Split preempt count mask definitions")
in_interrupt() has been provided by preempt_mask.h.  Use that instead,
saving around 40 lines in the generated dependency file.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:16 -08:00
Rasmus Villemoes
7f1ce3c864 lib/plist.c: remove redundant include
Removing the include of linux/spinlock.h produces byte-identical output
for {allno,def}config, and identical objdump -d output for allyesconfig.
In the former two cases, more than a 100 lines are eliminated from the
generated dependency file.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:16 -08:00
Rasmus Villemoes
fb41f9d71c lib/nlattr.c: remove redundant include
nlattr.c doesn't seem to rely on anything from netdevice.h.  Removing it
yields identical objdump -d output for each of {allyes,allno,def}config,
and eliminates more than 200 lines from the generated dependency file.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:16 -08:00
Rasmus Villemoes
a69ae45c26 lib/kobject_uevent.c: remove redundant include
The file doesn't seem to use anything from linux/user_namespace.h, and
removing it yields byte-identical object code and strictly fewer
dependencies in the .cmd file.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:15 -08:00
Rasmus Villemoes
9b40570bd9 lib/llist.c: remove redundant include
This file doesn't seem to use anything provided by linux/interrupt.h or
anything recursively included through that.  Removing it produces
byte-identical output, while reducing .llist.o.cmd from 541 to 156 lines.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:15 -08:00
Rasmus Villemoes
9a29ae84c1 lib/md5.c: simplify include
md5.c doesn't use anything from kernel.h, except that that pulls in
compiler.h, which is needed for the export.h to work.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:15 -08:00
Rasmus Villemoes
7259fa0424 lib/list_sort.c: rearrange includes
Memory allocation only happens in the self test, just as random numbers
are only used there.  So move the inclusion of slab.h inside the
CONFIG_TEST_LIST_SORT.

We don't need module.h and all of the stuff it carries with it, so replace
with export.h and compiler.h.  Unfortunately, the ARRAY_SIZE macro from
kernel.h requires the user to ensure bug.h is also included (for
BUILD_BUG_ON_ZERO, used by __must_be_array).  We used to get that through
some maze of nested includes, but just include it explicitly.

linux/string.h is then only included implicitly through
kernel.h->printk.h->dynamic_debug.h, but only if !CONFIG_DYNAMIC_DEBUG, so
just include it explicitly (for memset).

objdump -d says the generated code is the same, and wc -l says that
lib/.list_sort.o.cmd went from 579 to 165 lines.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:15 -08:00
Rasmus Villemoes
18fa6d2e45 lib/genalloc.c: remove redundant include
Removing this include produces byte-identical output, and thus removes a
false dependency.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:15 -08:00
Rasmus Villemoes
87d1d16937 lib/idr.c: remove redundant include
idr.c doesn't seem to use anything from hardirq.h (or anything included
from that).  Removing it produces identical objdump -d output, and gives
44 fewer lines in the .idr.o.cmd dependency file.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:15 -08:00
Rasmus Villemoes
3248340d3f lib/halfmd4.c: simplify includes
We only need EXPORT_SYMBOL, so compiler.h and export.h suffice.  This
means linux/types.h is no longer implicitly included, so add an include of
uapi/linux/types.h to linux/cryptohash.h for __u32.  Other users of
cryptohash.h cannot be affected, since they must already have been
including uapi/linux/types.h in order for gcc not to complain about
unknown types.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:15 -08:00
Rasmus Villemoes
565ac23b81 lib/dynamic_queue_limits.c: simplify includes
The file doesn't use anything from ctype.h.  Instead of module.h, just use
export.h for EXPORT_SYMBOL.  The latter requires the user to include
compiler.h, so do that explicitly instead of relying on some other header
pulling it in.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:15 -08:00
Rasmus Villemoes
42cf809654 lib/sort.c: use simpler includes
sort.c doesn't use facilities from kernel.h, but does use some types
defined in linux/types.h.  Include the latter directly instead of relying
on some other header doing it.  Similarly, include linux/export.h directly
instead of through module.h.  This removes 80 lines from the dependency
file .sort.o.cmd.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:15 -08:00
Rasmus Villemoes
85c5e27c4a lib/interval_tree.c: simplify includes
The file uses nothing from init.h, and also doesn't need the full module.h
machinery; export.h is sufficient.  The latter requires the user to ensure
compiler.h is included, so do that explicitly instead of relying on some
other header pulling it in.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:15 -08:00
Andy Shevchenko
114fc1afb2 hexdump: make it return number of bytes placed in buffer
This patch makes hexdump return the number of bytes placed in the buffer
excluding trailing NUL.  In the case of overflow it returns the desired
amount of bytes to produce the entire dump.  Thus, it mimics snprintf().

This will be useful for users that would like to repeat with a bigger
buffer.

[akpm@linux-foundation.org: fix printk warning]
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:15 -08:00
Andy Shevchenko
5d909c8d54 hexdump: do a few calculations ahead
Instead of doing calculations in each case of different groupsize let's do
them beforehand.  While there, change the switch to an if-else-if
construction.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:15 -08:00
Andy Shevchenko
6f6f3fcb87 hexdump: fix ascii column for the tail of a dump
In the current implementation we have a floating ascii column in the tail
of the dump.

For example, for row size equal to 16 the ascii column as in following
table

group size \ length	8	12	16
	1		50	50	50
	2		22	32	42
	4		20	29	38
	8		19	-	36

This patch makes it the same independently of amount of bytes dumped.

The change is safe since all current users, which use ASCII part of the
dump, rely on the group size equal to 1.  The patch doesn't change
behaviour for such group size (see the table above).

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:14 -08:00
Andy Shevchenko
64d1d77a44 hexdump: introduce test suite
Test different scenarios of function calls located in lib/hexdump.c.

Currently hex_dump_to_buffer() is only tested and test data is provided
for little endian CPUs.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:14 -08:00
Toshi Kikuchi
ad3d5d2f7d lib/genalloc.c: fix the end addr check in addr_in_gen_pool()
Since chunk->end_addr is (chunk->start_addr + size - 1), the end address
to compare should be (start + size - 1).

Signed-off-by: Toshi Kikuchi <toshik@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:14 -08:00
Rasmus Villemoes
af3cd13501 lib/string.c: remove strnicmp()
Now that all in-tree users of strnicmp have been converted to
strncasecmp, the wrapper can be removed.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: David Howells <dhowells@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:14 -08:00
Rasmus Villemoes
9814ec135d lib/bitmap.c: make the bits parameter of bitmap_remap unsigned
Also, rename bits to nbits. Both changes for consistency with other
bitmap_* functions.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:14 -08:00
Rasmus Villemoes
f6a1f5db8d lib/bitmap.c: simplify bitmap_ord_to_pos
Make the return value and the ord and nbits parameters of
bitmap_ord_to_pos unsigned.

Also, simplify the implementation and as a side effect make the result
fully defined, returning nbits for ord >= weight, in analogy with what
find_{first,next}_bit does.  This is a better sentinel than the former
("unofficial") 0.  No current users are affected by this change.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:14 -08:00
Rasmus Villemoes
df1d80a9eb lib/bitmap.c: simplify bitmap_pos_to_ord
The ordinal of a set bit is simply the number of set bits before it;
counting those doesn't need to be done one bit at a time.  While at it,
update the parameters to unsigned int.

It is not completely unthinkable that gcc would see pos as compile-time
constant 0 in one of the uses of bitmap_pos_to_ord.  Since the static
inline frontend bitmap_weight doesn't handle nbits==0 correctly (it would
behave exactly as if nbits==BITS_PER_LONG), use __bitmap_weight.

Alternatively, the last line could be spelled bitmap_weight(buf, pos+1)-1,
but this is simpler.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:14 -08:00
Rasmus Villemoes
b26ad5836c lib/bitmap.c: change parameters of bitmap_fold to unsigned
Change the sz and nbits parameters of bitmap_fold to unsigned int for
consistency with other bitmap_* functions, and to save another few bytes
in the generated code.

[akpm@linux-foundation.org: fix kerneldoc]
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:14 -08:00
Rasmus Villemoes
eb56988378 lib/bitmap.c: update bitmap_onto to unsigned
Change the nbits parameter of bitmap_onto to unsigned int for consistency
with other bitmap_* functions.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:14 -08:00
Rasmus Villemoes
d1214c65c0 libstring_helpers.c:string_get_size(): return void
string_get_size() was documented to return an error, but in fact always
returned 0.  Since the output always fits in 9 bytes, just document that
and let callers do what they do now: pass a small stack buffer and ignore
the return value.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:13 -08:00
Rasmus Villemoes
84b9fbedf5 lib/string_helpers.c:string_get_size(): use 32 bit arithmetic when possible
The remainder from do_div is always a u32, and after size has been reduced
to be below 1000 (or 1024), it certainly fits in u32.  So both remainder
and sf_cap can be made u32s, the format specifiers can be simplified (%lld
wasn't the right thing to use for _unsigned_ long long anyway), and we can
replace a do_div with an ordinary 32/32 bit division.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:13 -08:00
Rasmus Villemoes
7eed8fde02 lib/string_helpers.c:string_get_size(): remove redundant prefixes
While commit 3c9f3681d0 ("[SCSI] lib: add generic helper to print
sizes rounded to the correct SI range") says that Z and Y are included
in preparation for 128 bit computers, they just waste .text currently.
If and when we get u128, string_get_size needs updating anyway (and ISO
needs to come up with four more prefixes).

Also there's no need to include and test for the NULL sentinel; once we
reach "E" size is at most 18.  [The test is also wrong; it should be
units_str[units][i+1]; if we've reached NULL we're already doomed.]

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:13 -08:00
Rasmus Villemoes
43e5b666cf lib/vsprintf.c: replace while with do-while in skip_atoi
All callers of skip_atoi have already checked for the first character
being a digit.  In this case, gcc generates simpler code for a do
while-loop.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:13 -08:00
Rasmus Villemoes
2aa2f9e21e lib/vsprintf.c: improve sanity check in vsnprintf()
On 64 bit, size may very well be huge even if bit 31 happens to be 0.
Somehow it doesn't feel right that one can pass a 5 GiB buffer but not a
3 GiB one.  So cap at INT_MAX as was probably the intention all along.
This is also the made-up value passed by sprintf and vsprintf.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:13 -08:00
Rasmus Villemoes
ffbfed03b4 lib/vsprintf.c: consume 'p' in format_decode
It seems a little simpler to consume the p from a %p specifier in
format_decode, just as it is done for the surrounding %c, %s and %% cases.

While there, delete a redundant and misplaced comment.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:13 -08:00
Linus Torvalds
5d8e7fb691 md updates for 3.20
- assorted locking changes so that access to /proc/mdstat
    and much of /sys/block/mdXX/md/* is protected by a spinlock
    rather than a mutex and will never block indefinitely.
 
  - Make an 'if' condition in RAID5 - which has been implicated
    in recent bugs - more readable.
 
  - misc minor fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIVAwUAVNwaHTnsnt1WYoG5AQLssw//TniaqtlsK6RPd4PbWdyKFBfUx6bPurtx
 ZRQvw7FcX8sFW/QtxKyJZsOhKcCc1CQEByzZzvuR5SLE8scyw/SxC8eFqMY+vSWF
 HmqJtoK3LdxY/PPC2qRYcdzpU4DzYuIu3ChkJvdXJa4mULKJhutR8nbp1k52JXN8
 U2KPoE+hScVLKCuhwkvp6h4Fg/Caa9MjSpk3uMEkGDm75Iwl175EIb/fV95pIfqd
 cXs2wJBW3TA/YQecMeHC/YmVSIjF8nobCfhFlWfWYg0ySrSh1YSPLjb707Ohs5cF
 efEGoIWfm2uKd0XmsKEDclYDTNtwqgO7A3QDuWFP5ab9cTpi0Fuj1CE9LNBCf5Hf
 a5mwggBQ2KkpGwgtBuz6MJRaVu1xtOOjyFG8qzzeDOHLKxHRvZgiUnkb5U2I+YxB
 iMmyk7ijm9PF5M+wm3AqzFmG8icrAf6ixkSZidQy0PfAIFFFph+jQvdHqiXnLOGD
 6S/xpG0avHxdC5pC4R0iRsNMNl3IESqy6qLkTOYjQ+yoZ9A+LY19nsMf9LBnwgdY
 hz9WxV+EvFVggVl19U5Bg+3z1EwPt3kNr4Se1bkPI8reHH/adacNmHWBBLQ5KHle
 WYdqcNXiTwaLje/7Qw5TKQFSGgLMAPc8ciDA7QOX/ZFoyUmWFn89YAoDdGqvvDd9
 g5mQc/Amxt8=
 =/FmP
 -----END PGP SIGNATURE-----

Merge tag 'md/3.20' of git://neil.brown.name/md

Pull md updates from Neil Brown:

 - assorted locking changes so that access to /proc/mdstat
   and much of /sys/block/mdXX/md/* is protected by a spinlock
   rather than a mutex and will never block indefinitely.

 - Make an 'if' condition in RAID5 - which has been implicated
   in recent bugs - more readable.

 - misc minor fixes

* tag 'md/3.20' of git://neil.brown.name/md: (28 commits)
  md/raid10: fix conversion from RAID0 to RAID10
  md: wakeup thread upon rdev_dec_pending()
  md: make reconfig_mutex optional for writes to md sysfs files.
  md: move mddev_lock and related to md.h
  md: use mddev->lock to protect updates to resync_{min,max}.
  md: minor cleanup in safe_delay_store.
  md: move GET_BITMAP_FILE ioctl out from mddev_lock.
  md: tidy up set_bitmap_file
  md: remove unnecessary 'buf' from get_bitmap_file.
  md: remove mddev_lock from rdev_attr_show()
  md: remove mddev_lock() from md_attr_show()
  md/raid5: use ->lock to protect accessing raid5 sysfs attributes.
  md: remove need for mddev_lock() in md_seq_show()
  md/bitmap: protect clearing of ->bitmap by mddev->lock
  md: protect ->pers changes with mddev->lock
  md: level_store: group all important changes into one place.
  md: rename ->stop to ->free
  md: split detach operation out from ->stop.
  md/linear: remove rcu protections in favour of suspend/resume
  md: make merge_bvec_fn more robust in face of personality changes.
  ...
2015-02-12 11:05:49 -08:00
Linus Torvalds
42cf0f203e Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:

 - clang assembly fixes from Ard

 - optimisations and cleanups for Aurora L2 cache support

 - efficient L2 cache support for secure monitor API on Exynos SoCs

 - debug menu cleanup from Daniel Thompson to allow better behaviour for
   multiplatform kernels

 - StrongARM SA11x0 conversion to irq domains, and pxa_timer

 - kprobes updates for older ARM CPUs

 - move probes support out of arch/arm/kernel to arch/arm/probes

 - add inline asm support for the rbit (reverse bits) instruction

 - provide an ARM mode secondary CPU entry point (for Qualcomm CPUs)

 - remove the unused ARMv3 user access code

 - add driver_override support to AMBA Primecell bus

* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (55 commits)
  ARM: 8256/1: driver coamba: add device binding path 'driver_override'
  ARM: 8301/1: qcom: Use secondary_startup_arm()
  ARM: 8302/1: Add a secondary_startup that assumes ARM mode
  ARM: 8300/1: teach __asmeq that r11 == fp and r12 == ip
  ARM: kprobes: Fix compilation error caused by superfluous '*'
  ARM: 8297/1: cache-l2x0: optimize aurora range operations
  ARM: 8296/1: cache-l2x0: clean up aurora cache handling
  ARM: 8284/1: sa1100: clear RCSR_SMR on resume
  ARM: 8283/1: sa1100: collie: clear PWER register on machine init
  ARM: 8282/1: sa1100: use handle_domain_irq
  ARM: 8281/1: sa1100: move GPIO-related IRQ code to gpio driver
  ARM: 8280/1: sa1100: switch to irq_domain_add_simple()
  ARM: 8279/1: sa1100: merge both GPIO irqdomains
  ARM: 8278/1: sa1100: split irq handling for low GPIOs
  ARM: 8291/1: replace magic number with PAGE_SHIFT macro in fixup_pv code
  ARM: 8290/1: decompressor: fix a wrong comment
  ARM: 8286/1: mm: Fix dma_contiguous_reserve comment
  ARM: 8248/1: pm: remove outdated comment
  ARM: 8274/1: Fix DEBUG_LL for multi-platform kernels (without PL01X)
  ARM: 8273/1: Seperate DEBUG_UART_PHYS from DEBUG_LL on EP93XX
  ...
2015-02-12 08:51:56 -08:00
Linus Torvalds
8cc748aa76 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security layer updates from James Morris:
 "Highlights:

   - Smack adds secmark support for Netfilter
   - /proc/keys is now mandatory if CONFIG_KEYS=y
   - TPM gets its own device class
   - Added TPM 2.0 support
   - Smack file hook rework (all Smack users should review this!)"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (64 commits)
  cipso: don't use IPCB() to locate the CIPSO IP option
  SELinux: fix error code in policydb_init()
  selinux: add security in-core xattr support for pstore and debugfs
  selinux: quiet the filesystem labeling behavior message
  selinux: Remove unused function avc_sidcmp()
  ima: /proc/keys is now mandatory
  Smack: Repair netfilter dependency
  X.509: silence asn1 compiler debug output
  X.509: shut up about included cert for silent build
  KEYS: Make /proc/keys unconditional if CONFIG_KEYS=y
  MAINTAINERS: email update
  tpm/tpm_tis: Add missing ifdef CONFIG_ACPI for pnp_acpi_device
  smack: fix possible use after frees in task_security() callers
  smack: Add missing logging in bidirectional UDS connect check
  Smack: secmark support for netfilter
  Smack: Rework file hooks
  tpm: fix format string error in tpm-chip.c
  char/tpm/tpm_crb: fix build error
  smack: Fix a bidirectional UDS connect check typo
  smack: introduce a special case for tmpfs in smack_d_instantiate()
  ...
2015-02-11 20:25:11 -08:00
Linus Torvalds
b3d6524ff7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:

 - The remaining patches for the z13 machine support: kernel build
   option for z13, the cache synonym avoidance, SMT support,
   compare-and-delay for spinloops and the CES5S crypto adapater.

 - The ftrace support for function tracing with the gcc hotpatch option.
   This touches common code Makefiles, Steven is ok with the changes.

 - The hypfs file system gets an extension to access diagnose 0x0c data
   in user space for performance analysis for Linux running under z/VM.

 - The iucv hvc console gets wildcard spport for the user id filtering.

 - The cacheinfo code is converted to use the generic infrastructure.

 - Cleanup and bug fixes.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (42 commits)
  s390/process: free vx save area when releasing tasks
  s390/hypfs: Eliminate hypfs interval
  s390/hypfs: Add diagnose 0c support
  s390/cacheinfo: don't use smp_processor_id() in preemptible context
  s390/zcrypt: fixed domain scanning problem (again)
  s390/smp: increase maximum value of NR_CPUS to 512
  s390/jump label: use different nop instruction
  s390/jump label: add sanity checks
  s390/mm: correct missing space when reporting user process faults
  s390/dasd: cleanup profiling
  s390/dasd: add locking for global_profile access
  s390/ftrace: hotpatch support for function tracing
  ftrace: let notrace function attribute disable hotpatching if necessary
  ftrace: allow architectures to specify ftrace compile options
  s390: reintroduce diag 44 calls for cpu_relax()
  s390/zcrypt: Add support for new crypto express (CEX5S) adapter.
  s390/zcrypt: Number of supported ap domains is not retrievable.
  s390/spinlock: add compare-and-delay to lock wait loops
  s390/tape: remove redundant if statement
  s390/hvc_iucv: add simple wildcard matches to the iucv allow filter
  ...
2015-02-11 17:42:32 -08:00
Linus Torvalds
c5ce28df0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) More iov_iter conversion work from Al Viro.

    [ The "crypto: switch af_alg_make_sg() to iov_iter" commit was
      wrong, and this pull actually adds an extra commit on top of the
      branch I'm pulling to fix that up, so that the pre-merge state is
      ok.   - Linus ]

 2) Various optimizations to the ipv4 forwarding information base trie
    lookup implementation.  From Alexander Duyck.

 3) Remove sock_iocb altogether, from CHristoph Hellwig.

 4) Allow congestion control algorithm selection via routing metrics.
    From Daniel Borkmann.

 5) Make ipv4 uncached route list per-cpu, from Eric Dumazet.

 6) Handle rfs hash collisions more gracefully, also from Eric Dumazet.

 7) Add xmit_more support to r8169, e1000, and e1000e drivers.  From
    Florian Westphal.

 8) Transparent Ethernet Bridging support for GRO, from Jesse Gross.

 9) Add BPF packet actions to packet scheduler, from Jiri Pirko.

10) Add support for uniqu flow IDs to openvswitch, from Joe Stringer.

11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman
    Kwok.

12) More sanely handle out-of-window dupacks, which can result in
    serious ACK storms.  From Neal Cardwell.

13) Various rhashtable bug fixes and enhancements, from Herbert Xu,
    Patrick McHardy, and Thomas Graf.

14) Support xmit_more in be2net, from Sathya Perla.

15) Group Policy extensions for vxlan, from Thomas Graf.

16) Remove Checksum Offload support for vxlan, from Tom Herbert.

17) Like ipv4, support lockless transmit over ipv6 UDP sockets.  From
    Vlad Yasevich.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits)
  crypto: fix af_alg_make_sg() conversion to iov_iter
  ipv4: Namespecify TCP PMTU mechanism
  i40e: Fix for stats init function call in Rx setup
  tcp: don't include Fast Open option in SYN-ACK on pure SYN-data
  openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set
  ipv6: Make __ipv6_select_ident static
  ipv6: Fix fragment id assignment on LE arches.
  bridge: Fix inability to add non-vlan fdb entry
  net: Mellanox: Delete unnecessary checks before the function call "vunmap"
  cxgb4: Add support in cxgb4 to get expansion rom version via ethtool
  ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version
  net: dsa: Remove redundant phy_attach()
  IB/mlx4: Reset flow support for IB kernel ULPs
  IB/mlx4: Always use the correct port for mirrored multicast attachments
  net/bonding: Fix potential bad memory access during bonding events
  tipc: remove tipc_snprintf
  tipc: nl compat add noop and remove legacy nl framework
  tipc: convert legacy nl stats show to nl compat
  tipc: convert legacy nl net id get to nl compat
  tipc: convert legacy nl net id set to nl compat
  ...
2015-02-10 20:01:30 -08:00
Linus Torvalds
29afc4e9a4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree changes from Jiri Kosina:
 "Patches from trivial.git that keep the world turning around.

  Mostly documentation and comment fixes, and a two corner-case code
  fixes from Alan Cox"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  kexec, Kconfig: spell "architecture" properly
  mm: fix cleancache debugfs directory path
  blackfin: mach-common: ints-priority: remove unused function
  doubletalk: probe failure causes OOPS
  ARM: cache-l2x0.c: Make it clear that cache-l2x0 handles L310 cache controller
  msdos_fs.h: fix 'fields' in comment
  scsi: aic7xxx: fix comment
  ARM: l2c: fix comment
  ibmraid: fix writeable attribute with no store method
  dynamic_debug: fix comment
  doc: usbmon: fix spelling s/unpriviledged/unprivileged/
  x86: init_mem_mapping(): use capital BIOS in comment
2015-02-10 18:57:15 -08:00
Linus Torvalds
23e8fe2e16 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main RCU changes in this cycle are:

   - Documentation updates.

   - Miscellaneous fixes.

   - Preemptible-RCU fixes, including fixing an old bug in the
     interaction of RCU priority boosting and CPU hotplug.

   - SRCU updates.

   - RCU CPU stall-warning updates.

   - RCU torture-test updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  rcu: Initialize tiny RCU stall-warning timeouts at boot
  rcu: Fix RCU CPU stall detection in tiny implementation
  rcu: Add GP-kthread-starvation checks to CPU stall warnings
  rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors
  rcu: Optionally run grace-period kthreads at real-time priority
  ksoftirqd: Use new cond_resched_rcu_qs() function
  ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
  rcutorture: Add more diagnostics in rcu_barrier() test failure case
  torture: Flag console.log file to prevent holdovers from earlier runs
  torture: Add "-enable-kvm -soundhw pcspk" to qemu command line
  rcutorture: Handle different mpstat versions
  rcutorture: Check from beginning to end of grace period
  rcu: Remove redundant rcu_batches_completed() declaration
  rcutorture: Drop rcu_torture_completed() and friends
  rcu: Provide rcu_batches_completed_sched() for TINY_RCU
  rcutorture: Use unsigned for Reader Batch computations
  rcutorture: Make build-output parsing correctly flag RCU's warnings
  rcu: Make _batches_completed() functions return unsigned long
  rcutorture: Issue warnings on close calls due to Reader Batch blows
  documentation: Fix smp typo in memory-barriers.txt
  ...
2015-02-09 14:28:42 -08:00
Stephen Rothwell
61d7b09773 rhashtable: using ERR_PTR requires linux/err.h
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-08 21:52:24 -08:00
Thomas Graf
020219a69d rhashtable: Fix remove logic to avoid cross references between buckets
The remove logic properly searched the remaining chain for a matching
entry with an identical hash but it did this while searching from both
the old and new table. Instead in order to not leave stale references
behind we need to:

 1. When growing and searching from the new table:
    Search remaining chain for entry with same hash to avoid having
    the new table directly point to a entry with a different hash.

 2. When shrinking and searching from the old table:
    Check if the element after the removed would create a cross
    reference and avoid it if so.

These bugs were present from the beginning in nft_hash.

Also, both insert functions calculated the hash based on the mask of
the new table. This worked while growing. Wwhile shrinking, the mask
of the inew table is smaller than the mask of the old table. This lead
to a bit not being taken into account when selecting the bucket lock
and thus caused the wrong bucket to be locked eventually.

Fixes: 7e1e77636e ("lib: Resizable, Scalable, Concurrent Hash Table")
Fixes: 97defe1ecf ("rhashtable: Per bucket locks & deferred expansion/shrinking")
Reported-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-06 15:19:17 -08:00
Thomas Graf
cf52d52f9c rhashtable: Avoid bucket cross reference after removal
During a resize, when two buckets in the larger table map to
a single bucket in the smaller table and the new table has already
been (partially) linked to the old table. Removal of an element
may result the bucket in the larger table to point to entries
which all hash to a different value than the bucket index. Thus
causing two buckets to point to the same sub chain after unzipping.
This is not illegal *during* the resize phase but after it has
completed.

Keep the old table around until all of the unzipping is done to
allow the removal code to only search for matching hashed entries
during this special period.

Reported-by: Ying Xue <ying.xue@windriver.com>
Fixes: 97defe1ecf ("rhashtable: Per bucket locks & deferred expansion/shrinking")
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-06 15:18:35 -08:00
Thomas Graf
7cd10db8de rhashtable: Add more lock verification
Catch hash miscalculations which result in hard to track down race
conditions.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-06 15:18:34 -08:00
Thomas Graf
a03eaec0df rhashtable: Dump bucket tables on locking violation under PROVE_LOCKING
This simplifies debugging of locking violations if compiled with
CONFIG_PROVE_LOCKING.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-06 15:18:34 -08:00
Thomas Graf
2af4b52988 rhashtable: Wait for RCU readers after final unzip work
We need to wait for all RCU readers to complete after the last bit of
unzipping has been completed. Otherwise the old table is freed up
prematurely.

Fixes: 7e1e77636e ("lib: Resizable, Scalable, Concurrent Hash Table")
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-06 15:18:34 -08:00
Thomas Graf
a5ec68e3b8 rhashtable: Use a single bucket lock for sibling buckets
rhashtable currently allows to use a bucket lock per bucket. This
requires multiple levels of complicated nested locking because when
resizing, a single bucket of the smaller table will map to two
buckets in the larger table. So far rhashtable has explicitly locked
both buckets in the larger table.

By excluding the highest bit of the hash from the bucket lock map and
thus only allowing locks to buckets in a ratio of 1:2, the locking
can be simplified a lot without losing the benefits of multiple locks.
Larger tables which benefit from multiple locks will not have a single
lock per bucket anyway.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-06 15:18:34 -08:00
Thomas Graf
c88455ce50 rhashtable: key_hashfn() must return full hash value
The value computed by key_hashfn() is used by rhashtable_lookup_compare()
to traverse both tables during a resize. key_hashfn() must therefore
return the hash value without the buckets mask applied so it can be
masked to the size of each individual table.

Fixes: 97defe1ecf ("rhashtable: Per bucket locks & deferred expansion/shrinking")
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-06 15:18:34 -08:00
David S. Miller
6e03f896b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/vxlan.c
	drivers/vhost/net.c
	include/linux/if_vlan.h
	net/core/dev.c

The net/core/dev.c conflict was the overlap of one commit marking an
existing function static whilst another was adding a new function.

In the include/linux/if_vlan.h case, the type used for a local
variable was changed in 'net', whereas the function got rewritten
to fix a stacked vlan bug in 'net-next'.

In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next'
overlapped with an endainness fix for VHOST 1.0 in 'net'.

In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter
in 'net-next' whereas in 'net' there was a bug fix to pass in the
correct network namespace pointer in calls to this function.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05 14:33:28 -08:00
David S. Miller
f2683b743f Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
More iov_iter work from Al Viro.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 20:46:55 -08:00
Herbert Xu
f2dba9c6ff rhashtable: Introduce rhashtable_walk_*
Some existing rhashtable users get too intimate with it by walking
the buckets directly.  This prevents us from easily changing the
internals of rhashtable.

This patch adds the helpers rhashtable_walk_init/exit/start/next/stop
which will replace these custom walkers.

They are meant to be usable for both procfs seq_file walks as well
as walking by a netlink dump.  The iterator structure should fit
inside a netlink dump cb structure, with at least one element to
spare.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 20:34:52 -08:00
Herbert Xu
28134a53d6 rhashtable: Fix potential crash on destroy in rhashtable_shrink
The current being_destroyed check in rhashtable_expand is not
enough since if we start a shrinking process after freeing all
elements in the table that's also going to crash.

This patch adds a being_destroyed check to the deferred worker
thread so that we bail out as soon as we take the lock.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 20:34:52 -08:00
Al Viro
57dd8a0735 vhost: vhost_scsi_handle_vq() should just use copy_from_user()
it has just verified that it asks no more than the length of the
first segment of iovec.

And with that the last user of stuff in lib/iovec.c is gone.
RIP.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:16 -05:00
Al Viro
ba7438aed9 vhost: don't bother copying iovecs in handle_rx(), kill memcpy_toiovecend()
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:16 -05:00
Al Viro
aad9a1cec7 vhost: switch vhost get_indirect() to iov_iter, kill memcpy_fromiovec()
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:15 -05:00
Jan Beulich
75aaf4c3e6 x86/raid6: correctly check for assembler capabilities
Just like for AVX2 (which simply needs an #if -> #ifdef conversion),
SSSE3 assembler support should be checked for before using it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-02-04 08:35:51 +11:00
Geert Uytterhoeven
9d6dbe1bba rhashtable: Make selftest modular
Allow the selftest on the resizable hash table to be built modular, just
like all other tests that do not depend on DEBUG_KERNEL.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-30 18:06:33 -08:00
karl beldan
9ce357795e lib/checksum.c: fix build for generic csum_tcpudp_nofold
Fixed commit added from64to32 under _#ifndef do_csum_ but used it
under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's
robot reported TILEGX's). Move from64to32 under the latter.

Fixes: 150ae0e946 ("lib/checksum.c: fix carry in csum_tcpudp_nofold")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-29 11:57:38 -08:00
Heiko Carstens
c0a80c0c27 ftrace: allow architectures to specify ftrace compile options
If the kernel is compiled with function tracer support the -pg compile option
is passed to gcc to generate extra code into the prologue of each function.

This patch replaces the "open-coded" -pg compile flag with a CC_FLAGS_FTRACE
makefile variable which architectures can override if a different option
should be used for code generation.

Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-01-29 09:19:19 +01:00
karl beldan
150ae0e946 lib/checksum.c: fix carry in csum_tcpudp_nofold
The carry from the 64->32bits folding was dropped, e.g with:
saddr=0xFFFFFFFF daddr=0xFF0000FF len=0xFFFF proto=0 sum=1,
csum_tcpudp_nofold returned 0 instead of 1.

Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-28 22:32:33 -08:00
Thomas Graf
fe6a043c53 rhashtable: rhashtable_remove() must unlink in both tbl and future_tbl
As removals can occur during resizes, entries may be referred to from
both tbl and future_tbl when the removal is requested. Therefore
rhashtable_remove() must unlink the entry in both tables if this is
the case. The existing code did search both tables but stopped when it
hit the first match.

Failing to unlink in both tables resulted in use after free.

Fixes: 97defe1ecf ("rhashtable: Per bucket locks & deferred expansion/shrinking")
Reported-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-26 11:56:34 -08:00