To help track down where memory leaks may be, add the ability to turn
on/off printing allocations, frees and delayed frees.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
To allow developers to run a subset of tests, build separate multiorder
and idr-test binaries which will run just the tests in those files.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
We can use the root entry as a bitmap and save allocating a 128 byte
bitmap for an IDA that contains only a few entries (30 on a 32-bit
machine, 62 on a 64-bit machine). This costs about 300 bytes of kernel
text on x86-64, so as long as 3 IDAs fall into this category, this
is a net win for memory consumption.
Thanks to Rasmus Villemoes for his work documenting the problem and
collecting statistics on IDAs.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
When we preload the IDA, we allocate an IDA bitmap. Instead of storing
that preallocated bitmap in the IDA, we store it in a percpu variable.
Generally there are more IDAs in the system than CPUs, so this cuts down
on the number of preallocated bitmaps that are unused, and about half
of the IDA users did not call ida_destroy() so they were leaking IDA
bitmaps.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
The IDR is very similar to the radix tree. It has some functionality that
the radix tree did not have (alloc next free, cyclic allocation, a
callback-based for_each, destroy tree), which is readily implementable on
top of the radix tree. A few small changes were needed in order to use a
tag to represent nodes with free space below them. More extensive
changes were needed to support storing NULL as a valid entry in an IDR.
Plain radix trees still interpret NULL as a not-present entry.
The IDA is reimplemented as a client of the newly enhanced radix tree. As
in the current implementation, it uses a bitmap at the last level of the
tree.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
radix-tree.c doesn't use these CONFIG options any more.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
Instead of specifying how to build find_bit.o from lib/find_bit.o,
use vpath to tell make where to find find_bit.c.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
Many of the definitions in the radix-tree kernel.h are redundant with
others in tools/include, or are no longer used, such as panic().
Move the definition of __init to init.h and in_interrupt() to preempt.h
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
The radix tree hasn't used a mempool since the beginning of git history.
Remove the userspace mempool implementation.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
Changing tools/include/asm/bug.h showed a missing dependency in the
Makefile.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Rehas Sachdeva <aquannie@gmail.com>
The definition of WARN_ON being used by the radix tree test suite was
deficient in two ways: it did not provide a return value, and it stopped
execution instead of continuing. This version of WARN_ON tells you
which file & line the assertion was triggered in.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
By adding __set_bit and __clear_bit to the tools include directory, we
can share the bitops code. This reveals an include loop between kernel.h,
log2.h, bitmap.h and bitops.h. Break it the same way as the kernel does;
by moving the kernel.h include from bitops.h to bitmap.h.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Pull networking fixes from David Miller:
1) GTP fixes from Andreas Schultz (missing genl module alias, clear IP
DF on transmit).
2) Netfilter needs to reflect the fwmark when sending resets, from Pau
Espin Pedrol.
3) nftable dump OOPS fix from Liping Zhang.
4) Fix erroneous setting of VIRTIO_NET_HDR_F_DATA_VALID on transmit,
from Rolf Neugebauer.
5) Fix build error of ipt_CLUSTERIP when procfs is disabled, from Arnd
Bergmann.
6) Fix regression in handling of NETIF_F_SG in harmonize_features(),
from Eric Dumazet.
7) Fix RTNL deadlock wrt. lwtunnel module loading, from David Ahern.
8) tcp_fastopen_create_child() needs to setup tp->max_window, from
Alexey Kodanev.
9) Missing kmemdup() failure check in ipv6 segment routing code, from
Eric Dumazet.
10) Don't execute unix_bind() under the bindlock, otherwise we deadlock
with splice. From WANG Cong.
11) ip6_tnl_parse_tlv_enc_lim() potentially reallocates the skb buffer,
therefore callers must reload cached header pointers into that skb.
Fix from Eric Dumazet.
12) Fix various bugs in legacy IRQ fallback handling in alx driver, from
Tobias Regnery.
13) Do not allow lwtunnel drivers to be unloaded while they are
referenced by active instances, from Robert Shearman.
14) Fix truncated PHY LED trigger names, from Geert Uytterhoeven.
15) Fix a few regressions from virtio_net XDP support, from John
Fastabend and Jakub Kicinski.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (102 commits)
ISDN: eicon: silence misleading array-bounds warning
net: phy: micrel: add support for KSZ8795
gtp: fix cross netns recv on gtp socket
gtp: clear DF bit on GTP packet tx
gtp: add genl family modules alias
tcp: don't annotate mark on control socket from tcp_v6_send_response()
ravb: unmap descriptors when freeing rings
virtio_net: reject XDP programs using header adjustment
virtio_net: use dev_kfree_skb for small buffer XDP receive
r8152: check rx after napi is enabled
r8152: re-schedule napi for tx
r8152: avoid start_xmit to schedule napi when napi is disabled
r8152: avoid start_xmit to call napi_schedule during autosuspend
net: dsa: Bring back device detaching in dsa_slave_suspend()
net: phy: leds: Fix truncated LED trigger names
net: phy: leds: Break dependency of phy.h on phy_led_triggers.h
net: phy: leds: Clear phy_num_led_triggers on failure to avoid crash
net-next: ethernet: mediatek: change the compatible string
Documentation: devicetree: change the mediatek ethernet compatible string
bnxt_en: Fix RTNL lock usage on bnxt_get_port_module_status().
...
Test uses PMC2 to count the event. But PMC1 is being initialized.
Patch to fix it.
Fixes: 3752e453f6 ('selftests/powerpc: Add tests of PMU EBBs')
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
test_lru_sanity5() fails when the number of online cpus
is fewer than the number of possible cpus. It can be
reproduced with qemu by using cmd args "--smp cpus=2,maxcpus=8".
The problem is the loop in test_lru_sanity5() is testing
'i' which is incorrect.
This patch:
1. Make sched_next_online() always return -1 if it cannot
find a next cpu to schedule the process.
2. In test_lru_sanity5(), the parent process does
sched_setaffinity() first (through sched_next_online())
and the forked process will inherit it according to
the 'man sched_setaffinity'.
Fixes: 5db58faf98 ("bpf: Add tests for the LRU bpf_htab")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix spelling mistake in print test pass message.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Nothing in this minimal script seems to require bash. We often run these
tests on embedded devices where the only shell available is the busybox
ash. Use sh instead.
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Cc: stable@vger.kernel.org
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Nothing in this minimal script seems to require bash. We often run these
tests on embedded devices where the only shell available is the busybox
ash. Use sh instead.
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Cc: stable@vger.kernel.org
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Nothing in this minimal script seems to require bash. We often run these
tests on embedded devices where the only shell available is the busybox
ash. Use sh instead.
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Cc: stable@vger.kernel.org
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
* Dynamic label support: To date namespace label support has been
limited to disambiguating cases where PMEM (direct load/store) and BLK
(mmio aperture) accessed-capacity alias on the same DIMM. Since 4.9 added
support for multiple namespaces per PMEM-region there is value to
support namespace labels even in the non-aliasing case. The presence of
a valid namespace index block force-enables label support when the
kernel would otherwise rely on region boundaries, and permits the region
to be sub-divided.
* Handle media errors in namespace metadata: Complement the error
handling for media errors in namespace data areas with support for
clearing errors on writes, and downgrading potential machine-check
exceptions to simple i/o errors on read.
* Device-DAX region attributes: Add 'align', 'id', and 'size' as
attributes for device-dax regions. In particular this enables userspace
tooling to generically size memory mapping and i/o operations. Prevent
userspace from growing assumptions / dependencies about the parent
device topology for a dax region. A libnvdimm namespace may not always
be the parent device of a dax region.
* Various cleanups and small fixes.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYVxRcAAoJEB7SkWpmfYgCSnoQAKNs7IJRtGqKFCkMB5VDAW79
Lifi35Hqm+mfFaLMIyRoKJMnhXKQBsthfcHZIy5t63kDWEV/tJ+riZ+m2Ibz4GPE
AUn07jxge2q9v0RwMylqpZ/EHyaK/xD1z0+SdIwVF2VaEDoVdnmu2WEqjuir/lfM
t70B/YoWLg6CcHeTzaV5vdxlRKyG4pzOV3UzPlYLROr3wFJBHD88j/4zcGy3F1ue
DpzGThhtEQXZUZCJLp54E0ch7V2cz/DNUDYwsjbCZrdYYmUJFqjZQxNyftn28aEJ
HI/BERXLhm2HF2qRqC+0C1lJmUylYk4wXUGco+b/u1GYN59eFe/JT3dJgxEHFKL2
nVOUmR8XsRE6gkFWQGcFesRjjI1Rqz2Wgql7oqkSL9grGOwY4rvKX0LhJxgvfuZ0
Itb5h0dMFUriAP0DaaJFMYfANIOx+mqr4RxDMwP5ZADku6+Ga0LXxvcBt06K/mYO
/zLo27MhsuuFy/odXm1A21OjFiH2pM3pSCaytKvDjy7DwzgE7ZzXmixXQ7j8YVp6
OtLYzMjIt8nt4xh0hmav5tb0r9l6mgNqlifacMC9lEM/7SDAiBXXBLuQngJN/j0s
iXllBk0pYVWibf3VcD6oY+qKdmBWvXxPjgj1lHE6j9A7Gw2jwIt/W2NWzV94nKmC
f6d+faHiRokqyVhdgfx3
=YWfP
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"The libnvdimm pull request is relatively small this time around due to
some development topics being deferred to 4.11.
As for this pull request the bulk of it has been in -next for several
releases leading to one late fix being added (commit 868f036fee
("libnvdimm: fix mishandled nvdimm_clear_poison() return value")). It
has received a build success notification from the 0day-kbuild robot
and passes the latest libnvdimm unit tests.
Summary:
- Dynamic label support: To date namespace label support has been
limited to disambiguating cases where PMEM (direct load/store) and
BLK (mmio aperture) accessed-capacity alias on the same DIMM. Since
4.9 added support for multiple namespaces per PMEM-region there is
value to support namespace labels even in the non-aliasing case.
The presence of a valid namespace index block force-enables label
support when the kernel would otherwise rely on region boundaries,
and permits the region to be sub-divided.
- Handle media errors in namespace metadata: Complement the error
handling for media errors in namespace data areas with support for
clearing errors on writes, and downgrading potential machine-check
exceptions to simple i/o errors on read.
- Device-DAX region attributes: Add 'align', 'id', and 'size' as
attributes for device-dax regions. In particular this enables
userspace tooling to generically size memory mapping and i/o
operations. Prevent userspace from growing assumptions /
dependencies about the parent device topology for a dax region. A
libnvdimm namespace may not always be the parent device of a dax
region.
- Various cleanups and small fixes"
* tag 'libnvdimm-for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dax: add region 'id', 'size', and 'align' attributes
libnvdimm: fix mishandled nvdimm_clear_poison() return value
libnvdimm: replace mutex_is_locked() warnings with lockdep_assert_held
libnvdimm, pfn: fix align attribute
libnvdimm, e820: use module_platform_driver
libnvdimm, namespace: use octal for permissions
libnvdimm, namespace: avoid multiple sector calculations
libnvdimm: remove else after return in nsio_rw_bytes()
libnvdimm, namespace: fix the type of name variable
libnvdimm: use consistent naming for request_mem_region()
nvdimm: use the right length of "pmem"
libnvdimm: check and clear poison before writing to pmem
tools/testing/nvdimm: dynamic label support
libnvdimm: allow a platform to force enable label support
libnvdimm: use generic iostat interfaces
Pull networking fixes and cleanups from David Miller:
1) Revert bogus nla_ok() change, from Alexey Dobriyan.
2) Various bpf validator fixes from Daniel Borkmann.
3) Add some necessary SET_NETDEV_DEV() calls to hsis_femac and hip04
drivers, from Dongpo Li.
4) Several ethtool ksettings conversions from Philippe Reynes.
5) Fix bugs in inet port management wrt. soreuseport, from Tom Herbert.
6) XDP support for virtio_net, from John Fastabend.
7) Fix NAT handling within a vrf, from David Ahern.
8) Endianness fixes in dpaa_eth driver, from Claudiu Manoil
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (63 commits)
net: mv643xx_eth: fix build failure
isdn: Constify some function parameters
mlxsw: spectrum: Mark split ports as such
cgroup: Fix CGROUP_BPF config
qed: fix old-style function definition
net: ipv6: check route protocol when deleting routes
r6040: move spinlock in r6040_close as SOFTIRQ-unsafe lock order detected
irda: w83977af_ir: cleanup an indent issue
net: sfc: use new api ethtool_{get|set}_link_ksettings
net: davicom: dm9000: use new api ethtool_{get|set}_link_ksettings
net: cirrus: ep93xx: use new api ethtool_{get|set}_link_ksettings
net: chelsio: cxgb3: use new api ethtool_{get|set}_link_ksettings
net: chelsio: cxgb2: use new api ethtool_{get|set}_link_ksettings
bpf: fix mark_reg_unknown_value for spilled regs on map value marking
bpf: fix overflow in prog accounting
bpf: dynamically allocate digest scratch buffer
gtp: Fix initialization of Flags octet in GTPv1 header
gtp: gtp_check_src_ms_ipv4() always return success
net/x25: use designated initializers
isdn: use designated initializers
...
Running ./test_verifier as unprivileged lets 1 out of 98 tests fail:
[...]
#71 unpriv: check that printk is disallowed FAIL
Unexpected error message!
0: (7a) *(u64 *)(r10 -8) = 0
1: (bf) r1 = r10
2: (07) r1 += -8
3: (b7) r2 = 8
4: (bf) r3 = r1
5: (85) call bpf_trace_printk#6
unknown func bpf_trace_printk#6
[...]
The test case is correct, just that the error outcome changed with
ebb676daa1 ("bpf: Print function name in addition to function id").
Same as with e00c7b216f ("bpf: fix multiple issues in selftest suite
and samples") issue 2), so just fix up the function name.
Fixes: ebb676daa1 ("bpf: Print function name in addition to function id")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 57a09bf0a4 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL
registers") introduced a regression where existing programs stopped
loading due to reaching the verifier's maximum complexity limit,
whereas prior to this commit they were loading just fine; the affected
program has roughly 2k instructions.
What was found is that state pruning couldn't be performed effectively
anymore due to mismatches of the verifier's register state, in particular
in the id tracking. It doesn't mean that 57a09bf0a4 is incorrect per
se, but rather that verifier needs to perform a lot more work for the
same program with regards to involved map lookups.
Since commit 57a09bf0a4 is only about tracking registers with type
PTR_TO_MAP_VALUE_OR_NULL, the id is only needed to follow registers
until they are promoted through pattern matching with a NULL check to
either PTR_TO_MAP_VALUE or UNKNOWN_VALUE type. After that point, the
id becomes irrelevant for the transitioned types.
For UNKNOWN_VALUE, id is already reset to 0 via mark_reg_unknown_value(),
but not so for PTR_TO_MAP_VALUE where id is becoming stale. It's even
transferred further into other types that don't make use of it. Among
others, one example is where UNKNOWN_VALUE is set on function call
return with RET_INTEGER return type.
states_equal() will then fall through the memcmp() on register state;
note that the second memcmp() uses offsetofend(), so the id is part of
that since d2a4dd37f6 ("bpf: fix state equivalence"). But the bisect
pointed already to 57a09bf0a4, where we really reach beyond complexity
limit. What I found was that states_equal() often failed in this
case due to id mismatches in spilled regs with registers in type
PTR_TO_MAP_VALUE. Unlike non-spilled regs, spilled regs just perform
a memcmp() on their reg state and don't have any other optimizations
in place, therefore also id was relevant in this case for making a
pruning decision.
We can safely reset id to 0 as well when converting to PTR_TO_MAP_VALUE.
For the affected program, it resulted in a ~17 fold reduction of
complexity and let the program load fine again. Selftest suite also
runs fine. The only other place where env->id_gen is used currently is
through direct packet access, but for these cases id is long living, thus
a different scenario.
Also, the current logic in mark_map_regs() is not fully correct when
marking NULL branch with UNKNOWN_VALUE. We need to cache the destination
reg's id in any case. Otherwise, once we marked that reg as UNKNOWN_VALUE,
it's id is reset and any subsequent registers that hold the original id
and are of type PTR_TO_MAP_VALUE_OR_NULL won't be marked UNKNOWN_VALUE
anymore, since mark_map_reg() reuses the uncached regs[regno].id that
was just overridden. Note, we don't need to cache it outside of
mark_map_regs(), since it's called once on this_branch and the other
time on other_branch, which are both two independent verifier states.
A test case for this is added here, too.
Fixes: 57a09bf0a4 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Highlights include:
- Support for the kexec_file_load() syscall, which is a prereq for secure and
trusted boot.
- Prevent kernel execution of userspace on P9 Radix (similar to SMEP/PXN).
- Sort the exception tables at build time, to save time at boot, and store
them as relative offsets to save space in the kernel image & memory.
- Allow building the kernel with thin archives, which should allow us to build
an allyesconfig once some other fixes land.
- Build fixes to allow us to correctly rebuild when changing the kernel endian
from big to little or vice versa.
- Plumbing so that we can avoid doing a full mm TLB flush on P9 Radix.
- Initial stack protector support (-fstack-protector).
- Support for dumping the radix (aka. Linux) and hash page tables via debugfs.
- Fix an oops in cxl coredump generation when cxl_get_fd() is used.
- Freescale updates from Scott: "Highlights include 8xx hugepage support,
qbman fixes/cleanup, device tree updates, and some misc cleanup."
- Many and varied fixes and minor enhancements as always.
Thanks to:
Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual,
Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz, Christophe Jaillet,
Christophe Leroy, Denis Kirjanov, Elimar Riesebieter, Frederic Barrat,
Gautham R. Shenoy, Geliang Tang, Geoff Levand, Jack Miller, Johan Hovold,
Lars-Peter Clausen, Libin, Madhavan Srinivasan, Michael Neuling, Nathan
Fontenot, Naveen N. Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin,
Rashmica Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYU4YSAAoJEFHr6jzI4aWAC4gQALtIAqqPon0Cd5b/FVVcMbW7
mMqB2b/0FGEl5GoRTzGUDaQqElilm6AEVfHO86C7DFji/a6olneFfw87iz+mtWuZ
JvrNq68ZiSnoeszdUy4MgtXFLb5sTzNMev4skaHfjI9E5CepWBoR0zH4G+kNVnd5
WSgudv8Cq4Px+MEuTOigt3QYjHzZ3cw/XNOOm9c+oGj+PDW4O9UItVI+S1WLoey4
rAB2nRcLMDPuwfRQC9XsF3zEbkv4h1dEXo/EBRuRpcF+0lLTzFw1lv1WE8OxlUmS
kAXbty3dIytBfSbtJT0c0Ps6sfQ4HFhu6ZV2fjnxNTz2KDkBIN7LBYHmBYiqY9oZ
9zvbUWtfiTu5ocfRtTq7rC/Hcj4Kbr9S9F/FvXR0WyDsKgu4xxAovqC3gcn6YjYK
Rr1tcCI4nUzyhVJVmd+OEhUvc5JbFy9aGage+YeOyejfvvSbXIunaxWlPjoDkvim
Vjl+UKU8gw51XFssqY5ZBi/HNlMFKYedLpMFp/fItnLglhj50V0eFWkpDgdSCYom
vo9ifPLZx8n8m8De3H7TV4E0F4gCHcTeqZdu7tW9AAUVM6iLJcDLm3asGmtNh21t
snOHNOJ5QSIno6ezUUg29T6VBjbPh46fdJJSlIZrEe8OzLZ1haGyttf0tD00PQvY
Z2W/m3gxafnOeGgBqvyv
=xOzf
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights include:
- Support for the kexec_file_load() syscall, which is a prereq for
secure and trusted boot.
- Prevent kernel execution of userspace on P9 Radix (similar to
SMEP/PXN).
- Sort the exception tables at build time, to save time at boot, and
store them as relative offsets to save space in the kernel image &
memory.
- Allow building the kernel with thin archives, which should allow us
to build an allyesconfig once some other fixes land.
- Build fixes to allow us to correctly rebuild when changing the
kernel endian from big to little or vice versa.
- Plumbing so that we can avoid doing a full mm TLB flush on P9
Radix.
- Initial stack protector support (-fstack-protector).
- Support for dumping the radix (aka. Linux) and hash page tables via
debugfs.
- Fix an oops in cxl coredump generation when cxl_get_fd() is used.
- Freescale updates from Scott: "Highlights include 8xx hugepage
support, qbman fixes/cleanup, device tree updates, and some misc
cleanup."
- Many and varied fixes and minor enhancements as always.
Thanks to:
Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman
Khandual, Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz,
Christophe Jaillet, Christophe Leroy, Denis Kirjanov, Elimar
Riesebieter, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff
Levand, Jack Miller, Johan Hovold, Lars-Peter Clausen, Libin,
Madhavan Srinivasan, Michael Neuling, Nathan Fontenot, Naveen N.
Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin, Rashmica
Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain"
[ And thanks to Michael, who took time off from a new baby to get this
pull request done. - Linus ]
* tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (174 commits)
powerpc/fsl/dts: add FMan node for t1042d4rdb
powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb
powerpc/fsl/dts: add QMan and BMan nodes on t1024
powerpc/fsl/dts: add QMan and BMan nodes on t1023
soc/fsl/qman: test: use DEFINE_SPINLOCK()
powerpc/fsl-lbc: use DEFINE_SPINLOCK()
powerpc/8xx: Implement support of hugepages
powerpc: get hugetlbpage handling more generic
powerpc: port 64 bits pgtable_cache to 32 bits
powerpc/boot: Request no dynamic linker for boot wrapper
soc/fsl/bman: Use resource_size instead of computation
soc/fsl/qe: use builtin_platform_driver
powerpc/fsl_pmc: use builtin_platform_driver
powerpc/83xx/suspend: use builtin_platform_driver
powerpc/ftrace: Fix the comments for ftrace_modify_code
powerpc/perf: macros for power9 format encoding
powerpc/perf: power9 raw event format encoding
powerpc/perf: update attribute_group data structure
powerpc/perf: factor out the event format field
powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown
...
This update consists of:
-- New tests to exercise the Sync Kernel Infrastructure. These tests
are part of a battery of Android libsync tests and are re-written
to test the new sync user-space interfaces from Emilio López, and
Gustavo Padovan.
-- Test to run hw-independent mock tests for i915.ko from Chris Wilson
-- A new gpio test case from Bamvor Jian Zhang
-- Missing gitignore additions
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJYUv0yAAoJEAsCRMQNDUMcNVQP/0JaO+AaA0GyrxtBlaj4R/sJ
VI4wQGPIqPGk+DtyjoZ4zkGPZzdY7U05wgJeUrm0/K263XaijjneCvfskCG/2R4d
1rO0sAOXSsgpDnQ1Nt2hDAV42zSwecFWFl4odfe1oHznME+fKGg7tXHX74/R8FZp
eMhg0UYME5Dpq4dC3s3NxDYcieHVq9e8v2A5eJx5PjTbcrnztkKzdBiNWUWaW4Jt
qyUmTVBA2M/vTDyJWCzfTBFadVpYBycewIZdDpQN682VFT6d+mvN/hP6pRYGqNwd
FDhM1asdhQiJXEmNd7Muufz3CZ5FlzV3VW+u+eZkeEwInw4uwtajXvqM8+5K4YyE
yKcnCXyFjsmnJtsEba6Gf8S35Dbzcx7j3nJRkqk+ULcGKSUHR2GnWtltZUKtKsqp
/aB8M1H+xoLKdzDnMADNZG/6YMhGq/QOJn1qocn2HmDUSawjUMriCm2ylo09YQwH
Ud0WVSTZZD+jWRLAB6nLXOREXLincZ0e7z2wfV/SHQ1d7MaS9lidxSRbwaRLd9+f
iZ9WS9xkW8axxpe3vmgHdIxxrhy7LLJe+4b/iDo/iMC9waFv6qXcYh7l5/bEXoIz
0D1eB4L5mBHF032tSatjxy1ewy9uiU5M9STmZVdL6Jla7qsxHQEc0x5cr2evpQ4/
tjautkTKCbWocEl/KLPt
=qjL9
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-4.10-rc1-update' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest updates from Shuah Khan:
"This update consists of:
- new tests to exercise the Sync Kernel Infrastructure. These tests
are part of a battery of Android libsync tests and are re-written
to test the new sync user-space interfaces from Emilio López, and
Gustavo Padovan.
- test to run hw-independent mock tests for i915.ko from Chris Wilson
- a new gpio test case from Bamvor Jian Zhang
- missing gitignore additions"
* tag 'linux-kselftest-4.10-rc1-update' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftest/gpio: add gpio test case
selftest: sync: improve assert() failure message
kselftests: Exercise hw-independent mock tests for i915.ko
selftests: add missing gitignore files/dirs
selftests: add missing set-tz to timers .gitignore
selftest: sync: stress test for merges
selftest: sync: stress consumer/producer test
selftest: sync: stress test for parallelism
selftest: sync: wait tests for sw_sync framework
selftest: sync: merge tests for sw_sync framework
selftest: sync: fence tests for sw_sync framework
selftest: sync: basic tests for sw_sync framework
o STM can hook into the function tracer
o Function filtering now supports more advance glob matching
o Ftrace selftests updates and added tests
o Softirq tag in traces now show only softirqs
o ARM nop added to non traced locations at compile time
o New trace_marker_raw file that allows for binary input
o Optimizations to the ring buffer
o Removal of kmap in trace_marker
o Wakeup and irqsoff tracers now adhere to the set_graph_notrace file
o Other various fixes and clean ups
Note, there are two patches marked for stable. These were discovered
near the end of the 4.9 rc release cycle. By the time I had them tested
it was just a matter of days before 4.9 would be released, and I
figured I would just submit them in the merge window. They are old
bugs and not critical. Nothing non-root could abuse.
-----BEGIN PGP SIGNATURE-----
iQExBAABCAAbBQJYUrFHFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
2+AIAIr20kSQV/nA5htGAeCTobVk3WUxY6bvjd9mIJDKPP19akNLyREW0G3KnfCr
yhx4aFRZG98fRu/6F8qieRosyN36lADDVYHelMFHMpcTOpE2aZGjaaOuNGxOEA9v
FmMPTX+K3+dzKyFP4l68R3+5JuQ1/AqLTioTWeLW8IDQ2OOVsjD8+0BuXrNKMJDY
o6U4Hk5U/vn+zHc6BmgBzloAXemBd7iJ1t5V3FRRGvm8yv3HU85Twc5ofGeYTWvB
J8PboEywRlIzxg0Kd8mxnMI5PgaKZSEc2ub8E7cY/CZ5PYpDE2xDA2hJmJgfYp00
1VW+DHRpRZfElsCcya6S6P4bs5Y=
=MGZ/
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"This release has a few updates:
- STM can hook into the function tracer
- Function filtering now supports more advance glob matching
- Ftrace selftests updates and added tests
- Softirq tag in traces now show only softirqs
- ARM nop added to non traced locations at compile time
- New trace_marker_raw file that allows for binary input
- Optimizations to the ring buffer
- Removal of kmap in trace_marker
- Wakeup and irqsoff tracers now adhere to the set_graph_notrace file
- Other various fixes and clean ups"
* tag 'trace-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (42 commits)
selftests: ftrace: Shift down default message verbosity
kprobes/trace: Fix kprobe selftest for newer gcc
tracing/kprobes: Add a helper method to return number of probe hits
tracing/rb: Init the CPU mask on allocation
tracing: Use SOFTIRQ_OFFSET for softirq dectection for more accurate results
tracing/fgraph: Have wakeup and irqsoff tracers ignore graph functions too
fgraph: Handle a case where a tracer ignores set_graph_notrace
tracing: Replace kmap with copy_from_user() in trace_marker writing
ftrace/x86_32: Set ftrace_stub to weak to prevent gcc from using short jumps to it
tracing: Allow benchmark to be enabled at early_initcall()
tracing: Have system enable return error if one of the events fail
tracing: Do not start benchmark on boot up
tracing: Have the reg function allow to fail
ring-buffer: Force rb_end_commit() and rb_set_commit_to_write() inline
ring-buffer: Froce rb_update_write_stamp() to be inlined
ring-buffer: Force inline of hotpath helper functions
tracing: Make __buffer_unlock_commit() always_inline
tracing: Make tracepoint_printk a static_key
ring-buffer: Always inline rb_event_data()
ring-buffer: Make rb_reserve_next_event() always inlined
...
[ This resurrects commit 53855d10f4, which was reverted in
2b41226b39. It depended on commit d544abd5ff ("lib/radix-tree:
Convert to hotplug state machine") so now it is correct to apply ]
Patch "lib/radix-tree: Convert to hotplug state machine" breaks the test
suite as it adds a call to cpuhp_setup_state_nocalls() which is not
currently emulated in the test suite. Add it, and delete the emulation
of the old CPU hotplug mechanism.
Link: http://lkml.kernel.org/r/1480369871-5271-36-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This file was used to implement call_rcu() before liburcu implemented
that function. It hasn't even been compiled since before the test suite
was added to the kernel. Remove it to reduce confusion.
Link: http://lkml.kernel.org/r/1481667692-14500-5-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have a check that setting a tag on a single entry at root succeeds,
but we were missing a check that clearing a tag on that same entry also
succeeds.
Link: http://lkml.kernel.org/r/1481667692-14500-4-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
radix_tree_join() was freeing nodes with a non-zero ->exceptional count,
and radix_tree_split() wasn't zeroing ->exceptional when it allocated
the new node. Fix this by making all callers of radix_tree_node_alloc()
pass in the new counts (and some other always-initialised fields), which
will prevent the problem recurring if in future we decide to do
something similar.
Link: http://lkml.kernel.org/r/1481667692-14500-3-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The kmem_cache_alloc implementation simply allocates new memory from
malloc() and calls the ctor, which zeroes out the entire object. This
means it cannot spot bugs where the object isn't properly reinitialised
before being freed.
Add a small (11 objects) cache before freeing objects back to malloc.
This is enough to let us write a test to catch it, although the memory
allocator is now aware of the structure of the radix tree node, since it
chains free objects through ->private_data (like the percpu cache does).
Link: http://lkml.kernel.org/r/1481667692-14500-2-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
IDR needs more functionality from the kernel: kmalloc()/kfree(), and
xchg().
Link: http://lkml.kernel.org/r/1480369871-5271-67-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The random iteration test only inserts order-0 entries currently.
Update it to insert entries of order between 7 and 0. Also make the
maximum index configurable, make some variables static, make the test
duration variable, remove some useless spinning, and add a fifth thread
which calls tag_tagged_items().
Link: http://lkml.kernel.org/r/1480369871-5271-62-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When replacing an entry with NULL, we need to delete any sibling
entries. Also account deleting exceptional entries properly. Also fix
a bug with radix_tree_iter_replace() where we would fail to remove
entirely freed nodes. Also fix accounting bug when switching between
normal and exceptional entries with replace_slot. Also add testcases
for all these bugs.
Link: http://lkml.kernel.org/r/1480369871-5271-61-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Calculate how many nodes we need to allocate to split an old_order entry
into multiple entries, each of size new_order. The test suite checks
that we allocated exactly the right number of nodes; neither too many
(checked by rtp->nr == 0), nor too few (checked by comparing
nr_allocated before and after the call to radix_tree_split()).
Link: http://lkml.kernel.org/r/1480369871-5271-60-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This new function splits a larger multiorder entry into smaller entries
(potentially multi-order entries). These entries are initialised to
RADIX_TREE_RETRY to ensure that RCU walkers who see this state aren't
confused. The caller should then call radix_tree_for_each_slot() and
radix_tree_replace_slot() in order to turn these retry entries into the
intended new entries. Tags are replicated from the original multiorder
entry into each new entry.
Link: http://lkml.kernel.org/r/1480369871-5271-59-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This new function allows for the replacement of many smaller entries in
the radix tree with one larger multiorder entry. From the point of view
of an RCU walker, they may see a mixture of the smaller entries and the
large entry during the same walk, but they will never see NULL for an
index which was populated before the join.
Link: http://lkml.kernel.org/r/1480369871-5271-58-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is an exceptionally complicated function with just one caller
(tag_pages_for_writeback). We devote a large portion of the runtime of
the test suite to testing this one function which has one caller. By
introducing the new function radix_tree_iter_tag_set(), we can eliminate
all of the complexity while keeping the performance. The caller can now
use a fairly standard radix_tree_for_each() loop, and it doesn't need to
worry about tricksy things like 'start' wrapping.
The test suite continues to spend a large amount of time investigating
this function, but now it's testing the underlying primitives such as
radix_tree_iter_resume() and the radix_tree_for_each_tagged() iterator
which are also used by other parts of the kernel.
Link: http://lkml.kernel.org/r/1480369871-5271-57-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This rather complicated function can be better implemented as an
iterator. It has only one caller, so move the functionality to the only
place that needs it. Update the test suite to follow the same pattern.
Link: http://lkml.kernel.org/r/1480369871-5271-56-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Konstantin Khlebnikov <koct9i@gmail.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes several interlinked problems with the iterators in the
presence of multiorder entries.
1. radix_tree_iter_next() would only advance by one slot, which would
result in the iterators returning the same entry more than once if
there were sibling entries.
2. radix_tree_next_slot() could return an internal pointer instead of
a user pointer if a tagged multiorder entry was immediately followed by
an entry of lower order.
3. radix_tree_next_slot() expanded to a lot more code than it used to
when multiorder support was compiled in. And I wasn't comfortable with
entry_to_node() being in a header file.
Fixing radix_tree_iter_next() for the presence of sibling entries
necessarily involves examining the contents of the radix tree, so we now
need to pass 'slot' to radix_tree_iter_next(), and we need to change the
calling convention so it is called *before* dropping the lock which
protects the tree. Also rename it to radix_tree_iter_resume(), as some
people thought it was necessary to call radix_tree_iter_next() each time
around the loop.
radix_tree_next_slot() becomes closer to how it looked before multiorder
support was introduced. It only checks to see if the next entry in the
chunk is a sibling entry or a pointer to a node; this should be rare
enough that handling this case out of line is not a performance impact
(and such impact is amortised by the fact that the entry we just
processed was a multiorder entry). Also, radix_tree_next_slot() used to
force a new chunk lookup for untagged entries, which is more expensive
than the out of line sibling entry skipping.
Link: http://lkml.kernel.org/r/1480369871-5271-55-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the old find_next_bit code in favour of linking in the find_bit
code from tools/lib.
Link: http://lkml.kernel.org/r/1480369871-5271-48-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This probably doubles the size of each item allocated by the test suite
but it lets us check a few more things, and may be needed for upcoming
API changes that require the caller pass in the order of the entry.
Link: http://lkml.kernel.org/r/1480369871-5271-46-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
item_kill_tree() assumes that everything in the tree is a pointer to a
struct item, which is annoying when testing the behaviour of exceptional
entries. Fix it to delete exceptional entries on the assumption they
don't need to be freed.
Link: http://lkml.kernel.org/r/1480369871-5271-45-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Calling rcu_barrier() allows all of the rcu-freed memory to be actually
returned to the pool, and allows nr_allocated to return to 0. As well
as allowing diffs between runs to be more useful, it also lets us
pinpoint leaks more effectively.
Link: http://lkml.kernel.org/r/1480369871-5271-44-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds simple benchmark for iterator similar to one I've used for
commit 78c1d78488 ("radix-tree: introduce bit-optimized iterator")
Building with make BENCHMARK=1 set radix tree order to 6, this allows to
get performance comparable to in kernel performance.
Link: http://lkml.kernel.org/r/1480369871-5271-43-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Each thread needs to register itself with RCU, otherwise the reading
thread's read lock has no effect and the freeing thread will free the
memory in the tree without waiting for the read lock to be dropped.
Link: http://lkml.kernel.org/r/1480369871-5271-42-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of reseeding the random number generator every time around the
loop in big_gang_check(), seed it at the beginning of execution. Use
rand_r() and an independent base seed for each thread in
iteration_test() so they don't stomp all over each others state. Since
this particular test depends on the kernel scheduler, the iteration test
can't be reproduced based purely on the random seed, but at least it
won't pollute the other tests.
Print the seed, and allow the seed to be specified so that a run which
hits a problem can be reproduced.
Link: http://lkml.kernel.org/r/1480369871-5271-41-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>