The remaining tests are not suitable for moving in-kernel, so move
item_insert_order() into multiorder.c, make it static and make it use
the XArray.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
This version is a little less thorough in order to be a little quicker,
but tests the important edge cases. Also test adding a multiorder entry
at a non-canonical index, and erasing it.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Test this functionality inside the kernel as well as in userspace.
Also remove insert_bug() as there's no comparable thing to test
in the XArray code.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
With no code left in the kernel using the multiorder radix tree, convert
the iteration test from the radix tree to the XArray. It's unlikely to
suffer the same bug as the radix tree, but this test will prevent that
bug from ever creeping into the XArray implementation.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
The tag_tagged_items() function is supposed to test the page-writeback
tagging code. Since that has been converted to the XArray, there's
not much point in testing the radix tree's tagging code. This requires
using the pthread mutex embedded in the xarray instead of an external
lock, so remove the pthread mutexes which protect xarrays/radix trees.
Also remove radix_tree_iter_tag_set() as this was the last user.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
The page cache was the only user of this interface and it has now
been converted to the XArray. Transform the test into a test of
xas_init_marks().
Signed-off-by: Matthew Wilcox <willy@infradead.org>
radix_tree_split and radix_tree_join were never used upstream. Remove
them; if they're needed in future they will be replaced by XArray
equivalents.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
The only user of this functionality was the workingset code, and it's
now been converted to the XArray. Remove __radix_tree_delete_node()
entirely as it was also only used by the workingset code.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
This is a 1:1 conversion. The major part of this patch is converting
the test framework from userspace to kernel space and mirroring the
algorithm now used in find_swap_entry().
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Now the page cache lookup is using the XArray, let's convert this
regression test from the radix tree API to the XArray so it's testing
roughly the same thing it was testing before.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
There's no direct replacement for radix_tree_for_each_contig()
in the XArray API as it's an unusual thing to do. Instead,
open-code a loop using xas_next(). This removes the only user of
radix_tree_for_each_contig() so delete the iterator from the API and
the test suite code for it.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Use the XA_TRACK_FREE ability to track which entries have a free bit,
similarly to how it uses the radix tree's IDR_FREE tag. This eliminates
the per-cpu ida_bitmap preload, and fixes the memory consumption
regression I introduced when making the IDR able to store any pointer.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
xa_store() differs from radix_tree_insert() in that it will overwrite an
existing element in the array rather than returning an error. This is
the behaviour which most users want, and those that want more complex
behaviour generally want to use the xas family of routines anyway.
For memory allocation, xa_store() will first attempt to request memory
from the slab allocator; if memory is not immediately available, it will
drop the xa_lock and allocate memory, keeping a pointer in the xa_state.
It does not use the per-CPU cache, although those will continue to exist
until all radix tree users are converted to the xarray.
This patch also includes xa_erase() and __xa_erase() for a streamlined
way to store NULL. Since there is no need to allocate memory in order
to store a NULL in the XArray, we do not need to trouble the user with
deciding what memory allocation flags to use.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
XArray marks are like the radix tree tags, only slightly more strongly
typed. They are renamed in order to distinguish them from tagged
pointers. This commit adds the basic get/set/clear operations.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
The xa_load function brings with it a lot of infrastructure; xa_empty(),
xa_is_err(), and large chunks of the XArray advanced API that are used
to implement xa_load.
As the test-suite demonstrates, it is possible to use the XArray functions
on a radix tree. The radix tree functions depend on the GFP flags being
stored in the root of the tree, so it's not possible to use the radix
tree functions on an XArray.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
This is a direct replacement for struct radix_tree_node. A couple of
struct members have changed name, so convert those. Use a #define so
that radix tree users continue to work without change.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
This is a direct replacement for struct radix_tree_root. Some of the
struct members have changed name; convert those, and use a #define so
that radix_tree users continue to work without change.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Instead of storing a pointer to the slot containing the canonical entry,
store the offset of the slot. Produces slightly more efficient code
(~300 bytes) and simplifies the implementation.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Introduce xarray value entries and tagged pointers to replace radix
tree exceptional entries. This is a slight change in encoding to allow
the use of an extra bit (we can now store BITS_PER_LONG - 1 bits in a
value entry). It is also a change in emphasis; exceptional entries are
intimidating and different. As the comment explains, you can choose
to store values or pointers in the xarray and they are both first-class
citizens.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
An upcoming change to the encoding of internal entries will set the bottom
two bits to 0b10. Unfortunately, m68k only aligns some data structures
to 2 bytes, so the IDR will interpret them as internal entries and things
will go badly wrong.
Change the radix tree so that it stops either when the node indicates
that it's the bottom of the tree (shift == 0) or when the entry is not an
internal entry. This means we cannot insert an arbitrary kernel pointer
as a multiorder entry, but the IDR does not permit multiorder entries.
Annoyingly, this means the IDR can no longer take advantage of the radix
tree's ability to store a single entry at offset 0 without allocating
memory. A pointer which is 2-byte aligned cannot be stored directly in
the root as it would be indistinguishable from a node, so we must allocate
a node in order to store a 2-byte pointer at index 0. The idr_replace()
function does not take a GFP flags argument, so cannot allocate memory.
If a user inserts a 4-byte aligned pointer at index 0 and then replaces
it with a 2-byte aligned pointer, we must be able to store it.
Arbitrary pointer values are still not permitted; pointers of the
form 2 + (i * 4) for values of i between 0 and 1023 are reserved for
the implementation. These are not valid kernel pointers as they would
point into the zero page.
This change does cause a runtime memory consumption regression for
the IDA. I will recover that later.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
A reasonably big batch of fixes due to me being away for a few weeks.
A fix for the TM emulation support on Power9, which could result in corrupting
the guest r11 when running under KVM.
Two fixes to the TM code which could lead to userspace GPR corruption if we take
an SLB miss at exactly the wrong time.
Our dynamic patching code had a bug that meant we could patch freed __init text,
which could lead to corrupting userspace memory.
csum_ipv6_magic() didn't work on little endian platforms since we optimised it
recently.
A fix for an endian bug when reading a device tree property telling us how many
storage keys the machine has available.
Fix a crash seen on some configurations of PowerVM when migrating the partition
from one machine to another.
A fix for a regression in the setup of our CPU to NUMA node mapping in KVM
guests.
A fix to our selftest Makefiles to make them work since a recent change to the
shared Makefile logic.
Thanks to:
Alexey Kardashevskiy, Breno Leitao, Christophe Leroy, Michael Bringmann,
Michael Neuling, Nicholas Piggin, Paul Mackerras,, Srikar Dronamraju, Thiago
Jung Bauermann, Xin Long.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAluuCTsTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgDKjD/sHWK1MFXM4baYrdju2tW9tBBiIq18n
UQ4byhJ6wgXzNDzxSq3v4ofW/5A8HoF+2wTqIXFY5l9hMQRw+TfB3ZymloZZ/0Nf
3sxFfL0sv7or4V8Ben/vGEdalg9PzFInwAXhBCGV0nnHiE3D7GkO+MU3zsZUjzum
EbWN/SKMLhjGByeqGbujJYfLC2IdvAdpTzrYd0t0X/Pqi8EUuGjJa0E+cJqD6mjI
GR4eGagImuDuTH6F9eLwcQt2mlDvmv6KJjtgoMbVeh99pqkG9tSnfOlZ+KrWWVps
1+/CXHlaJiHsmX3wqWKjw31x1Ag4v+BEtAjXpwnr7MFroumZ18xRLaDfy11eJ+GB
x3fPbdylittMQy1XYKRrntppDsr0dTAMCjBmE6KSMRyi+F5/agW8bG3sx7D26/AO
JoWGw/XTxislvdr/PIPkjISXRSKHa5jwT3rH45Id8W1uEt0TQ7hJOmWIOdC1WXU8
IbYy3imksXLCcxIJmSzc+cUkVo1wHf4S4NxcJsVs4N5pE/zih5mEKi9jaIpxocF/
YX43n4hGa/6oJYBBMeQZpsO3aDiP/Ynjk4j/I2l92WAIXzQXjVXhreLfHzci+fGg
Q6KdxNWPbwe3UKUTnQ8TOY4ncshU716HrFsp/B4HWvJTfvY6tc7nKDupieetzyKp
G5NA6Bh+VOVDxQ==
=G6wq
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.19-3' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Michael writes:
"powerpc fixes for 4.19 #3
A reasonably big batch of fixes due to me being away for a few weeks.
A fix for the TM emulation support on Power9, which could result in
corrupting the guest r11 when running under KVM.
Two fixes to the TM code which could lead to userspace GPR corruption
if we take an SLB miss at exactly the wrong time.
Our dynamic patching code had a bug that meant we could patch freed
__init text, which could lead to corrupting userspace memory.
csum_ipv6_magic() didn't work on little endian platforms since we
optimised it recently.
A fix for an endian bug when reading a device tree property telling
us how many storage keys the machine has available.
Fix a crash seen on some configurations of PowerVM when migrating the
partition from one machine to another.
A fix for a regression in the setup of our CPU to NUMA node mapping
in KVM guests.
A fix to our selftest Makefiles to make them work since a recent
change to the shared Makefile logic."
* tag 'powerpc-4.19-3' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc: Fix Makefiles for headers_install change
powerpc/numa: Use associativity if VPHN hcall is successful
powerpc/tm: Avoid possible userspace r1 corruption on reclaim
powerpc/tm: Fix userspace r13 corruption
powerpc/pseries: Fix unitialized timer reset on migration
powerpc/pkeys: Fix reading of ibm, processor-storage-keys property
powerpc: fix csum_ipv6_magic() on little endian platforms
powerpc/powernv/ioda2: Reduce upper limit for DMA window size (again)
powerpc: Avoid code patching freed init sections
KVM: PPC: Book3S HV: Fix guest r11 corruption with POWER9 TM workarounds
Commit b2d35fa5fc ("selftests: add headers_install to lib.mk")
introduced a requirement that Makefiles more than one level below the
selftests directory need to define top_srcdir, but it didn't update
any of the powerpc Makefiles.
This broke building all the powerpc selftests with eg:
make[1]: Entering directory '/src/linux/tools/testing/selftests/powerpc'
BUILD_TARGET=/src/linux/tools/testing/selftests/powerpc/alignment; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C alignment all
make[2]: Entering directory '/src/linux/tools/testing/selftests/powerpc/alignment'
../../lib.mk:20: ../../../../scripts/subarch.include: No such file or directory
make[2]: *** No rule to make target '../../../../scripts/subarch.include'.
make[2]: Failed to remake makefile '../../../../scripts/subarch.include'.
Makefile:38: recipe for target 'alignment' failed
Fix it by setting top_srcdir in the affected Makefiles.
Fixes: b2d35fa5fc ("selftests: add headers_install to lib.mk")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Dave writes:
"Networking fixes:
1) Fix multiqueue handling of coalesce timer in stmmac, from Jose
Abreu.
2) Fix memory corruption in NFC, from Suren Baghdasaryan.
3) Don't write reserved bits in ravb driver, from Kazuya Mizuguchi.
4) SMC bug fixes from Karsten Graul, YueHaibing, and Ursula Braun.
5) Fix TX done race in mvpp2, from Antoine Tenart.
6) ipv6 metrics leak, from Wei Wang.
7) Adjust firmware version requirements in mlxsw, from Petr Machata.
8) Fix autonegotiation on resume in r8169, from Heiner Kallweit.
9) Fixed missing entries when dumping /proc/net/if_inet6, from Jeff
Barnhill.
10) Fix double free in devlink, from Dan Carpenter.
11) Fix ethtool regression from UFO feature removal, from Maciej
Żenczykowski.
12) Fix drivers that have a ndo_poll_controller() that captures the
cpu entirely on loaded hosts by trying to drain all rx and tx
queues, from Eric Dumazet.
13) Fix memory corruption with jumbo frames in aquantia driver, from
Friedemann Gerold."
* gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net: (79 commits)
net: mvneta: fix the remaining Rx descriptor unmapping issues
ip_tunnel: be careful when accessing the inner header
mpls: allow routes on ip6gre devices
net: aquantia: memory corruption on jumbo frames
tun: remove ndo_poll_controller
nfp: remove ndo_poll_controller
bnxt: remove ndo_poll_controller
bnx2x: remove ndo_poll_controller
mlx5: remove ndo_poll_controller
mlx4: remove ndo_poll_controller
i40evf: remove ndo_poll_controller
ice: remove ndo_poll_controller
igb: remove ndo_poll_controller
ixgb: remove ndo_poll_controller
fm10k: remove ndo_poll_controller
ixgbevf: remove ndo_poll_controller
ixgbe: remove ndo_poll_controller
bonding: use netpoll_poll_dev() helper
netpoll: make ndo_poll_controller() optional
rds: Fix build regression.
...
Thomas writes:
"- Provide a strerror_r wrapper so lib/bpf can be built on systems
without _GNU_SOURCE
- Unbreak the man page generator when building out of tree"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf Documentation: Fix out-of-tree asciidoctor man page generation
tools lib bpf: Provide wrapper for strerror_r to build in !_GNU_SOURCE systems
Ensure that sockets added to a sock{map|hash} that is not in the
ESTABLISHED state is rejected.
Fixes: 1aa12bdf1b ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
I swear I would have sent it the same to Linus! The main cause for
this is that I was on vacation until two weeks ago and it took a while
to sort all the pending patches between 4.19 and 4.20, test them and
so on.
It's mostly small bugfixes and cleanups, mostly around x86 nested
virtualization. One important change, not related to nested
virtualization, is that the ability for the guest kernel to trap CPUID
instructions (in Linux that's the ARCH_SET_CPUID arch_prctl) is now
masked by default. This is because the feature is detected through an
MSR; a very bad idea that Intel seems to like more and more. Some
applications choke if the other fields of that MSR are not initialized
as on real hardware, hence we have to disable the whole MSR by default,
as was the case before Linux 4.12.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJbpPo1AAoJEL/70l94x66DdxgH/is0qe6ZBtzb6Qc0W+8mHHD7
nxIkWAs2V5NsouJ750YwRQ+0Ym407+wlNt30acdBUEoXhrnA5/TvyGq999XvCL96
upWEIxpIgbvTMX/e2nLhe4wQdhsboUK4r0/B9IFgVFYrdCt5uRXjB2G4ewxcqxL/
GxxqrAKhaRsbQG9Xv0Fw5Vohh/Ls6fQDJcyuY1EBnbMpVenq2QDLI6cOAPXncyFb
uLN6ov4GNCWIPckwxejri5XhZesUOsafrmn48sApShh4T6TrisrdtSYdzl+DGza+
j5vhIEwdFO5kulZ3viuhqKJOnS2+F6wvfZ75IKT0tEKeU2bi+ifGDyGRefSF6Q0=
=YXLw
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Paolo writes:
"It's mostly small bugfixes and cleanups, mostly around x86 nested
virtualization. One important change, not related to nested
virtualization, is that the ability for the guest kernel to trap
CPUID instructions (in Linux that's the ARCH_SET_CPUID arch_prctl) is
now masked by default. This is because the feature is detected
through an MSR; a very bad idea that Intel seems to like more and
more. Some applications choke if the other fields of that MSR are
not initialized as on real hardware, hence we have to disable the
whole MSR by default, as was the case before Linux 4.12."
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (23 commits)
KVM: nVMX: Fix bad cleanup on error of get/set nested state IOCTLs
kvm: selftests: Add platform_info_test
KVM: x86: Control guest reads of MSR_PLATFORM_INFO
KVM: x86: Turbo bits in MSR_PLATFORM_INFO
nVMX x86: Check VPID value on vmentry of L2 guests
nVMX x86: check posted-interrupt descriptor addresss on vmentry of L2
KVM: nVMX: Wake blocked vCPU in guest-mode if pending interrupt in virtual APICv
KVM: VMX: check nested state and CR4.VMXE against SMM
kvm: x86: make kvm_{load|put}_guest_fpu() static
x86/hyper-v: rename ipi_arg_{ex,non_ex} structures
KVM: VMX: use preemption timer to force immediate VMExit
KVM: VMX: modify preemption timer bit only when arming timer
KVM: VMX: immediately mark preemption timer expired only for zero value
KVM: SVM: Switch to bitmap_zalloc()
KVM/MMU: Fix comment in walk_shadow_page_lockless_end()
kvm: selftests: use -pthread instead of -lpthread
KVM: x86: don't reset root in kvm_mmu_setup()
kvm: mmu: Don't read PDPTEs when paging is not enabled
x86/kvm/lapic: always disable MMIO interface in x2APIC mode
KVM: s390: Make huge pages unavailable in ucontrol VMs
...
Test guest access to MSR_PLATFORM_INFO when the capability is enabled
or disabled.
Signed-off-by: Drew Schmitt <dasch@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
I run into the following error
testing/selftests/kvm/dirty_log_test.c:285: undefined reference to `pthread_create'
testing/selftests/kvm/dirty_log_test.c:297: undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status
my gcc version is gcc version 4.8.4
"-pthread" would work everywhere
Signed-off-by: Lei Yang <Lei.Yang@windriver.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
libc_compat.h is used by libbpf so make sure it's licensed under
LGPL or BSD license. The license change should be OK, I'm the only
author of the file.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
- Fix the build on !_GNU_SOURCE libc systems such as Alpine Linux/musl
libc due to usage of strerror_r glibc variant on libbpf (Arnaldo Carvalho de Melo)
- Fix out-of-tree asciidoctor man page generation (Ben Hutchings)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCW6EaQwAKCRCyPKLppCJ+
J2GjAP9j3onRb93lyoLP5eUAxdSrEmFQ/myn7rLuDLZoo2RVPQD+MqCwf6hlNQgl
wKM4jrJT/mwV53a3+bDeNoXBCEbuxQw=
=hr0o
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-for-mingo-4.19-20180918' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix the build on !_GNU_SOURCE libc systems such as Alpine Linux/musl
libc due to usage of strerror_r glibc variant on libbpf (Arnaldo Carvalho de Melo)
- Fix out-of-tree asciidoctor man page generation (Ben Hutchings)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The cleanup function uses "$CMD 2 > /dev/null", which doesn't actually
send stderr to /dev/null, so when the netns doesn't exist, the error
message is shown. Use "2> /dev/null" instead, so that those messages
disappear, as was intended.
Fixes: d1f1b9cbf3 ("selftests: net: Introduce first PMTU test")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dependency for the man page rule using asciidoctor incorrectly
specifies a source file in $(OUTPUT). When building out-of-tree, the
source file is not found, resulting in a fall-back to the following rule
which uses xmlto.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180916151704.GF4765@decadent.org.uk
Fixes: ffef80ecf8 ("perf Documentation: Support for asciidoctor")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Same problem that got fixed in a similar fashion in tools/perf/ in
c8b5f2c96d ("tools: Introduce str_error_r()"), fix it in the same
way, licensing needs to be sorted out to libbpf to use libapi, so,
for this simple case, just get the same wrapper in tools/lib/bpf.
This makes libbpf and its users (bpftool, selftests, perf) to build
again in Alpine Linux 3.[45678] and edge.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Quentin Monnet <quentin.monnet@netronome.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Fixes: 1ce6a9fc15 ("bpf: fix build error in libbpf with EXTRA_CFLAGS="-Wp, -D_FORTIFY_SOURCE=2 -O2"")
Link: https://lkml.kernel.org/r/20180917151636.GA21790@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Dave writes:
"Various fixes, all over the place:
1) OOB data generation fix in bluetooth, from Matias Karhumaa.
2) BPF BTF boundary calculation fix, from Martin KaFai Lau.
3) Don't bug on excessive frags, to be compatible in situations mixing
older and newer kernels on each end. From Juergen Gross.
4) Scheduling in RCU fix in hv_netvsc, from Stephen Hemminger.
5) Zero keying information in TLS layer before freeing copies
of them, from Sabrina Dubroca.
6) Fix NULL deref in act_sample, from Davide Caratti.
7) Orphan SKB before GRO in veth to prevent crashes with XDP,
from Toshiaki Makita.
8) Fix use after free in ip6_xmit, from Eric Dumazet.
9) Fix VF mac address regression in bnxt_en, from Micahel Chan.
10) Fix MSG_PEEK behavior in TLS layer, from Daniel Borkmann.
11) Programming adjustments to r8169 which fix not being to enter deep
sleep states on some machines, from Kai-Heng Feng and Hans de
Goede.
12) Fix DST_NOCOUNT flag handling for ipv6 routes, from Peter
Oskolkov."
* gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net: (45 commits)
net/ipv6: do not copy dst flags on rt init
qmi_wwan: set DTR for modems in forced USB2 mode
clk: x86: Stop marking clocks as CLK_IS_CRITICAL
r8169: Get and enable optional ether_clk clock
clk: x86: add "ether_clk" alias for Bay Trail / Cherry Trail
r8169: enable ASPM on RTL8106E
r8169: Align ASPM/CLKREQ setting function with vendor driver
Revert "kcm: remove any offset before parsing messages"
kcm: remove any offset before parsing messages
net: ethernet: Fix a unused function warning.
net: dsa: mv88e6xxx: Fix ATU Miss Violation
tls: fix currently broken MSG_PEEK behavior
hv_netvsc: pair VF based on serial number
PCI: hv: support reporting serial number as slot information
bnxt_en: Fix VF mac address regression.
ipv6: fix possible use-after-free in ip6_xmit()
net: hp100: fix always-true check for link up state
ARM: dts: at91: add new compatibility string for macb on sama5d3
net: macb: disable scatter-gather for macb on sama5d3
net: mvpp2: let phylink manage the carrier state
...
In kTLS MSG_PEEK behavior is currently failing, strace example:
[pid 2430] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
[pid 2430] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4
[pid 2430] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 2430] listen(4, 10) = 0
[pid 2430] getsockname(4, {sa_family=AF_INET, sin_port=htons(38855), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
[pid 2430] connect(3, {sa_family=AF_INET, sin_port=htons(38855), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 2430] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
[pid 2430] setsockopt(3, 0x11a /* SOL_?? */, 1, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
[pid 2430] accept(4, {sa_family=AF_INET, sin_port=htons(49636), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
[pid 2430] setsockopt(5, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
[pid 2430] setsockopt(5, 0x11a /* SOL_?? */, 2, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
[pid 2430] close(4) = 0
[pid 2430] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14
[pid 2430] sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11
[pid 2430] recvfrom(5, "test_read_peektest_read_peektest"..., 64, MSG_PEEK, NULL, NULL) = 64
As can be seen from strace, there are two TLS records sent,
i) 'test_read_peek' and ii) '_mult_recs\0' where we end up
peeking 'test_read_peektest_read_peektest'. This is clearly
wrong, and what happens is that given peek cannot call into
tls_sw_advance_skb() to unpause strparser and proceed with
the next skb, we end up looping over the current one, copying
the 'test_read_peek' over and over into the user provided
buffer.
Here, we can only peek into the currently held skb (current,
full TLS record) as otherwise we would end up having to hold
all the original skb(s) (depending on the peek depth) in a
separate queue when unpausing strparser to process next
records, minimally intrusive is to return only up to the
current record's size (which likely was what c46234ebb4
("tls: RX path for ktls") originally intended as well). Thus,
after patch we properly peek the first record:
[pid 2046] wait4(2075, <unfinished ...>
[pid 2075] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
[pid 2075] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4
[pid 2075] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 2075] listen(4, 10) = 0
[pid 2075] getsockname(4, {sa_family=AF_INET, sin_port=htons(55115), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
[pid 2075] connect(3, {sa_family=AF_INET, sin_port=htons(55115), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 2075] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
[pid 2075] setsockopt(3, 0x11a /* SOL_?? */, 1, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
[pid 2075] accept(4, {sa_family=AF_INET, sin_port=htons(45732), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
[pid 2075] setsockopt(5, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
[pid 2075] setsockopt(5, 0x11a /* SOL_?? */, 2, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
[pid 2075] close(4) = 0
[pid 2075] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14
[pid 2075] sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11
[pid 2075] recvfrom(5, "test_read_peek", 64, MSG_PEEK, NULL, NULL) = 14
Fixes: c46234ebb4 ("tls: RX path for ktls")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This Kselftest fixes update for 4.9-rc5 consists of:
-- fixes to build failures
-- fixes to add missing config files to increase test coverage
-- fixes to cgroup test and a new cgroup test for memory.oom.group
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAluerlwACgkQCwJExA0N
QxzFbA//Xk32yqgo16SNN8uovPJo+Oi5TICblNZTI5d2LlmI/H6+773EALa+m0zC
2tMvZ1A/SSWymsGTyEAFEuzhwnYqcLZc9InoK8W+hZjKd3XqyjvTp1ZGnA2iJb89
3wP0VgbEuqhtnyhIAAp6dQuaVBK3kJEmC47IPy1qziVwupotN8xJuj/1/9WiWn5X
MqIEfmBo4Bi0Ugn8xpOEIBU9bEi3ZNO2iA/3V5j//jzKZzOvbkLSexIQdtcay4Rj
eLY03Sw5VEKNN5btpf5dpOojmAb2ipOUfQh3RKGpZMmqMqcHm71+GxXxQyjOAZrc
kFsUjLvKoyEUuVKC84jAhuim8aZIbNxxiKFGBlZFFIcrF/yJt8PX4zo1mpWrlpa/
Yh5gP+xMMq7p0CaTVTksnqy051bfjCVKyTqwSTTFo5pimCA7JHxAsFvRtSrxgXHf
DFWXF2n4Jxonn+urc9sdhggdocUTHoqO3c0ImbO897CaDYiOGrmhBuxeaZi6y/0I
Y9PUAt17IGsB8sp/1C+LMBKkIjxYCBuX53LK/QIvLDNMUcWI6laAYx0etyH0xVPe
/HMhsDnpyQc0khhDB3XZ5HC3EB3g2S1t2wOl9VRPN3MqJvVlffNHb7psXxG2k99J
Z/dIaYVuksP6JGM3I/Y30BGpwhdQ/hyzxRpiX065XpDPttUO1XY=
=79Fh
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pulled kselftest fixes from Shuah:
"This Kselftest fixes update for 4.9-rc5 consists of:
-- fixes to build failures
-- fixes to add missing config files to increase test coverage
-- fixes to cgroup test and a new cgroup test for memory.oom.group"
Fix a bug in the key delete code - the num_records range
from 0 to num_records-1.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: David Binderman <dcb314@hotmail.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 1c5aae7710 ("perf machine: Create maps for x86 PTI entry
trampolines") revealed a problem with maps__find_symbol_by_name() that
resulted in probes not being found e.g.
$ sudo perf probe xsk_mmap
xsk_mmap is out of .text, skip it.
Probe point 'xsk_mmap' not found.
Error: Failed to add events.
maps__find_symbol_by_name() can optionally return the map of the found
symbol. It can get the map wrong because, in fact, the symbol is found
on the map's dso, not allowing for the possibility that the dso has more
than one map. Fix by always checking the map contains the symbol.
Reported-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 1c5aae7710 ("perf machine: Create maps for x86 PTI entry trampolines")
Link: http://lkml.kernel.org/r/20180907085116.25782-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get the changes in:
3e7a50ceb1 ("net: report min and max mtu network device settings")
2756f68c31 ("net: bridge: add support for backup port")
a25717d2b6 ("xdp: support simultaneous driver and hw XDP attachment")
4f91da26c8 ("xdp: add per mode attributes for attached programs")
f203b76d78 ("xfrm: Add virtual xfrm interfaces")
Silencing this libbpf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/if_link.h' differs from latest version at 'include/uapi/linux/if_link.h'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-xd9ztioa894zemv8ag8kg64u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get the changes in:
c48300c92a ("vhost: fix VHOST_GET_BACKEND_FEATURES ioctl request definition")
This makes 'perf trace' and other tools in the future using its
beautifiers in a libbeauty.so library be able to translate these new
ioctl to strings:
$ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > /tmp/after
$ diff -u /tmp/before /tmp/after
--- /tmp/before 2018-09-11 13:10:57.923038244 -0300
+++ /tmp/after 2018-09-11 13:11:20.329012685 -0300
@@ -15,6 +15,7 @@
[0x22] = "SET_VRING_ERR",
[0x23] = "SET_VRING_BUSYLOOP_TIMEOUT",
[0x24] = "GET_VRING_BUSYLOOP_TIMEOUT",
+ [0x25] = "SET_BACKEND_FEATURES",
[0x30] = "NET_SET_BACKEND",
[0x40] = "SCSI_SET_ENDPOINT",
[0x41] = "SCSI_CLEAR_ENDPOINT",
@@ -27,4 +28,5 @@
static const char *vhost_virtio_ioctl_read_cmds[] = {
[0x00] = "GET_FEATURES",
[0x12] = "GET_VRING_BASE",
+ [0x26] = "GET_BACKEND_FEATURES",
};
$
We'll also use this to be able to express syscall filters using symbolic
these symbolic names, something like:
# perf trace --all-cpus -e ioctl(cmd=*GET_FEATURES)
This silences the following warning during perf's build:
Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h'
diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-35x71oei2hdui9u0tarpimbq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get the changes in:
d67b6a2065 ("drm: writeback: Add client capability for exposing writeback connectors")
This is for an argument to a DRM ioctl, which is not being prettyfied in
the 'perf trace' DRM ioctl beautifier, but will now that syscalls are
starting to have pointer arguments augmented via BPF.
This time around this just cures the following warning during perf's
build:
Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h'
diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brian Starkey <brian.starkey@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-n7qib1bac6mc6w9oke7r4qdc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get the changes in:
db7a2d1809 ("asm-generic: unistd.h: Wire up sys_rseq")
That wires up the new 'rsec' system call, which will automagically
support that syscall in the syscall table used by 'perf trace' on
arm/arm64.
This cures the following warning during perf's build:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/n/tip-vt7k2itnitp1t9p3dp7qeb08@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get the changes in:
09121255c7 ("perf/UAPI: Clearly mark __PERF_SAMPLE_CALLCHAIN_EARLY as internal use")
This cures the following warning during perf's build:
Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-2vvwh2o19orn56di0ksrtgzr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>