Commit Graph

21709 Commits

Author SHA1 Message Date
John Fastabend
13d70f5a5e bpf, sk_msg: Add get socket storage helpers
Add helpers to use local socket storage.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/159033907577.12355.14740125020572756560.stgit@john-Precision-5820-Tower
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 14:38:20 -07:00
Yauheni Kaliuta
55983299b7 libbpf: Use .so dynamic symbols for abi check
Since dynamic symbols are used for dynamic linking it makes sense to
use them (readelf --dyn-syms) for abi check.

Found with some configuration on powerpc where linker puts
local *.plt_call.* symbols into .so.

Signed-off-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200525061846.16524-1-yauheni.kaliuta@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 14:38:20 -07:00
Nikolay Borisov
93581359e7 libbpf: Install headers as part of make install
Current 'make install' results in only pkg-config and library binaries
being installed. For consistency also install headers as part of
"make install"

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200526174612.5447-1-nborisov@suse.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 14:38:20 -07:00
Eelco Chaudron
272d51af32 libbpf: Add API to consume the perf ring buffer content
This new API, perf_buffer__consume, can be used as follows:

- When you have a perf ring where wakeup_events is higher than 1,
  and you have remaining data in the rings you would like to pull
  out on exit (or maybe based on a timeout).

- For low latency cases where you burn a CPU that constantly polls
  the queues.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159048487929.89441.7465713173442594608.stgit@ebuild
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 14:38:19 -07:00
Tobias Klauser
dc3ca5cf3e tools, bpftool: Print correct error message when failing to load BTF
btf__parse_raw and btf__parse_elf return negative error numbers wrapped
in an ERR_PTR, so the extracted value needs to be negated before passing
them to strerror which expects a positive error number.

Before:
  Error: failed to load BTF from .../vmlinux: Unknown error -2

After:
  Error: failed to load BTF from .../vmlinux: No such file or directory

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200525135421.4154-1-tklauser@distanz.ch
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 14:38:19 -07:00
Quentin Monnet
73a4f0407e tools, bpftool: Make capability check account for new BPF caps
Following the introduction of CAP_BPF, and the switch from CAP_SYS_ADMIN
to other capabilities for various BPF features, update the capability
checks (and potentially, drops) in bpftool for feature probes. Because
bpftool and/or the system might not know of CAP_BPF yet, some caution is
necessary:

- If compiled and run on a system with CAP_BPF, check CAP_BPF,
  CAP_SYS_ADMIN, CAP_PERFMON, CAP_NET_ADMIN.

- Guard against CAP_BPF being undefined, to allow compiling bpftool from
  latest sources on older systems. If the system where feature probes
  are run does not know of CAP_BPF, stop checking after CAP_SYS_ADMIN,
  as this should be the only capability required for all the BPF
  probing.

- If compiled from latest sources on a system without CAP_BPF, but later
  executed on a newer system with CAP_BPF knowledge, then we only test
  CAP_SYS_ADMIN. Some probes may fail if the bpftool process has
  CAP_SYS_ADMIN but misses the other capabilities. The alternative would
  be to redefine the value for CAP_BPF in bpftool, but this does not
  look clean, and the case sounds relatively rare anyway.

Note that libcap offers a cap_to_name() function to retrieve the name of
a given capability (e.g. "cap_sys_admin"). We do not use it because
deriving the names from the macros looks simpler than using
cap_to_name() (doing a strdup() on the string) + cap_free() + handling
the case of failed allocations, when we just want to use the name of the
capability in an error message.

The checks when compiling without libcap (i.e. root versus non-root) are
unchanged.

v2:
- Do not allocate cap_list dynamically.
- Drop BPF-related capabilities when running with "unprivileged", even
  if we didn't have the full set in the first place (in v1, we would
  skip dropping them in that case).
- Keep track of what capabilities we have, print the names of the
  missing ones for privileged probing.
- Attempt to drop only the capabilities we actually have.
- Rename a couple variables.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200523010247.20654-1-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 14:38:19 -07:00
Quentin Monnet
90040351a8 tools, bpftool: Clean subcommand help messages
This is a clean-up for the formatting of the do_help functions for
bpftool's subcommands. The following fixes are included:

- Do not use argv[-2] for "iter" help message, as the help is shown by
  default if no "iter" action is selected, resulting in messages looking
  like "./bpftool bpftool pin...".

- Do not print unused HELP_SPEC_PROGRAM in help message for "bpftool
  link".

- Andrii used argument indexing to avoid having multiple occurrences of
  bin_name and argv[-2] in the fprintf() for the help message, for
  "bpftool gen" and "bpftool link". Let's reuse this for all other help
  functions. We can remove up to thirty arguments for the "bpftool map"
  help message.

- Harmonise all functions, e.g. use ending quotes-comma on a separate
  line.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200523010751.23465-1-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01 14:38:18 -07:00
Linus Torvalds
d861f6e682 Misc cleanups in the SMP hotplug and cross-call code.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl7VJfsRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ihcA/+Ko18kdGRPAlShM9qkDWO5N80p1LEp7F0
 ku1OxPAz9ii7K/jlnGr9wYYPxsIL3lbFeqFE7q5q5socXufaN8MUj9sVCmN7ScmR
 zO84aTHtxrJJhKIPM6HkUTbVl5KrQaud3F/J56CCjuKPsJWy9iuCGnKtfKK38bx+
 qJEfVKVm95Bv0NSEvqvci3DKKPYjzpKzuuttHXQ8Z80zG94FEkwj0JwZzttIjLl1
 rgRMgWTH7+3tQCMnZEfXG8xBxbXS9i3hKyr/v5QTNgIICyXGquPkf5MiwjJFS2Xb
 wpPqNh8HTo5kUJstYygRjcftatU7K72h2Rz/CoUkN2roNYlvRAhdBaBMwN0cGaG8
 pPhnLHHHRYZjl4fiROgRwVV3A6LcAHSrIcKzwGrvpCSpqyVozPGsmD/e8ZG1JYpC
 vxESTZbCDywng2Ls8jqQBut+dFGElvopXl1s004bCak89IFR4p15qojMJK2MSsqu
 BxhjIoqp8/f1fsAX+1p0RBEYnEr1KFtWa+nY8aVKL6bEx+Y7Qyq0ypMGtKavP06X
 VMcPMm1gYeXoGpLaTLYBRL5t7Rmm7i+xufuDQKUJetenfh2YS4aQ9lfV+rsQH1YE
 wavQrbwThfBZ9K1XkEmOkSqONysZ2YAtK9slKzciQIZvY3V8NbKAmBudCgqTgarp
 xqeW9NFfeFc=
 =Rr2n
 -----END PGP SIGNATURE-----

Merge tag 'smp-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull SMP updates from Ingo Molnar:
 "Misc cleanups in the SMP hotplug and cross-call code"

* tag 'smp-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Remove __freeze_secondary_cpus()
  cpu/hotplug: Remove disable_nonboot_cpus()
  cpu/hotplug: Fix a typo in comment "broadacasted"->"broadcasted"
  smp: Use smp_call_func_t in on_each_cpu()
2020-06-01 13:38:55 -07:00
Linus Torvalds
a7092c8204 Kernel side changes:
- Add AMD Fam17h RAPL support
   - Introduce CAP_PERFMON to kernel and user space
   - Add Zhaoxin CPU support
   - Misc fixes and cleanups
 
 Tooling changes:
 
   perf record:
 
     - Introduce --switch-output-event to use arbitrary events to be setup
       and read from a side band thread and, when they take place a signal
       be sent to the main 'perf record' thread, reusing the --switch-output
       code to take perf.data snapshots from the --overwrite ring buffer, e.g.:
 
 	# perf record --overwrite -e sched:* \
 		      --switch-output-event syscalls:*connect* \
 		      workload
 
       will take perf.data.YYYYMMDDHHMMSS snapshots up to around the
       connect syscalls.
 
     - Add --num-synthesize-threads option to control degree of parallelism of the
       synthesize_mmap() code which is scanning /proc/PID/task/PID/maps and can be
       time consuming. This mimics pre-existing behaviour in 'perf top'.
 
   perf bench:
 
     - Add a multi-threaded synthesize benchmark.
     - Add kallsyms parsing benchmark.
 
   Intel PT support:
 
     - Stitch LBR records from multiple samples to get deeper backtraces,
       there are caveats, see the csets for details.
     - Allow using Intel PT to synthesize callchains for regular events.
     - Add support for synthesizing branch stacks for regular events (cycles,
       instructions, etc) from Intel PT data.
 
   Misc changes:
 
     - Updated perf vendor events for power9 and Coresight.
     - Add flamegraph.py script via 'perf flamegraph'
     - Misc other changes, fixes and cleanups - see the Git log for details.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl7VJAcRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hAYw/8DFtzGkMaaWkrDSj62LXtWQiqr1l01ZFt
 9GzV4aN4/go+K4BQtsQN8cUjOkRHFnOryLuD9LfSBfqsdjuiyTynV/cJkeUGQBck
 TT/GgWf3XKJzTUBRQRk367Gbqs9UKwBP8CdFhOXcNzGEQpjhbwwIDPmem94U4L1N
 XLsysgC45ejWL1kMTZKmk6hDIidlFeDg9j70WDPX1nNfCeisk25rxwTpdgvjsjcj
 3RzPRt2EGS+IkuF4QSCT5leYSGaCpVDHCQrVpHj57UoADfWAyC71uopTLG4OgYSx
 PVd9gvloMeeqWmroirIxM67rMd/TBTfVekNolhnQDjqp60Huxm+gGUYmhsyjNqdx
 Pb8HRZCBAudei9Ue4jNMfhCRK2Ug1oL5wNvN1xcSteAqrwMlwBMGHWns6l12x0ks
 BxYhyLvfREvnKijXc1o8D5paRgqohJgfnHlrUZeacyaw5hQCbiVRpwg0T1mWAF53
 u9hfWLY0Oy+Qs2C7EInNsWSYXRw8oPQNTFVx2I968GZqsEn4DC6Pt3ovWrDKIDnz
 ugoZJQkJ3/O8stYSMiyENehdWlo575NkapCTDwhLWnYztrw4skqqHE8ighU/e8ug
 o/Kx7ANWN9OjjjQpq2GVUeT0jCaFO+OMiGMNEkKoniYgYjogt3Gw5PeedBMtY07p
 OcWTiQZamjU=
 =i27M
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
 "Kernel side changes:

   - Add AMD Fam17h RAPL support

   - Introduce CAP_PERFMON to kernel and user space

   - Add Zhaoxin CPU support

   - Misc fixes and cleanups

  Tooling changes:

   - perf record:

     Introduce '--switch-output-event' to use arbitrary events to be
     setup and read from a side band thread and, when they take place a
     signal be sent to the main 'perf record' thread, reusing the core
     for '--switch-output' to take perf.data snapshots from the ring
     buffer used for '--overwrite', e.g.:

	# perf record --overwrite -e sched:* \
		      --switch-output-event syscalls:*connect* \
		      workload

     will take perf.data.YYYYMMDDHHMMSS snapshots up to around the
     connect syscalls.

     Add '--num-synthesize-threads' option to control degree of
     parallelism of the synthesize_mmap() code which is scanning
     /proc/PID/task/PID/maps and can be time consuming. This mimics
     pre-existing behaviour in 'perf top'.

   - perf bench:

     Add a multi-threaded synthesize benchmark and kallsyms parsing
     benchmark.

   - Intel PT support:

     Stitch LBR records from multiple samples to get deeper backtraces,
     there are caveats, see the csets for details.

     Allow using Intel PT to synthesize callchains for regular events.

     Add support for synthesizing branch stacks for regular events
     (cycles, instructions, etc) from Intel PT data.

  Misc changes:

   - Updated perf vendor events for power9 and Coresight.

   - Add flamegraph.py script via 'perf flamegraph'

   - Misc other changes, fixes and cleanups - see the Git log for details

  Also, since over the last couple of years perf tooling has matured and
  decoupled from the kernel perf changes to a large degree, going
  forward Arnaldo is going to send perf tooling changes via direct pull
  requests"

* tag 'perf-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (163 commits)
  perf/x86/rapl: Add AMD Fam17h RAPL support
  perf/x86/rapl: Make perf_probe_msr() more robust and flexible
  perf/x86/rapl: Flip logic on default events visibility
  perf/x86/rapl: Refactor to share the RAPL code between Intel and AMD CPUs
  perf/x86/rapl: Move RAPL support to common x86 code
  perf/core: Replace zero-length array with flexible-array
  perf/x86: Replace zero-length array with flexible-array
  perf/x86/intel: Add more available bits for OFFCORE_RESPONSE of Intel Tremont
  perf/x86/rapl: Add Ice Lake RAPL support
  perf flamegraph: Use /bin/bash for report and record scripts
  perf cs-etm: Move definition of 'traceid_list' global variable from header file
  libsymbols kallsyms: Move hex2u64 out of header
  libsymbols kallsyms: Parse using io api
  perf bench: Add kallsyms parsing
  perf: cs-etm: Update to build with latest opencsd version.
  perf symbol: Fix kernel symbol address display
  perf inject: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()
  perf annotate: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()
  perf trace: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()
  perf script: Rename perf_evsel__*() operating on 'struct evsel *' to evsel__*()
  ...
2020-06-01 13:23:59 -07:00
Linus Torvalds
69fc06f70f There are a lot of objtool changes in this cycle, all across the map:
- Speed up objtool significantly, especially when there are large number of sections
  - Improve objtool's understanding of special instructions such as IRET,
    to reduce the number of annotations required
  - Implement 'noinstr' validation
  - Do baby steps for non-x86 objtool use
  - Simplify/fix retpoline decoding
  - Add vmlinux validation
  - Improve documentation
  - Fix various bugs and apply smaller cleanups
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl7VHvcRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gEfBAAhvPWljUmfQsetYq4q9BdbuC4xPSQN9ra
 e+2zu1MQaohkjAdFM1boNVhCCGKFUvlTEEw3GJR141Us6Y/ZRS8VIo70tmVSku6I
 OwuR5i8SgEKwurr1SwLxrI05rovYWRLSaDIRTHn2CViPEjgriyFGRV8QKam3AYmI
 dx47la3ELwuQR68nIdIMzDRt49oZVy+ZKW8Pjgjklzrd5KMYsPy7HPtraHUMeDg+
 GdoC7RresIt5AFiDiIJzKTT/jROI7KuHFnM6blluKHoKenWhYBFCz3sd6IvCdQWX
 JGy+KKY6H+YDMSpgc4FRP56M3GI0hX14oCd7L72epSLfOuzPr9Tmf6wfyQ8f50Je
 LGLD47tyltIcQR9H85YdR8UQspkjSW6xcql4ByCPTEqp0UzSGTsVntvsHzwsgz6A
 Csh3s+DVdv0rk5ZjMCu8STA2oErpehJm7fmugt2oLx+nsCNCBUI25lilw5JGeq5c
 +cO0IRxRxHPeRvMWvItTjbixVAHOHYlB00ilDbvsm+GnTJgu/5cMqpXdLvfXI2Rr
 nl360bSS3t3J4w5rX0mXw4x24vjQmVrA69jU+oo8RSHje2X8Y4Q7sFHNjmN0YAI3
 Re8aP6HSLQjioJxGz9aISlrxmPOXe0CMp8JE586SREVgmS/olXtidMgi7l12uZ2B
 cRdtNYcn31U=
 =dbCU
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:
 "There are a lot of objtool changes in this cycle, all across the map:

   - Speed up objtool significantly, especially when there are large
     number of sections

   - Improve objtool's understanding of special instructions such as
     IRET, to reduce the number of annotations required

   - Implement 'noinstr' validation

   - Do baby steps for non-x86 objtool use

   - Simplify/fix retpoline decoding

   - Add vmlinux validation

   - Improve documentation

   - Fix various bugs and apply smaller cleanups"

* tag 'objtool-core-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  objtool: Enable compilation of objtool for all architectures
  objtool: Move struct objtool_file into arch-independent header
  objtool: Exit successfully when requesting help
  objtool: Add check_kcov_mode() to the uaccess safelist
  samples/ftrace: Fix asm function ELF annotations
  objtool: optimize add_dead_ends for split sections
  objtool: use gelf_getsymshndx to handle >64k sections
  objtool: Allow no-op CFI ops in alternatives
  x86/retpoline: Fix retpoline unwind
  x86: Change {JMP,CALL}_NOSPEC argument
  x86: Simplify retpoline declaration
  x86/speculation: Change FILL_RETURN_BUFFER to work with objtool
  objtool: Add support for intra-function calls
  objtool: Move the IRET hack into the arch decoder
  objtool: Remove INSN_STACK
  objtool: Make handle_insn_ops() unconditional
  objtool: Rework allocating stack_ops on decode
  objtool: UNWIND_HINT_RET_OFFSET should not check registers
  objtool: is_fentry_call() crashes if call has no destination
  x86,smap: Fix smap_{save,restore}() alternatives
  ...
2020-06-01 13:13:00 -07:00
Linus Torvalds
2227e5b21a The RCU updates for this cycle were:
- RCU-tasks update, including addition of RCU Tasks Trace for
    BPF use and TASKS_RUDE_RCU
  - kfree_rcu() updates.
  - Remove scheduler locking restriction
  - RCU CPU stall warning updates.
  - Torture-test updates.
  - Miscellaneous fixes and other updates.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl7U/r0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hSNxAAirKhPGBoLI9DW1qde4OFhZg+BlIpS+LD
 IE/0eGB8hGwhb1793RGbzIJfSnRQpSOPxWbWc6DJZ4Zpi5/ZbVkiPKsuXpM1xGxs
 kuBCTOhWy1/p3iCZ1JH/JCrCAdWGZkIzEoaV7ipnHtV/+UrRbCWH5PB7R0fYvcbI
 q5bUcWJyEp/bYMxQn8DhAih6SLPHx+F9qaGAqqloLSHstTYG2HkBhBGKnqcd/Jex
 twkLK53poCkeP/c08V1dyagU2IRWj2jGB1NjYh/Ocm+Sn/vru15CVGspjVjqO5FF
 oq07lad357ddMsZmKoM2F5DhXbOh95A+EqF9VDvIzCvfGMUgqYI1oxWF4eycsGhg
 /aYJgYuN23YeEe2DkDzJB67GvBOwl4WgdoFaxKRzOiCSfrhkM8KqM4G9Fz1JIepG
 abRJCF85iGcLslU9DkrShQiDsd/CRPzu/jz6ybK0I2II2pICo6QRf76T7TdOvKnK
 yXwC6OdL7/dwOht20uT6XfnDXMCWI4MutiUrb8/C1DbaihwEaI2denr3YYL+IwrB
 B38CdP6sfKZ5UFxKh0xb+sOzWrw0KA+ThSAXeJhz3tKdxdyB6nkaw3J9lFg8oi20
 XGeAujjtjMZG5cxt2H+wO9kZY0RRau/nTqNtmmRrCobd5yJjHHPHH8trEd0twZ9A
 X5Wjh11lv3E=
 =Yisx
 -----END PGP SIGNATURE-----

Merge tag 'core-rcu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:
 "The RCU updates for this cycle were:

   - RCU-tasks update, including addition of RCU Tasks Trace for BPF use
     and TASKS_RUDE_RCU

   - kfree_rcu() updates.

   - Remove scheduler locking restriction

   - RCU CPU stall warning updates.

   - Torture-test updates.

   - Miscellaneous fixes and other updates"

* tag 'core-rcu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits)
  rcu: Allow for smp_call_function() running callbacks from idle
  rcu: Provide rcu_irq_exit_check_preempt()
  rcu: Abstract out rcu_irq_enter_check_tick() from rcu_nmi_enter()
  rcu: Provide __rcu_is_watching()
  rcu: Provide rcu_irq_exit_preempt()
  rcu: Make RCU IRQ enter/exit functions rely on in_nmi()
  rcu/tree: Mark the idle relevant functions noinstr
  x86: Replace ist_enter() with nmi_enter()
  x86/mce: Send #MC singal from task work
  x86/entry: Get rid of ist_begin/end_non_atomic()
  sched,rcu,tracing: Avoid tracing before in_nmi() is correct
  sh/ftrace: Move arch_ftrace_nmi_{enter,exit} into nmi exception
  lockdep: Always inline lockdep_{off,on}()
  hardirq/nmi: Allow nested nmi_enter()
  arm64: Prepare arch_nmi_enter() for recursion
  printk: Disallow instrumenting print_nmi_enter()
  printk: Prepare for nested printk_nmi_enter()
  rcutorture: Convert ULONG_CMP_LT() to time_before()
  torture: Add a --kasan argument
  torture: Save a few lines by using config_override_param initially
  ...
2020-06-01 12:56:29 -07:00
Linus Torvalds
829f3b9401 Fixes and new features for pstore
- refactor pstore locking for safer module unloading (Kees Cook)
 - remove orphaned records from pstorefs when backend unloaded (Kees Cook)
 - refactor dump_oops parameter into max_reason (Pavel Tatashin)
 - introduce pstore/zone for common code for contiguous storage (WeiXiong Liao)
 - introduce pstore/blk for block device backend (WeiXiong Liao)
 - introduce mtd backend (WeiXiong Liao)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl7UbYYWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJpkgD/9/09OkJIWydwk2lr2T89HW5fSF
 5uBT0a309/QDUpnV9yhcRsrESEicnvbtaGxD0kuYIInkiW/2cj1l689EkyRjUmy9
 q3z4GzLqOlC7qvd7LUPFNGHmllBb09H/CxmXDxRP3aynB9oHzdpNQdPcpLBDA00r
 0byp/AE48dFbKIhtT0QxpGUYZFOlyc7XVAaOkED4bmu148gx8q7MU1AxFgbx0Feb
 9iPV0r6XYMgXJZ3sn/3PJsxF0V/giDSJ8ui2xsYRjCE408zVIYLdDs2e8dz+2yW6
 +3Lyankgo+ofZc4XYExTYgn3WjhPFi+pjVRUaj+BcyTk9SLNIj2WmZdmcLMuzanh
 BaUurmED7ffTtlsH4PhQgn8/OY4FX2PO2MwUHwlU+87Y8YDiW0lpzTq5H822OO8p
 QQ8awql/6lLCJuyzuWIciVUsS65MCPxsZ4+LSiMZzyYpWu1sxrEY8ic3agzCgsA0
 0i+4nZFlLG+Aap/oiKpegenkIyAunn2tDXAyFJFH6qLOiZJ78iRuws3XZqjCElhJ
 XqvyDJIfjkJhWUb++ckeqX7ThOR4CPSnwba/7GHv7NrQWuk3Cn+GQ80oxydXUY6b
 2/4eYjq0wtvf9NeuJ4/LYNXotLR/bq9zS0zqwTWG50v+RPmuC3bNJB+RmF7fCiCG
 jo1Sd1LMeTQ7bnULpA==
 =7s1u
 -----END PGP SIGNATURE-----

Merge tag 'pstore-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore updates from Kees Cook:
 "Fixes and new features for pstore.

  This is a pretty big set of changes (relative to past pstore pulls),
  but it has been in -next for a while. The biggest change here is the
  ability to support a block device as a pstore backend, which has been
  desired for a while. A lot of additional fixes and refactorings are
  also included, mostly in support of the new features.

   - refactor pstore locking for safer module unloading (Kees Cook)

   - remove orphaned records from pstorefs when backend unloaded (Kees
     Cook)

   - refactor dump_oops parameter into max_reason (Pavel Tatashin)

   - introduce pstore/zone for common code for contiguous storage
     (WeiXiong Liao)

   - introduce pstore/blk for block device backend (WeiXiong Liao)

   - introduce mtd backend (WeiXiong Liao)"

* tag 'pstore-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (35 commits)
  mtd: Support kmsg dumper based on pstore/blk
  pstore/blk: Introduce "best_effort" mode
  pstore/blk: Support non-block storage devices
  pstore/blk: Provide way to query pstore configuration
  pstore/zone: Provide way to skip "broken" zone for MTD devices
  Documentation: Add details for pstore/blk
  pstore/zone,blk: Add ftrace frontend support
  pstore/zone,blk: Add console frontend support
  pstore/zone,blk: Add support for pmsg frontend
  pstore/blk: Introduce backend for block devices
  pstore/zone: Introduce common layer to manage storage zones
  ramoops: Add "max-reason" optional field to ramoops DT node
  pstore/ram: Introduce max_reason and convert dump_oops
  pstore/platform: Pass max_reason to kmesg dump
  printk: Introduce kmsg_dump_reason_str()
  printk: honor the max_reason field in kmsg_dumper
  printk: Collapse shutdown types into a single dump reason
  pstore/ftrace: Provide ftrace log merging routine
  pstore/ram: Refactor ftrace buffer merging
  pstore/ram: Refactor DT size parsing
  ...
2020-06-01 12:07:34 -07:00
Ido Schimmel
9959b38977 selftests: mlxsw: Add test for control packets
Generate packets matching the various control traps and check that the
traps' stats increase accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01 11:49:23 -07:00
Linus Torvalds
a36de5ebac spi: Updates for v5.8
This has been a very active release for the DesignWare driver in
 particular - after a long period of inactivity we have had a lot of
 people actively working on it for unrelated reasons this cycle with some
 of that work still not landed.  Otherwise it's been fairly quiet for the
 subsystem.  Highlights include:
 
  - Lots of performance improvements and fixes for the DesignWare driver
    from Serge Semin, Andy Shevchenko, Wan Ahmad Zainie, Clement Leger,
    Dinh Nguyen and Jarkko Nikula.
  - Support for octal mode transfers in spidev.
  - Slave mode support for the Rockchip drivers.
  - Support for AMD controllers, Broadcom mspi and Raspberry Pi 4,
    and Intel Elkhart Lake.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl7U6r0THGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0LasB/9npXOMe6tOT03YqtIhN4pxrdUo+LsN
 A5Rc8prfQo7srnIZMndt5/wcTftomVdvjSNrtyXMXtzj+Logx01Pndrr6UVUP6Qq
 Sy0R+4QXBSlj5QtUOBvGFTlzKw2BEaOBYftxVKQM6s4eoefvl0BFALHpEeaHvsDO
 YXfwU8EK6sZylDzvsuVy2uoJlTcY4+wKop7JWY5Ze+LTUjsuJQVEG9zbxpZNEpOn
 ZHO3FVS2MlIAuhcVmy0TfvYxTldTrT89zv8x4sKaPaXwDJFzYjJBwz77vYAjD8u5
 i52JhrAMkZyU4SZdnciJLJx9oTdT8+Rj32oQBU6uK8nRN7U3zflNHHQw
 =qm1J
 -----END PGP SIGNATURE-----

Merge tag 'spi-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi updates from Mark Brown:
 "This has been a very active release for the DesignWare driver in
  particular - after a long period of inactivity we have had a lot of
  people actively working on it for unrelated reasons this cycle with
  some of that work still not landed.

  Otherwise it's been fairly quiet for the subsystem.

  Highlights include:

   - Lots of performance improvements and fixes for the DesignWare
     driver from Serge Semin, Andy Shevchenko, Wan Ahmad Zainie, Clement
     Leger, Dinh Nguyen and Jarkko Nikula.

   - Support for octal mode transfers in spidev.

   - Slave mode support for the Rockchip drivers.

   - Support for AMD controllers, Broadcom mspi and Raspberry Pi 4, and
     Intel Elkhart Lake"

* tag 'spi-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (125 commits)
  spi: spi-fsl-dspi: fix native data copy
  spi: Convert DW SPI binding to DT schema
  spi: dw: Refactor mid_spi_dma_setup() to separate DMA and IRQ config
  spi: dw: Make DMA request line assignments explicit for Intel Medfield
  spi: bcm2835: Remove shared interrupt support
  dt-bindings: snps,dw-apb-ssi: add optional reset property
  spi: dw: add reset control
  spi: bcm2835: Enable shared interrupt support
  spi: bcm2835: Implement shutdown callback
  spi: dw: Use regset32 DebugFS method to create regdump file
  spi: dw: Add DMA support to the DW SPI MMIO driver
  spi: dw: Cleanup generic DW DMA code namings
  spi: dw: Add DW SPI DMA/PCI/MMIO dependency on the DW SPI core
  spi: dw: Remove DW DMA code dependency from DW_DMAC_PCI
  spi: dw: Move Non-DMA code to the DW PCIe-SPI driver
  spi: dw: Add core suffix to the DW APB SSI core source file
  spi: dw: Fix Rx-only DMA transfers
  spi: dw: Use DMA max burst to set the request thresholds
  spi: dw: Parameterize the DMA Rx/Tx burst length
  spi: dw: Add SPI Rx-done wait method to DMA-based transfer
  ...
2020-06-01 11:42:38 -07:00
Vitaly Kuznetsov
13ffbd8db1 KVM: selftests: fix rdtsc() for vmx_tsc_adjust_test
vmx_tsc_adjust_test fails with:

IA32_TSC_ADJUST is -4294969448 (-1 * TSC_ADJUST_VALUE + -2152).
IA32_TSC_ADJUST is -4294969448 (-1 * TSC_ADJUST_VALUE + -2152).
IA32_TSC_ADJUST is 281470681738540 (65534 * TSC_ADJUST_VALUE + 4294962476).
==== Test Assertion Failure ====
  x86_64/vmx_tsc_adjust_test.c:153: false
  pid=19738 tid=19738 - Interrupted system call
     1	0x0000000000401192: main at vmx_tsc_adjust_test.c:153
     2	0x00007fe1ef8583d4: ?? ??:0
     3	0x0000000000401201: _start at ??:?
  Failed guest assert: (adjust <= max)

The problem is that is 'tsc_val' should be u64, not u32 or the reading
gets truncated.

Fixes: 8d7fbf01f9 ("KVM: selftests: VMX preemption timer migration test")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200601154726.261868-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-01 11:58:23 -04:00
Vitaly Kuznetsov
fb0cb6a821 KVM: selftests: update hyperv_cpuid with SynDBG tests
Update tests to reflect new CPUID capabilities with SYNDBG.
Check that we get the right number of entries and that
0x40000000.EAX always returns the correct max leaf.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Jon Doron <arilou@gmail.com>
Message-Id: <20200529134543.1127440-7-arilou@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-01 04:26:12 -04:00
Makarand Sonare
8d7fbf01f9 KVM: selftests: VMX preemption timer migration test
When a nested VM with a VMX-preemption timer is migrated, verify that the
nested VM and its parent VM observe the VMX-preemption timer exit close to
the original expiration deadline.

Signed-off-by: Makarand Sonare <makarandsonare@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20200526215107.205814-3-makarandsonare@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-01 04:26:10 -04:00
Vitaly Kuznetsov
8ec107c89b selftests: kvm: fix smm test on SVM
KVM_CAP_NESTED_STATE is now supported for AMD too but smm test acts like
it is still Intel only.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200529130407.57176-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-01 04:26:04 -04:00
Paolo Bonzini
10b910cb7e selftests: kvm: add a SVM version of state-test
The test is similar to the existing one for VMX, but simpler because we
don't have to test shadow VMCS or vmptrld/vmptrst/vmclear.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-01 04:26:04 -04:00
Vitaly Kuznetsov
ed88129733 selftests: kvm: introduce cpu_has_svm() check
Many tests will want to check if the CPU is Intel or AMD in
guest code, add cpu_has_svm() and put it as static
inline to svm_util.h.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200529130407.57176-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-01 04:26:04 -04:00
David S. Miller
1806c13dc2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
xdp_umem.c had overlapping changes between the 64-bit math fix
for the calculation of npgs and the removal of the zerocopy
memory type which got rid of the chunk_size_nohdr member.

The mlx5 Kconfig conflict is a case where we just take the
net-next copy of the Kconfig entry dependency as it takes on
the ESWITCH dependency by one level of indirection which is
what the 'net' conflicting change is trying to ensure.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-31 17:48:46 -07:00
Linus Torvalds
8fc984aedc A pile of x86 fixes:
- Prevent a memory leak in ioperm which was caused by the stupid
     assumption that the exit cleanup is always called for current, which is
     not the case when fork fails after taking a reference on the ioperm
     bitmap.
 
   - Fix an arithmething overflow in the DMA code on 32bit systems
 
   - Fill gaps in the xstate copy with defaults instead of leaving them
     uninitialized
 
   - Revert: o"Make __X32_SYSCALL_BIT be unsigned long" as it turned out
     that existing user space fails to build.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl7Tt8YTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYobtTEACukhGsuivgiTwltWuHcATqrcNbgHSu
 nnhuQrjJ8KJiF5O60nDztPAVzxD+Ww2tzuDnD1BLFDI9cEA5oPhzXf7kUuJvrYUK
 INY+OALPPpw2iWjmygIsEyw3Pzmnm6peRA4h5UZSZdFxdROGGwBeGYNxowuVWFiH
 X7Fa1J4QxTI7e2X3psDVz94bOnVTPRPAR2bNpX8K8Qs+Wn1FFO92LFU04EvJTCHe
 JdN73VAS+0o0qPlPMewiuyfxaHexc8eJySMdOiysPnGRy+vagyyMPOV2Kg0DD6bp
 caDxCXNjIxXlRExV6F75s8hnl42DwXzLSzY/G7L/HVJ5r3voqcREYtXHgfenl7Jg
 8o6tEi+qFduPJ6SuRjfjPBDBF4wJvcjgmCwJaPJbMkrg8p5jH9Xg35egmEMo9cF8
 JQa2RzWJTR9XUjuPAuHJZR6f9jnle01PCznmw7Mavoed82udW1Lo32+QnvWsx6Qq
 4uuV38FqK3lsVCfFjyZir9OB9DGeuT/NETs3WJuGW5QUnC1mqfvIYipL3BkxNMKP
 IBB7n5X2iCJ545JkydepXF2I+b/i8XhNcIwYMVoSbZzBKccwCZ7zxHFNj6YAWG+M
 TN77x/+lw5zbnxhL3YzK+fgPNLio/By4Zcpmq6uppaf9Ip67SJGVq22Ef3S0w8vG
 X1inh1zqLX9hsQ==
 =DmSb
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A pile of x86 fixes:

   - Prevent a memory leak in ioperm which was caused by the stupid
     assumption that the exit cleanup is always called for current,
     which is not the case when fork fails after taking a reference on
     the ioperm bitmap.

   - Fix an arithmething overflow in the DMA code on 32bit systems

   - Fill gaps in the xstate copy with defaults instead of leaving them
     uninitialized

   - Revert: "Make __X32_SYSCALL_BIT be unsigned long" as it turned out
     that existing user space fails to build"

* tag 'x86-urgent-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ioperm: Prevent a memory leak when fork fails
  x86/dma: Fix max PFN arithmetic overflow on 32 bit systems
  copy_xstate_to_kernel(): don't leave parts of destination uninitialized
  x86/syscalls: Revert "x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long"
2020-05-31 10:45:11 -07:00
Petr Machata
3ed97037f0 selftests: forwarding: pedit_dsfield: Check counter value
A missing stats_update callback was recently added to act_pedit. Now that
iproute2 supports JSON dumping for pedit, extend the pedit_dsfield selftest
with a check that would have caught the fact that the callback was missing.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:48:24 -07:00
Petr Machata
1c0522b4a2 selftests: forwarding: mirror_lib: Use mausezahn
Using ping in tests is error-prone, because ping is too smart. On a
flaky system (notably in a simulator), when packets don't come quickly
enough, more pings are sent, and that throws off counters. Instead use
mausezahn to generate ICMP echo request packets. That allows us to
send them in quicker succession as well, because the reason the ping
was made slow in the first place was to make the tests work on
simulated systems.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:48:24 -07:00
Kees Cook
d195c39052 pstore/platform: Use backend name for console registration
If the pstore backend changes, there's no indication in the logs what
the console is (it always says "pstore"). Instead, pass through the
active backend's name. (Also adjust the selftest to match.)

Link: https://lore.kernel.org/lkml/20200510202436.63222-5-keescook@chromium.org/
Link: https://lore.kernel.org/lkml/20200526135429.GQ12456@shao2-debian
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30 10:34:02 -07:00
Mark Brown
fb02b9eb4e
Merge remote-tracking branch 'spi/for-5.8' into spi-next 2020-05-30 00:03:53 +01:00
David S. Miller
f9e0ce3ddc Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2020-05-29

The following pull-request contains BPF updates for your *net* tree.

We've added 6 non-merge commits during the last 7 day(s) which contain
a total of 4 files changed, 55 insertions(+), 34 deletions(-).

The main changes are:

1) minor verifier fix for fmod_ret progs, from Alexei.

2) af_xdp overflow check, from Bjorn.

3) minor verifier fix for 32bit assignment, from John.

4) powerpc has non-overlapping addr space, from Petr.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 15:59:08 -07:00
John Fastabend
cf66c29bd7 bpf, selftests: Add a verifier test for assigning 32bit reg states to 64bit ones
Added a verifier test for assigning 32bit reg states to
64bit where 32bit reg holds a constant value of 0.

Without previous kernel verifier.c fix, the test in
this patch will fail.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/159077335867.6014.2075350327073125374.stgit@john-Precision-5820-Tower
2020-05-29 13:34:06 -07:00
John Fastabend
e3effcdfe0 bpf, selftests: Verifier bounds tests need to be updated
After previous fix for zero extension test_verifier tests #65 and #66 now
fail. Before the fix we can see the alu32 mov op at insn 10

10: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=invP(id=0,
              smin_value=4294967168,smax_value=4294967423,
              umin_value=4294967168,umax_value=4294967423,
              var_off=(0x0; 0x1ffffffff),
              s32_min_value=-2147483648,s32_max_value=2147483647,
              u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm
10: (bc) w1 = w1
11: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=invP(id=0,
              smin_value=0,smax_value=2147483647,
              umin_value=0,umax_value=4294967295,
              var_off=(0x0; 0xffffffff),
              s32_min_value=-2147483648,s32_max_value=2147483647,
              u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm

After the fix at insn 10 because we have 's32_min_value < 0' the following
step 11 now has 'smax_value=U32_MAX' where before we pulled the s32_max_value
bound into the smax_value as seen above in 11 with smax_value=2147483647.

10: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=inv(id=0,
             smin_value=4294967168,smax_value=4294967423,
             umin_value=4294967168,umax_value=4294967423,
             var_off=(0x0; 0x1ffffffff),
             s32_min_value=-2147483648, s32_max_value=2147483647,
             u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm
10: (bc) w1 = w1
11: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=inv(id=0,
             smin_value=0,smax_value=4294967295,
             umin_value=0,umax_value=4294967295,
             var_off=(0x0; 0xffffffff),
             s32_min_value=-2147483648, s32_max_value=2147483647,
             u32_min_value=0, u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm

The fall out of this is by the time we get to the failing instruction at
step 14 where previously we had the following:

14: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=inv(id=0,
             smin_value=72057594021150720,smax_value=72057594029539328,
             umin_value=72057594021150720,umax_value=72057594029539328,
             var_off=(0xffffffff000000; 0xffffff),
             s32_min_value=-16777216,s32_max_value=-1,
             u32_min_value=-16777216,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm
14: (0f) r0 += r1

We now have,

14: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=inv(id=0,
             smin_value=0,smax_value=72057594037927935,
             umin_value=0,umax_value=72057594037927935,
             var_off=(0x0; 0xffffffffffffff),
             s32_min_value=-2147483648,s32_max_value=2147483647,
             u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm
14: (0f) r0 += r1

In the original step 14 'smin_value=72057594021150720' this trips the logic
in the verifier function check_reg_sane_offset(),

 if (smin >= BPF_MAX_VAR_OFF || smin <= -BPF_MAX_VAR_OFF) {
	verbose(env, "value %lld makes %s pointer be out of bounds\n",
		smin, reg_type_str[type]);
	return false;
 }

Specifically, the 'smin <= -BPF_MAX_VAR_OFF' check. But with the fix
at step 14 we have bounds 'smin_value=0' so the above check is not tripped
because BPF_MAX_VAR_OFF=1<<29.

We have a smin_value=0 here because at step 10 the smaller smin_value=0 means
the subtractions at steps 11 and 12 bring the smin_value negative.

11: (17) r1 -= 2147483584
12: (17) r1 -= 2147483584
13: (77) r1 >>= 8

Then the shift clears the top bit and smin_value is set to 0. Note we still
have the smax_value in the fixed code so any reads will fail. An alternative
would be to have reg_sane_check() do both smin and smax value tests.

To fix the test we can omit the 'r1 >>=8' at line 13. This will change the
err string, but keeps the intention of the test as suggseted by the title,
"check after truncation of boundary-crossing range". If the verifier logic
changes a different value is likely to be thrown in the error or the error
will no longer be thrown forcing this test to be examined. With this change
we see the new state at step 13.

13: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=invP(id=0,
              smin_value=-4294967168,smax_value=127,
              umin_value=0,umax_value=18446744073709551615,
              s32_min_value=-2147483648,s32_max_value=2147483647,
              u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm

Giving the expected out of bounds error, "value -4294967168 makes map_value
pointer be out of bounds" However, for unpriv case we see a different error
now because of the mixed signed bounds pointer arithmatic. This seems OK so
I've only added the unpriv_errstr for this. Another optino may have been to
do addition on r1 instead of subtraction but I favor the approach above
slightly.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/159077333942.6014.14004320043595756079.stgit@john-Precision-5820-Tower
2020-05-29 13:34:06 -07:00
David Ahern
7c741868ce selftests: Add torture tests to nexthop tests
Add Nik's torture tests as a new set to stress the replace and cleanup
paths.

Torture test created by Nikolay Aleksandrov and then I adapted to
selftest and added IPv6 version.

Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 11:00:31 -07:00
Ingo Molnar
0bffedbce9 Linux 5.7-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl7K9iEeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGzTAH/0ifZEG4BQ8x/WlB
 8YLSLE6QQTSXYi25nyExuJbFkkKY5Tik8M2HD/36xwY/HnZOlH9jH6m0ntqZxpaA
 3EU9lr1ct79nCBMYhiJssvz8d9AOZXlyogFW9y2y9pmPjlmUtseZ7yGh1xD465cj
 B5Ty2w2W34cs7zF3og2xn5agOJMtWWXLXZ5mRa9EOquKC5zeYyRicmd0T+plYQD6
 hbRYmxFfDfppVnBCBARPNN0+NU5JJD94H+8bOuf1tl48XNrLiZMOicmtohKNQ6+W
 rZNpJNEGEp7KMtqWH0Nl3hmy3yfZHMwe1DXM/AZDqR7jTHZY4mZ0GEpLyfI9AU4n
 34jVHwU=
 =SmJ9
 -----END PGP SIGNATURE-----

Merge tag 'v5.7-rc7' into perf/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-05-28 07:58:12 +02:00
Stephen Worley
5a1b72cebc net: add large ecmp group nexthop tests
Add a couple large ecmp group nexthop selftests to cover
the remnant fixed by d69100b8ee.

The tests create 100 x32 ecmp groups of ipv4 and ipv6 and then
dump them. On kernels without the fix, they will fail due
to data remnant during the dump.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 11:38:43 -07:00
Davide Caratti
bb2f930d6d net/sched: fix infinite loop in sch_fq_pie
this command hangs forever:

 # tc qdisc add dev eth0 root fq_pie flows 65536

 watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [tc:1028]
 [...]
 CPU: 1 PID: 1028 Comm: tc Not tainted 5.7.0-rc6+ #167
 RIP: 0010:fq_pie_init+0x60e/0x8b7 [sch_fq_pie]
 Code: 4c 89 65 50 48 89 f8 48 c1 e8 03 42 80 3c 30 00 0f 85 2a 02 00 00 48 8d 7d 10 4c 89 65 58 48 89 f8 48 c1 e8 03 42 80 3c 30 00 <0f> 85 a7 01 00 00 48 8d 7d 18 48 c7 45 10 46 c3 23 00 48 89 f8 48
 RSP: 0018:ffff888138d67468 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
 RAX: 1ffff9200018d2b2 RBX: ffff888139c1c400 RCX: ffffffffffffffff
 RDX: 000000000000c5e8 RSI: ffffc900000e5000 RDI: ffffc90000c69590
 RBP: ffffc90000c69580 R08: fffffbfff79a9699 R09: fffffbfff79a9699
 R10: 0000000000000700 R11: fffffbfff79a9698 R12: ffffc90000c695d0
 R13: 0000000000000000 R14: dffffc0000000000 R15: 000000002347c5e8
 FS:  00007f01e1850e40(0000) GS:ffff88814c880000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000000000067c340 CR3: 000000013864c000 CR4: 0000000000340ee0
 Call Trace:
  qdisc_create+0x3fd/0xeb0
  tc_modify_qdisc+0x3be/0x14a0
  rtnetlink_rcv_msg+0x5f3/0x920
  netlink_rcv_skb+0x121/0x350
  netlink_unicast+0x439/0x630
  netlink_sendmsg+0x714/0xbf0
  sock_sendmsg+0xe2/0x110
  ____sys_sendmsg+0x5b4/0x890
  ___sys_sendmsg+0xe9/0x160
  __sys_sendmsg+0xd3/0x170
  do_syscall_64+0x9a/0x370
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

we can't accept 65536 as a valid number for 'nflows', because the loop on
'idx' in fq_pie_init() will never end. The extack message is correct, but
it doesn't say that 0 is not a valid number for 'flows': while at it, fix
this also. Add a tdc selftest to check correct validation of 'flows'.

CC: Ivan Vecera <ivecera@redhat.com>
Fixes: ec97ecf1eb ("net: sched: add Flow Queue PIE packet scheduler")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 11:11:18 -07:00
Ido Schimmel
025b7de7f4 mlxsw: spectrum: Reduce priority of locally delivered packets
To align with recent recommended values. Will be configurable by future
patches.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Andy Lutomirski
700d3a5a66 x86/syscalls: Revert "x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long"
Revert

  45e29d119e ("x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long")

and add a comment to discourage someone else from making the same
mistake again.

It turns out that some user code fails to compile if __X32_SYSCALL_BIT
is unsigned long. See, for example [1] below.

 [ bp: Massage and do the same thing in the respective tools/ header. ]

Fixes: 45e29d119e ("x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long")
Reported-by: Thorsten Glaser <t.glaser@tarent.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@kernel.org
Link: [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=954294
Link: https://lkml.kernel.org/r/92e55442b744a5951fdc9cfee10badd0a5f7f828.1588983892.git.luto@kernel.org
2020-05-26 16:42:43 +02:00
Rafael J. Wysocki
5ed909b368 linux-cpupower-5.8-rc1
This cpupower for update for Linux 5.8-rc1 consists of a single
 patch to fix coccicheck unneeded semicolon warning.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl7IWkMACgkQCwJExA0N
 QxzstRAAiRJ7Nhy3uPxJSI6s5+bsXZfJQ2BK7ox3QbWsuq8yAe/Kr9RCHR/8MQz+
 ZmXzu0ptxbLubrU2ikUj58sng+FtW2Mx4MwQRw7to35RzceVR6A5X4f3MYWEoIFY
 JLTrG7dFgqITvuEFYNF6Bp/EuEGYMF4sBSP1MKz/eSaPJnYFk2whwoUHl1ztpTMV
 H/RdFlbHCGErw1cEkZ0acT9NZcX85eWDUUnZ3P4zNBxso8CV7/YleJiRfsI5lQPX
 884+23bgAWgM+DcpfAVuSXxtvY24F2VjUuyMQN/jlScpO/EjIuLSyqMMfgvdZVEB
 dJxUfPAjbabSsAj5YQuKgSqtg0r4TbZK6+WR3RUf3Y89Q4gGqkrV7ZHnY83IutGY
 bzZEjgoDeSOvA2pGqmOxZ+besKOp/j8gIXLpt09VpEtllzILS8lsFhw4jAx854Cc
 KcGfOY670lFAH/Wi0FGCv7MeWKFPwegvlLAFH8BJvow4lTPDKgV3SyTpdFvhDF/J
 51Kt1DF0JusVHwVTUfRBxFDaWXqFbW7OEk46+b1FAreQLD5Ubm2D9G/xSw53cXmA
 QiUHEIgF/Mfwp8D84bT6c8kvmehREJQSjTQ72pZr1FsC/J7FX9eTn9RozbgDJ5F4
 mSIB205FGeK9Q7jxDCAeBYlRMON5DllY1iAEhzmJ1dbVaVwCZaM=
 =cYtw
 -----END PGP SIGNATURE-----

Merge tag 'linux-cpupower-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux

Pull cpupower utility updates for v5.8-rc1 from Shuah Khan:

"This cpupower update for Linux 5.8-rc1 consists of a single
 patch to fix coccicheck unneeded semicolon warning."

* tag 'linux-cpupower-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
  cpupower: Remove unneeded semicolon
2020-05-25 10:42:31 +02:00
David S. Miller
13209a8f73 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The MSCC bug fix in 'net' had to be slightly adjusted because the
register accesses are done slightly differently in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 13:47:27 -07:00
Linus Torvalds
caffb99b69 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix RCU warnings in ipv6 multicast router code, from Madhuparna
    Bhowmik.

 2) Nexthop attributes aren't being checked properly because of
    mis-initialized iterator, from David Ahern.

 3) Revert iop_idents_reserve() change as it caused performance
    regressions and was just working around what is really a UBSAN bug
    in the compiler. From Yuqi Jin.

 4) Read MAC address properly from ROM in bmac driver (double iteration
    proceeds past end of address array), from Jeremy Kerr.

 5) Add Microsoft Surface device IDs to r8152, from Marc Payne.

 6) Prevent reference to freed SKB in __netif_receive_skb_core(), from
    Boris Sukholitko.

 7) Fix ACK discard behavior in rxrpc, from David Howells.

 8) Preserve flow hash across packet scrubbing in wireguard, from Jason
    A. Donenfeld.

 9) Cap option length properly for SO_BINDTODEVICE in AX25, from Eric
    Dumazet.

10) Fix encryption error checking in kTLS code, from Vadim Fedorenko.

11) Missing BPF prog ref release in flow dissector, from Jakub Sitnicki.

12) dst_cache must be used with BH disabled in tipc, from Eric Dumazet.

13) Fix use after free in mlxsw driver, from Jiri Pirko.

14) Order kTLS key destruction properly in mlx5 driver, from Tariq
    Toukan.

15) Check devm_platform_ioremap_resource() return value properly in
    several drivers, from Tiezhu Yang.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits)
  net: smsc911x: Fix runtime PM imbalance on error
  net/mlx4_core: fix a memory leak bug.
  net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during suspend
  net: phy: mscc: fix initialization of the MACsec protocol mode
  net: stmmac: don't attach interface until resume finishes
  net: Fix return value about devm_platform_ioremap_resource()
  net/mlx5: Fix error flow in case of function_setup failure
  net/mlx5e: CT: Correctly get flow rule
  net/mlx5e: Update netdev txq on completions during closure
  net/mlx5: Annotate mutex destroy for root ns
  net/mlx5: Don't maintain a case of del_sw_func being null
  net/mlx5: Fix cleaning unmanaged flow tables
  net/mlx5: Fix memory leak in mlx5_events_init
  net/mlx5e: Fix inner tirs handling
  net/mlx5e: kTLS, Destroy key object after destroying the TIS
  net/mlx5e: Fix allowed tc redirect merged eswitch offload cases
  net/mlx5: Avoid processing commands before cmdif is ready
  net/mlx5: Fix a race when moving command interface to events mode
  net/mlx5: Add command entry handling completion
  rxrpc: Fix a memory leak in rxkad_verify_response()
  ...
2020-05-23 17:16:18 -07:00
John Hubbard
380e5c1d9b selftests/vm/write_to_hugetlbfs.c: fix unused variable warning
Remove unused variable "i", which was triggering a compiler warning.

Fixes: 29750f71a9 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests")
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-By: Mina Almasry <almasrymina@google.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Link: http://lkml.kernel.org/r/20200517001245.361762-2-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23 10:26:31 -07:00
John Hubbard
98097701cc selftests/vm/.gitignore: add mremap_dontunmap
Add mremap_dontunmap to .gitignore.

Fixes: 0c28759ee3 ("selftests: add MREMAP_DONTUNMAP selftest")
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Brian Geffon <bgeffon@google.com>
Link: http://lkml.kernel.org/r/20200517002509.362401-2-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-23 10:26:31 -07:00
David S. Miller
a152b85984 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-05-23

The following pull-request contains BPF updates for your *net-next* tree.

We've added 50 non-merge commits during the last 8 day(s) which contain
a total of 109 files changed, 2776 insertions(+), 2887 deletions(-).

The main changes are:

1) Add a new AF_XDP buffer allocation API to the core in order to help
   lowering the bar for drivers adopting AF_XDP support. i40e, ice, ixgbe
   as well as mlx5 have been moved over to the new API and also gained a
   small improvement in performance, from Björn Töpel and Magnus Karlsson.

2) Add getpeername()/getsockname() attach types for BPF sock_addr programs
   in order to allow for e.g. reverse translation of load-balancer backend
   to service address/port tuple from a connected peer, from Daniel Borkmann.

3) Improve the BPF verifier is_branch_taken() logic to evaluate pointers
   being non-NULL, e.g. if after an initial test another non-NULL test on
   that pointer follows in a given path, then it can be pruned right away,
   from John Fastabend.

4) Larger rework of BPF sockmap selftests to make output easier to understand
   and to reduce overall runtime as well as adding new BPF kTLS selftests
   that run in combination with sockmap, also from John Fastabend.

5) Batch of misc updates to BPF selftests including fixing up test_align
   to match verifier output again and moving it under test_progs, allowing
   bpf_iter selftest to compile on machines with older vmlinux.h, and
   updating config options for lirc and v6 segment routing helpers, from
   Stanislav Fomichev, Andrii Nakryiko and Alan Maguire.

6) Conversion of BPF tracing samples outdated internal BPF loader to use
   libbpf API instead, from Daniel T. Lee.

7) Follow-up to BPF kernel test infrastructure in order to fix a flake in
   the XDP selftests, from Jesper Dangaard Brouer.

8) Minor improvements to libbpf's internal hashmap implementation, from
   Ian Rogers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 18:30:34 -07:00
Alan Maguire
a5dfaa2ab9 selftests/bpf: CONFIG_LIRC required for test_lirc_mode2.sh
test_lirc_mode2.sh assumes presence of /sys/class/rc/rc0/lirc*/uevent
which will not be present unless CONFIG_LIRC=y

Fixes: 6bdd533cee ("bpf: add selftest for lirc_mode2 type program")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1590147389-26482-3-git-send-email-alan.maguire@oracle.com
2020-05-23 01:12:31 +02:00
Alan Maguire
3c8e8cf4b1 selftests/bpf: CONFIG_IPV6_SEG6_BPF required for test_seg6_loop.o
test_seg6_loop.o uses the helper bpf_lwt_seg6_adjust_srh();
it will not be present if CONFIG_IPV6_SEG6_BPF is not specified.

Fixes: b061017f8b ("selftests/bpf: add realistic loop tests")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1590147389-26482-2-git-send-email-alan.maguire@oracle.com
2020-05-23 01:12:31 +02:00
Alan Maguire
6736aa793c selftests/bpf: Add general instructions for test execution
Getting a clean BPF selftests run involves ensuring latest trunk LLVM/clang
are used, pahole is recent (>=1.16) and config matches the specified
config file as closely as possible.  Add to bpf_devel_QA.rst and point
tools/testing/selftests/bpf/README.rst to it.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/1590146674-25485-1-git-send-email-alan.maguire@oracle.com
2020-05-23 01:11:09 +02:00
Amit Cohen
46ca11177e selftests: mlxsw: qos_mc_aware: Specify arping timeout as an integer
Starting from iputils s20190709 (used in Fedora 31), arping does not
support timeout being specified as a decimal:

$ arping -c 1 -I swp1 -b 192.0.2.66 -q -w 0.1
arping: invalid argument: '0.1'

Previously, such timeouts were rounded to an integer.

Fix this by specifying the timeout as an integer.

Fixes: a5ee171d08 ("selftests: mlxsw: qos_mc_aware: Add a test for UC awareness")
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 16:08:14 -07:00
Ido Schimmel
4d59e59cf4 selftests: netdevsim: Always initialize 'RET' variable
The variable is used by log_test() to check if the test case completely
successfully or not. In case it is not initialized at the start of a
test case, it is possible for the test case to fail despite not
encountering any errors.

Example:

```
...
TEST: Trap group statistics                                         [ OK ]
TEST: Trap policer                                                  [FAIL]
	Policer drop counter was not incremented
TEST: Trap policer binding                                          [FAIL]
	Policer drop counter was not incremented
```

Failure of trap_policer_test() caused trap_policer_bind_test() to fail
as well.

Fix by adding missing initialization of the variable.

Fixes: 5fbff58e27 ("selftests: netdevsim: Add test cases for devlink-trap policers")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 16:05:42 -07:00
David S. Miller
d3b968bc2d Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2020-05-22

The following pull-request contains BPF updates for your *net* tree.

We've added 3 non-merge commits during the last 3 day(s) which contain
a total of 5 files changed, 69 insertions(+), 11 deletions(-).

The main changes are:

1) Fix to reject mmap()'ing read-only array maps as writable since BPF verifier
   relies on such map content to be frozen, from Andrii Nakryiko.

2) Fix breaking audit from secid_to_secctx() LSM hook by avoiding to use
   call_int_hook() since this hook is not stackable, from KP Singh.

3) Fix BPF flow dissector program ref leak on netns cleanup, from Jakub Sitnicki.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:35:35 -07:00
Roopa Prabhu
0534c5489c selftests: net: add fdb nexthop tests
This commit adds ipv4 and ipv6 fdb nexthop api tests to fib_nexthops.sh.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 14:00:38 -07:00
Srinivas Pandruvada
873e391ff3 tools/power/x86/intel-speed-select: Fix invalid core mask
The core mask display is wrong in some cases. This is showing more
cpus than the mask has. This is because mask is 64 bit but it used
with BIT() macro to get the presence of CPU which doesn't support
unsigned long long. Added a new macro for BIT_ULL and use that
to get the presence of a CPU.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2020-05-22 11:18:15 -07:00
Srinivas Pandruvada
e16ea66365 tools/power/x86/intel-speed-select: Increase CPU count
Increase CPU count so that more than 64 is supported in one request.

For example:
sudo ./intel-speed-select -d --cpu 0-66 core-power assoc -clos 0
The above command stops at 63. With this change, it can support more
CPU numbers from 0-255.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2020-05-22 11:18:15 -07:00
Prarit Bhargava
aa8b650b1a tools/power/x86/intel-speed-select: Fix json perf-profile output output
The 'intel-speed-select -f json perf-profile get-lock-status' command
outputs the package, die, and cpu data as separate fields.

ex)

  "package-0": {
    "die-0": {
      "cpu-0": {

Commit 74062363f8 ("tools/power/x86/intel-speed-select: Avoid duplicate Package strings for json") prettied this output so that it is a single line for
some json output commands and the same should be done for other commands.

Output package, die, and cpu info in a single line when using json output.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: platform-driver-x86@vger.kernel.org
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2020-05-22 11:18:15 -07:00
John Fastabend
d844a71bff bpf: Selftests, add printk to test_sk_lookup_kern to encode null ptr check
Adding a printk to test_sk_lookup_kern created the reported failure
where a pointer type is checked twice for NULL. Lets add it to the
progs test test_sk_lookup_kern.c so we test the case from C all the
way into the verifier.

We already have printk's in selftests so seems OK to add another one.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159009170603.6313.1715279795045285176.stgit@john-Precision-5820-Tower
2020-05-21 17:44:25 -07:00
John Fastabend
f9b16ec0ee bpf: Selftests, verifier case for non null pointer map value branch
When we have pointer type that is known to be non-null we only follow
the non-null branch. This adds tests to cover the map_value pointer
returned from a map lookup. To force an error if both branches are
followed we do an ALU op on R10.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159009168650.6313.7434084136067263554.stgit@john-Precision-5820-Tower
2020-05-21 17:44:25 -07:00
John Fastabend
c72b5cbb09 bpf: Selftests, verifier case for non null pointer check branch taken
When we have pointer type that is known to be non-null and comparing
against zero we only follow the non-null branch. This adds tests to
cover this case for reference tracking. Also add the other case when
comparison against a non-zero value and ensure we still fail with
unreleased reference.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/159009166599.6313.1593680633787453767.stgit@john-Precision-5820-Tower
2020-05-21 17:44:25 -07:00
Jason A. Donenfeld
ee3c1aa3f3 wireguard: selftests: use newer iproute2 for gcc-10
gcc-10 switched to defaulting to -fno-common, which broke iproute2-5.4.
This was fixed in iproute-5.6, so switch to that. Because we're after a
stable testing surface, we generally don't like to bump these
unnecessarily, but in this case, being able to actually build is a basic
necessity.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-20 20:55:09 -07:00
Andrii Nakryiko
dfeb376dd4 bpf: Prevent mmap()'ing read-only maps as writable
As discussed in [0], it's dangerous to allow mapping BPF map, that's meant to
be frozen and is read-only on BPF program side, because that allows user-space
to actually store a writable view to the page even after it is frozen. This is
exacerbated by BPF verifier making a strong assumption that contents of such
frozen map will remain unchanged. To prevent this, disallow mapping
BPF_F_RDONLY_PROG mmap()'able BPF maps as writable, ever.

  [0] https://lore.kernel.org/bpf/CAEf4BzYGWYhXdp6BJ7_=9OQPJxQpgug080MMjdSB72i9R+5c6g@mail.gmail.com/

Fixes: fc9702273e ("bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY")
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/bpf/20200519053824.1089415-1-andriin@fb.com
2020-05-20 20:21:53 -07:00
Matt Helsley
0decf1f8de objtool: Enable compilation of objtool for all architectures
Objtool currently only compiles for x86 architectures. This is
fine as it presently does not support tooling for other
architectures. However, we would like to be able to convert other
kernel tools to run as objtool sub commands because they too
process ELF object files. This will allow us to convert tools
such as recordmcount to use objtool's ELF code.

Since much of recordmcount's ELF code is copy-paste code to/from
a variety of other kernel tools (look at modpost for example) this
means that if we can convert recordmcount we can convert more.

We define weak definitions for subcommand entry functions and other weak
definitions for shared functions critical to building existing
subcommands. These return 127 when the command is missing which signify
tools that do not exist on all architectures.  In this case the "check"
and "orc" tools do not exist on all architectures so we only add them
for x86. Future changes adding support for "check", to arm64 for
example, can then modify the SUBCMD_CHECK variable when building for
arm64.

Objtool is not currently wired in to KConfig to be built for other
architectures because it's not needed for those architectures and
there are no commands it supports other than those for x86. As more
command support is enabled on various architectures the necessary
KConfig changes can be made (e.g. adding "STACK_VALIDATION") to
trigger building objtool.

[ jpoimboe: remove aliases, add __weak macro, add error messages ]

Cc: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Matt Helsley <mhelsley@vmware.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-05-20 09:17:28 -05:00
Matt Helsley
d37c90d47f objtool: Move struct objtool_file into arch-independent header
The objtool_file structure describes the files objtool works on,
is used by the check subcommand, and the check.h header is included
by the orc subcommands so it's presently used by all subcommands.

Since the structure will be useful in all subcommands besides check,
and some subcommands may not want to include check.h to get the
definition, split the structure out into a new header meant for use
by all objtool subcommands.

Signed-off-by: Matt Helsley <mhelsley@vmware.com>
Reviewed-by: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-05-20 08:35:20 -05:00
Matt Helsley
f15c648f20 objtool: Exit successfully when requesting help
When the user requests help it's not an error so do not exit with
a non-zero exit code. This is not especially useful for a user but
any script that might wish to check that objtool --help is at least
available can't rely on the exit code to crudely check that, for
example, building an objtool executable succeeds.

Signed-off-by: Matt Helsley <mhelsley@vmware.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-05-20 08:32:52 -05:00
Josh Poimboeuf
ae033f088f objtool: Add check_kcov_mode() to the uaccess safelist
check_kcov_mode() is called by write_comp_data() and
__sanitizer_cov_trace_pc(), which are already on the uaccess safe list.
It's notrace and doesn't call out to anything else, so add it to the
list too.

This fixes the following warnings:

  kernel/kcov.o: warning: objtool: __sanitizer_cov_trace_pc()+0x15: call to check_kcov_mode() with UACCESS enabled
  kernel/kcov.o: warning: objtool: write_comp_data()+0x1b: call to check_kcov_mode() with UACCESS enabled

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2020-05-20 08:30:43 -05:00
Paolo Bonzini
9d5272f5e3 Merge tag 'noinstr-x86-kvm-2020-05-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into HEAD 2020-05-20 03:40:09 -04:00
Andrii Nakryiko
dda18a5c0b selftests/bpf: Convert bpf_iter_test_kern{3, 4}.c to define own bpf_iter_meta
b9f4c01f3e ("selftest/bpf: Make bpf_iter selftest compilable against old vmlinux.h")
missed the fact that bpf_iter_test_kern{3,4}.c are not just including
bpf_iter_test_kern_common.h and need similar bpf_iter_meta re-definition
explicitly.

Fixes: b9f4c01f3e ("selftest/bpf: Make bpf_iter selftest compilable against old vmlinux.h")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200519192341.134360-1-andriin@fb.com
2020-05-19 14:03:59 -07:00
Ralph Campbell
fee9f6d1b8 mm/hmm/test: add selftests for HMM
Add some basic stand alone self tests for HMM.
The test program and shell scripts use the test_hmm.ko driver to exercise
HMM functionality in the kernel.

Link: https://lore.kernel.org/r/20200422195028.3684-3-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-19 16:48:31 -03:00
Andrii Nakryiko
b9f4c01f3e selftest/bpf: Make bpf_iter selftest compilable against old vmlinux.h
It's good to be able to compile bpf_iter selftest even on systems that don't
have the very latest vmlinux.h, e.g., for libbpf tests against older kernels in
Travis CI. To that extent, re-define bpf_iter_meta and corresponding bpf_iter
context structs in each selftest. To avoid type clashes with vmlinux.h, rename
vmlinux.h's definitions to get them out of the way.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20200518234516.3915052-1-andriin@fb.com
2020-05-19 11:41:49 -07:00
Alexei Starovoitov
fb53d3b637 tools/bpf: sync bpf.h
Sync tools/include/uapi/linux/bpf.h from include/uapi.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-05-19 11:39:53 -07:00
Daniel Borkmann
566fc3f5d1 bpf, testing: Add get{peer, sock}name selftests to test_progs
Extend the existing connect_force_port test to assert get{peer,sock}name programs
as well. The workflow for e.g. IPv4 is as follows: i) server binds to concrete
port, ii) client calls getsockname() on server fd which exposes 1.2.3.4:60000 to
client, iii) client connects to service address 1.2.3.4:60000 binds to concrete
local address (127.0.0.1:22222) and remaps service address to a concrete backend
address (127.0.0.1:60123), iv) client then calls getsockname() on its own fd to
verify local address (127.0.0.1:22222) and getpeername() on its own fd which then
publishes service address (1.2.3.4:60000) instead of actual backend. Same workflow
is done for IPv6 just with different address/port tuples.

  # ./test_progs -t connect_force_port
  #14 connect_force_port:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/3343da6ad08df81af715a95d61a84fb4a960f2bf.1589841594.git.daniel@iogearbox.net
2020-05-19 11:32:04 -07:00
Daniel Borkmann
05ee19c18c bpf, bpftool: Enable get{peer, sock}name attach types
Make bpftool aware and add the new get{peer,sock}name attach types to its
cli, documentation and bash completion to allow attachment/detachment of
sock_addr programs there.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/9765b3d03e4c29210c4df56a9cc7e52f5f7bb5ef.1589841594.git.daniel@iogearbox.net
2020-05-19 11:32:04 -07:00
Daniel Borkmann
f15ed0185d bpf, libbpf: Enable get{peer, sock}name attach types
Trivial patch to add the new get{peer,sock}name attach types to the section
definitions in order to hook them up to sock_addr cgroup program type.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/7fcd4b1e41a8ebb364754a5975c75a7795051bd2.1589841594.git.daniel@iogearbox.net
2020-05-19 11:32:04 -07:00
Daniel Borkmann
1b66d25361 bpf: Add get{peer, sock}name attach types for sock_addr
As stated in 983695fa67 ("bpf: fix unconnected udp hooks"), the objective
for the existing cgroup connect/sendmsg/recvmsg/bind BPF hooks is to be
transparent to applications. In Cilium we make use of these hooks [0] in
order to enable E-W load balancing for existing Kubernetes service types
for all Cilium managed nodes in the cluster. Those backends can be local
or remote. The main advantage of this approach is that it operates as close
as possible to the socket, and therefore allows to avoid packet-based NAT
given in connect/sendmsg/recvmsg hooks we only need to xlate sock addresses.

This also allows to expose NodePort services on loopback addresses in the
host namespace, for example. As another advantage, this also efficiently
blocks bind requests for applications in the host namespace for exposed
ports. However, one missing item is that we also need to perform reverse
xlation for inet{,6}_getname() hooks such that we can return the service
IP/port tuple back to the application instead of the remote peer address.

The vast majority of applications does not bother about getpeername(), but
in a few occasions we've seen breakage when validating the peer's address
since it returns unexpectedly the backend tuple instead of the service one.
Therefore, this trivial patch allows to customise and adds a getpeername()
as well as getsockname() BPF cgroup hook for both IPv4 and IPv6 in order
to address this situation.

Simple example:

  # ./cilium/cilium service list
  ID   Frontend     Service Type   Backend
  1    1.2.3.4:80   ClusterIP      1 => 10.0.0.10:80

Before; curl's verbose output example, no getpeername() reverse xlation:

  # curl --verbose 1.2.3.4
  * Rebuilt URL to: 1.2.3.4/
  *   Trying 1.2.3.4...
  * TCP_NODELAY set
  * Connected to 1.2.3.4 (10.0.0.10) port 80 (#0)
  > GET / HTTP/1.1
  > Host: 1.2.3.4
  > User-Agent: curl/7.58.0
  > Accept: */*
  [...]

After; with getpeername() reverse xlation:

  # curl --verbose 1.2.3.4
  * Rebuilt URL to: 1.2.3.4/
  *   Trying 1.2.3.4...
  * TCP_NODELAY set
  * Connected to 1.2.3.4 (1.2.3.4) port 80 (#0)
  > GET / HTTP/1.1
  >  Host: 1.2.3.4
  > User-Agent: curl/7.58.0
  > Accept: */*
  [...]

Originally, I had both under a BPF_CGROUP_INET{4,6}_GETNAME type and exposed
peer to the context similar as in inet{,6}_getname() fashion, but API-wise
this is suboptimal as it always enforces programs having to test for ctx->peer
which can easily be missed, hence BPF_CGROUP_INET{4,6}_GET{PEER,SOCK}NAME split.
Similarly, the checked return code is on tnum_range(1, 1), but if a use case
comes up in future, it can easily be changed to return an error code instead.
Helper and ctx member access is the same as with connect/sendmsg/etc hooks.

  [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/61a479d759b2482ae3efb45546490bacd796a220.1589841594.git.daniel@iogearbox.net
2020-05-19 11:32:04 -07:00
Thomas Gleixner
1ed0948eea Merge tag 'noinstr-lds-2020-05-19' into core/rcu
Get the noinstr section and annotation markers to base the RCU parts on.
2020-05-19 15:50:34 +02:00
Ingo Molnar
7c0577f4e6 Linux 5.7-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl7BzV8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGg8EH/A2pXMTxtc96RI4S
 sttEsUQqbakFS0Z/2tQPpMGr/qW2e5eHgsTX/a3SiUeZiIXk6f4lMFkMuctzBf7p
 X77cNEDwGOEdbtCXTsMcmKSde7sP2zCXsPB8xTWLyE6rnaFRgikwwkeqgkIKhp1h
 bvOQV0t9HNGvxGAM0iZeOvQAvFl4vd7nS123/MYbir9cugfQUSJRueQ4BiCiJqVE
 6cNA7/vFzDJuFGszzIrJ7HXn/IdQMMWHkvTDjgBw0GZw1mDbGFbfbZwOeTz1ojCt
 smUQ4tIFxBa/VA5zx7dOy2P2keHbSVf4VLkZRPcceT7OqVS65ETmFDp+qt5NdWM5
 vZ8+7/0=
 =CyYH
 -----END PGP SIGNATURE-----

Merge tag 'v5.7-rc6' into objtool/core, to pick up fixes and resolve semantic conflict

Resolve structural conflict between:

  59566b0b62: ("x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up")

which introduced a new reference to 'ftrace_epilogue', and:

  0298739b79: ("x86,ftrace: Fix ftrace_regs_caller() unwind")

Which renamed it to 'ftrace_caller_end'. Rename the new usage site in the merge commit.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-05-18 13:09:37 +03:00
David Ahern
eb682677f5 selftests: Drop 'pref medium' in route checks
The 'pref medium' attribute was moved in iproute2 to be near the prefix
which is where it applies versus after the last nexthop. The nexthop
tests were updated to drop the string from route checking, but it crept
in again with the compat tests.

Fixes: 4dddb5be13 ("selftests: net: add new testcases for nexthop API compat mode sysctl")
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-17 12:26:55 -07:00
Ido Schimmel
04cc99d9bd selftests: mlxsw: Do not hard code trap group name
It can be derived dynamically from the trap's name, so drop it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 16:42:31 -07:00
Ido Schimmel
84e0d83567 selftests: devlink_lib: Remove double blank line
One blank line is enough.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 16:42:31 -07:00
Linus Torvalds
5d438e071f A new testcase for guest debugging (gdbstub) that exposed a bunch of
bugs, mostly for AMD processors.  And a few other x86 fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl6/0xcUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOZuwf/bQZw/SP9awLjOOVsRaSWUmwRGD4q
 6KVq9+JYsPU4CyJ7P+vdsFF39a0ixoAnKWqRe/vsXdXZrdYCDUuQxh+7X+lmjKAb
 dCQBnoqxI0w3yuxrm9Kn6Xs1AGIWibaRlZnXUKbuyn4ecFrh08OfYKGkYsEovhxK
 G4ftY4/xyM7Qvm0fq7ZmzxPrkzd74HDZBvB83R6uiyPiX3w4O9qumqkUogcVXIJX
 l3mnvSPClDDX4FOr8uhnU93varuR7Bek4Fh+Abj4uNks/F3z9ooJO9Hy9E+V5fhY
 g6Oj2IrxDwJ2G6hqyucr1kujukJC1bX2nMZ1O4gNayXsxZEU/JtI0Y26SA==
 =EzBt
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "A new testcase for guest debugging (gdbstub) that exposed a bunch of
  bugs, mostly for AMD processors. And a few other x86 fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Fix off-by-one error in kvm_vcpu_ioctl_x86_setup_mce
  KVM: x86: Fix pkru save/restore when guest CR4.PKE=0, move it to x86.c
  KVM: SVM: Disable AVIC before setting V_IRQ
  KVM: Introduce kvm_make_all_cpus_request_except()
  KVM: VMX: pass correct DR6 for GD userspace exit
  KVM: x86, SVM: isolate vcpu->arch.dr6 from vmcb->save.dr6
  KVM: SVM: keep DR6 synchronized with vcpu->arch.dr6
  KVM: nSVM: trap #DB and #BP to userspace if guest debugging is on
  KVM: selftests: Add KVM_SET_GUEST_DEBUG test
  KVM: X86: Fix single-step with KVM_SET_GUEST_DEBUG
  KVM: X86: Set RTM for DB_VECTOR too for KVM_EXIT_DEBUG
  KVM: x86: fix DR6 delivery for various cases of #DB injection
  KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly
2020-05-16 13:39:22 -07:00
John Fastabend
96586dd926 bpf: Selftests, add ktls tests to test_sockmap
Until now we have only had minimal ktls+sockmap testing when being
used with helpers and different sendmsg/sendpage patterns. Add a
pass with ktls here.

To run just ktls tests,

 $ ./test_sockmap --whitelist="ktls"

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939736278.15176.5435314315563203761.stgit@john-Precision-5820-Tower
2020-05-16 02:56:49 +02:00
John Fastabend
a7238f7c79 bpf: Selftests, add blacklist to test_sockmap
This adds a blacklist to test_sockmap. For example, now we can run
all apply and cork tests except those with timeouts by doing,

 $ ./test_sockmap --whitelist "apply,cork" --blacklist "hang"

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939734350.15176.6643981099665208826.stgit@john-Precision-5820-Tower
2020-05-16 02:56:49 +02:00
John Fastabend
065a74cbd0 bpf: Selftests, add whitelist option to test_sockmap
Allow running specific tests with a comma deliminated whitelist. For example
to run all apply and cork tests.

 $ ./test_sockmap --whitelist="cork,apply"

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939732464.15176.1959113294944564542.stgit@john-Precision-5820-Tower
2020-05-16 02:56:49 +02:00
John Fastabend
b98ca90c56 bpf: Selftests, provide verbose option for selftests execution
Pass options from command line args into individual tests which allows us
to use verbose option from command line with selftests. Now when verbose
option is set individual subtest details will be printed. Also we can
consolidate cgroup bring up and tear down.

Additionally just setting verbose is very noisy so introduce verbose=1
and verbose=2. Really verbose=2 is only useful when developing tests
or debugging some specific issue.

For example now we get output like this with --verbose,

#20/17 sockhash:txmsg test pull-data:OK
 [TEST 160]: (512, 1, 3, sendpage, pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 1 cnt 512 err 0
 [TEST 161]: (100, 1, 5, sendpage, pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 3 cnt 100 err 0
 [TEST 162]: (2, 1024, 256, sendpage, pop (4096,8192),): msg_loop_rx: iov_count 1 iov_buf 255 cnt 2 err 0
 [TEST 163]: (512, 1, 3, sendpage, redir,pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 1 cnt 512 err 0
 [TEST 164]: (100, 1, 5, sendpage, redir,pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 3 cnt 100 err 0
 [TEST 165]: (512, 1, 3, sendpage, cork 512,pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 1 cnt 512 err 0
 [TEST 166]: (100, 1, 5, sendpage, cork 512,pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 3 cnt 100 err 0
 [TEST 167]: (512, 1, 3, sendpage, redir,cork 4,pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 1 cnt 512 err 0
 [TEST 168]: (100, 1, 5, sendpage, redir,cork 4,pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 3 cnt 100 err 0

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939730412.15176.1975675235035143367.stgit@john-Precision-5820-Tower
2020-05-16 02:56:49 +02:00
John Fastabend
328aa08a08 bpf: Selftests, break down test_sockmap into subtests
At the moment test_sockmap runs all 800+ tests ungrouped which is not
ideal because it makes it hard to see what is failing but also more
importantly its hard to confirm all cases are tested. Additionally,
after inspecting we noticed the runtime is bloated because we run
many duplicate tests. Worse some of these tests are known error cases
that wait for the recvmsg handler to timeout which creats long delays.
Also we noted some tests were not clearing their options and as a
result the following tests would run with extra and incorrect options.

Fix this by reorganizing test code so its clear what tests are running
and when. Then it becomes easy to remove duplication and run tests with
only the set of send/recv patterns that are relavent.

To accomplish this break test_sockmap into subtests and remove
unnecessary duplication. The output is more readable now and
the runtime reduced.

Now default output prints subtests like this,

 $ ./test_sockmap
 # 1/ 6  sockmap:txmsg test passthrough:OK
 ...
 #22/ 1 sockhash:txmsg test push/pop data:OK
 Pass: 22 Fail: 0

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939728384.15176.13601520183665880762.stgit@john-Precision-5820-Tower
2020-05-16 02:56:49 +02:00
John Fastabend
18d4e900a4 bpf: Selftests, improve test_sockmap total bytes counter
The recv thread in test_sockmap waits to receive all bytes from sender but
in the case we use pop data it may wait for more bytes then actually being
sent. This stalls the test harness for multiple seconds. Because this
happens in multiple tests it slows time to run the selftest.

Fix by doing a better job of accounting for total bytes when pop helpers
are used.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939726542.15176.5964532245173539540.stgit@john-Precision-5820-Tower
2020-05-16 02:56:49 +02:00
John Fastabend
248aba1d52 bpf: Selftests, print error in test_sockmap error cases
Its helpful to know the error value if an error occurs.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939724566.15176.12079885932643225626.stgit@john-Precision-5820-Tower
2020-05-16 02:56:49 +02:00
John Fastabend
13a5f3ffd2 bpf: Selftests, sockmap test prog run without setting cgroup
Running test_sockmap with arguments to specify a test pattern requires
including a cgroup argument. Instead of requiring this if the option is
not provided create one

This is not used by selftest runs but I use it when I want to test a
specific test. Most useful when developing new code and/or tests.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939722675.15176.6294210959489131688.stgit@john-Precision-5820-Tower
2020-05-16 02:56:49 +02:00
John Fastabend
d79a32129b bpf: Selftests, remove prints from sockmap tests
The prints in the test_sockmap programs were only useful when we
didn't have enough control over test infrastructure to know from
user program what was being pushed into kernel side.

Now that we have or will shortly have better test controls lets
remove the printers. This means we can remove half the programs
and cleanup bpf side.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939720756.15176.9806965887313279429.stgit@john-Precision-5820-Tower
2020-05-16 02:56:49 +02:00
John Fastabend
991e35eebe bpf: Selftests, move sockmap bpf prog header into progs
Moves test_sockmap_kern.h into progs directory but does not change
code at all.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939718921.15176.5766299102332077086.stgit@john-Precision-5820-Tower
2020-05-16 02:56:49 +02:00
Stanislav Fomichev
3b09d27cc9 selftests/bpf: Move test_align under test_progs
There is a much higher chance we can see the regressions if the
test is part of test_progs.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200515194904.229296-2-sdf@google.com
2020-05-16 01:18:14 +02:00
Stanislav Fomichev
5366d22691 selftests/bpf: Fix test_align verifier log patterns
Commit 294f2fc6da ("bpf: Verifer, adjust_scalar_min_max_vals to always
call update_reg_bounds()") changed the way verifier logs some of its state,
adjust the test_align accordingly. Where possible, I tried to not copy-paste
the entire log line and resorted to dropping the last closing brace instead.

Fixes: 294f2fc6da ("bpf: Verifer, adjust_scalar_min_max_vals to always call update_reg_bounds()")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200515194904.229296-1-sdf@google.com
2020-05-16 01:18:07 +02:00
Ian Rogers
8d35d74f52 libbpf, hashmap: Fix signedness warnings
Fixes the following warnings:

  hashmap.c: In function ‘hashmap__clear’:
  hashmap.h:150:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
    150 |  for (bkt = 0; bkt < map->cap; bkt++)        \

  hashmap.c: In function ‘hashmap_grow’:
  hashmap.h:150:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
    150 |  for (bkt = 0; bkt < map->cap; bkt++)        \

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200515165007.217120-4-irogers@google.com
2020-05-16 01:06:05 +02:00
Ian Rogers
f516acd539 libbpf, hashmap: Remove unused #include
Remove #include of libbpf_internal.h that is unused.

Discussed in this thread:
https://lore.kernel.org/lkml/CAEf4BzZRmiEds_8R8g4vaAeWvJzPb4xYLnpF0X2VNY8oTzkphQ@mail.gmail.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200515165007.217120-3-irogers@google.com
2020-05-16 01:05:16 +02:00
David S. Miller
da07f52d3c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Move the bpf verifier trace check into the new switch statement in
HEAD.

Resolve the overlapping changes in hinic, where bug fixes overlap
the addition of VF support.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 13:48:59 -07:00
Linus Torvalds
f85c1598dd Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix sk_psock reference count leak on receive, from Xiyu Yang.

 2) CONFIG_HNS should be invisible, from Geert Uytterhoeven.

 3) Don't allow locking route MTUs in ipv6, RFCs actually forbid this,
    from Maciej Żenczykowski.

 4) ipv4 route redirect backoff wasn't actually enforced, from Paolo
    Abeni.

 5) Fix netprio cgroup v2 leak, from Zefan Li.

 6) Fix infinite loop on rmmod in conntrack, from Florian Westphal.

 7) Fix tcp SO_RCVLOWAT hangs, from Eric Dumazet.

 8) Various bpf probe handling fixes, from Daniel Borkmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (68 commits)
  selftests: mptcp: pm: rm the right tmp file
  dpaa2-eth: properly handle buffer size restrictions
  bpf: Restrict bpf_trace_printk()'s %s usage and add %pks, %pus specifier
  bpf: Add bpf_probe_read_{user, kernel}_str() to do_refine_retval_range
  bpf: Restrict bpf_probe_read{, str}() only to archs where they work
  MAINTAINERS: Mark networking drivers as Maintained.
  ipmr: Add lockdep expression to ipmr_for_each_table macro
  ipmr: Fix RCU list debugging warning
  drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.c
  net: phy: broadcom: fix BCM54XX_SHD_SCR3_TRDDAPD value for BCM54810
  tcp: fix error recovery in tcp_zerocopy_receive()
  MAINTAINERS: Add Jakub to networking drivers.
  MAINTAINERS: another add of Karsten Graul for S390 networking
  drivers: ipa: fix typos for ipa_smp2p structure doc
  pppoe: only process PADT targeted at local interfaces
  selftests/bpf: Enforce returning 0 for fentry/fexit programs
  bpf: Enforce returning 0 for fentry/fexit progs
  net: stmmac: fix num_por initialization
  security: Fix the default value of secid_to_secctx hook
  libbpf: Fix register naming in PT_REGS s390 macros
  ...
2020-05-15 13:10:06 -07:00
Linus Torvalds
ce24729667 linux-kselftest-5.7-rc6
This Kselftest update for Linux 5.7-rc6 consists of
 
 - lkdtm runner fixes to prevent dmesg clearing and shellcheck errors
 - ftrace test handling when test module doesn't exist
 - nsfs test fix to replace zero-length array with flexible-array
 - dmabuf-heaps test fix to return clear error value
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl6+0KwACgkQCwJExA0N
 QxyBog/8DME+7YtjZ8TqgeWuxMckDHqVBTLTobd3Wyd2Vsk6W3EWKaDo+R8sqRfN
 VmjExY1PH0+N4JAFuQxR1UAmO0E/YZ/HwGxU6IH4iAoXtVZ0Tc52jzlZkwo0/RwF
 uOXukaEjLNlMva06/CEptxZy4UAWKP1l9DaEYjY48zDC4zlXaPgGjk69MACZhK4E
 2M6mWJuUewf+PobZTGFv/4RPIAihZpx8gDMRBR/hy+RWKh1qNhpNeXQRrwTQm0u3
 5ewOQjP26VRam4pA/e7RsU1b2IR8nbDZFpX3dTEzkme9dJ2hoN3DZLOw0/WXS/aQ
 /Ha5bPBI7RMhq0FUQP1vMErZ6Da/YEkbTxmCpx1kL6vliKqMkHb+epLABTJlGGnD
 i+ENoLeJVhcyQ3vyWq3VNZDGlHYF3KuUftf21sdG4bXDajucnu/lSPvfSKzhMNIG
 c1+AcvTAeGTPHi2dCfdETjg1dakO4exrX9/ABvu3JY1pp9D8DLHZNc/O9iOvG0O5
 6Lp9LhtxCxKdAGJVxKxzXPjwPDSnzNV5sGm1ElFpXzA0Nv17VH/DIL8i6745sQXW
 ik29Z1QNv3QIOs5U1pe5wyY22D51UJOfS7hVWH+pWPveG46ApYeUxqmfNYoG/Vtr
 cMspPzCSgZTAvkswCGfnSahlqCSpJxzuxUEkM4HlPTulrW4jbDk=
 =2cqI
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:

 - lkdtm runner fixes to prevent dmesg clearing and shellcheck errors

 - ftrace test handling when test module doesn't exist

 - nsfs test fix to replace zero-length array with flexible-array

 - dmabuf-heaps test fix to return clear error value

* tag 'linux-kselftest-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/lkdtm: Use grep -E instead of egrep
  selftests/lkdtm: Don't clear dmesg when running tests
  selftests/ftrace: mark irqsoff_tracer.tc test as unresolved if the test module does not exist
  tools/testing: Replace zero-length array with flexible-array
  kselftests: dmabuf-heaps: Fix confused return value on expected error testing
2020-05-15 12:57:50 -07:00
David S. Miller
8e1381049e Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2020-05-15

The following pull-request contains BPF updates for your *net* tree.

We've added 9 non-merge commits during the last 2 day(s) which contain
a total of 14 files changed, 137 insertions(+), 43 deletions(-).

The main changes are:

1) Fix secid_to_secctx LSM hook default value, from Anders.

2) Fix bug in mmap of bpf array, from Andrii.

3) Restrict bpf_probe_read to archs where they work, from Daniel.

4) Enforce returning 0 for fentry/fexit progs, from Yonghong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:57:21 -07:00
David S. Miller
3430223d39 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-05-15

The following pull-request contains BPF updates for your *net-next* tree.

We've added 37 non-merge commits during the last 1 day(s) which contain
a total of 67 files changed, 741 insertions(+), 252 deletions(-).

The main changes are:

1) bpf_xdp_adjust_tail() now allows to grow the tail as well, from Jesper.

2) bpftool can probe CONFIG_HZ, from Daniel.

3) CAP_BPF is introduced to isolate user processes that use BPF infra and
   to secure BPF networking services by dropping CAP_SYS_ADMIN requirement
   in certain cases, from Alexei.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:43:52 -07:00
Matthieu Baerts
9a2dbb59eb selftests: mptcp: pm: rm the right tmp file
"$err" is a variable pointing to a temp file. "$out" is not: only used
as a local variable in "check()" and representing the output of a
command line.

Fixes: eedbc68532 (selftests: add PM netlink functional tests)
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:33:56 -07:00
Vlad Buslov
e7534fd42a selftests: implement flower classifier terse dump tests
Implement two basic tests to verify terse dump functionality of flower
classifier:

- Test that verifies that terse dump works.

- Test that verifies that terse dump doesn't print filter key.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:23:11 -07:00
Alexei Starovoitov
8162600118 selftests/bpf: Use CAP_BPF and CAP_PERFMON in tests
Make all test_verifier test exercise CAP_BPF and CAP_PERFMON

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200513230355.7858-4-alexei.starovoitov@gmail.com
2020-05-15 17:29:41 +02:00
Daniel Borkmann
0ee52c0f6c bpf, bpftool: Allow probing for CONFIG_HZ from kernel config
In Cilium we've recently switched to make use of bpf_jiffies64() for
parts of our tc and XDP datapath since bpf_ktime_get_ns() is more
expensive and high-precision is not needed for our timeouts we have
anyway. Our agent has a probe manager which picks up the json of
bpftool's feature probe and we also use the macro output in our C
programs e.g. to have workarounds when helpers are not available on
older kernels.

Extend the kernel config info dump to also include the kernel's
CONFIG_HZ, and rework the probe_kernel_image_config() for allowing a
macro dump such that CONFIG_HZ can be propagated to BPF C code as a
simple define if available via config. Latter allows to have _compile-
time_ resolution of jiffies <-> sec conversion in our code since all
are propagated as known constants.

Given we cannot generally assume availability of kconfig everywhere,
we also have a kernel hz probe [0] as a fallback. Potentially, bpftool
could have an integrated probe fallback as well, although to derive it,
we might need to place it under 'bpftool feature probe full' or similar
given it would slow down the probing process overall. Yet 'full' doesn't
fit either for us since we don't want to pollute the kernel log with
warning messages from bpf_probe_write_user() and bpf_trace_printk() on
agent startup; I've left it out for the time being.

  [0] https://github.com/cilium/cilium/blob/master/bpf/cilium-probe-kernel-hz.c

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200513075849.20868-1-daniel@iogearbox.net
2020-05-15 08:18:53 -07:00
Sami Tolvanen
6b5dd716da objtool: optimize add_dead_ends for split sections
Instead of iterating through all instructions to find the last
instruction each time .rela.discard.(un)reachable points beyond the
section, use find_insn to locate the last instruction by looking at
the last bytes of the section instead.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200421220843.188260-3-samitolvanen@google.com
2020-05-15 10:35:13 +02:00
Sami Tolvanen
28fe1d7bf8 objtool: use gelf_getsymshndx to handle >64k sections
Currently, objtool fails to load the correct section for symbols when
the index is greater than SHN_LORESERVE. Use gelf_getsymshndx instead
of gelf_getsym to handle >64k sections.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200421220843.188260-2-samitolvanen@google.com
2020-05-15 10:35:13 +02:00