Commit Graph

735 Commits

Author SHA1 Message Date
Mark Starovoytov
791bb3fcaf net: macsec: add support for specifying offload upon link creation
This patch adds new netlink attribute to allow a user to (optionally)
specify the desired offload mode immediately upon MACSec link creation.

Separate iproute patch will be required to support this from user space.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29 21:34:21 -07:00
Antoine Tenart
21114b7fee net: macsec: add support for offloading to the MAC
This patch adds a new MACsec offloading option, MACSEC_OFFLOAD_MAC,
allowing a user to select a MAC as a provider for MACsec offloading
operations.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26 20:17:36 -07:00
David S. Miller
9fb16955fb Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Overlapping header include additions in macsec.c

A bug fix in 'net' overlapping with the removal of 'version'
string in ena_netdev.c

Overlapping test additions in selftests Makefile

Overlapping PCI ID table adjustments in iwlwifi driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 18:58:11 -07:00
Arnaldo Carvalho de Melo
564200ed8e tools headers uapi: Update linux/in.h copy
To get the changes in:

  2677625387 ("seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number")

That ends up automatically adding the new IPPROTO_ETHERNET to the socket
args beautifiers:

  $ tools/perf/trace/beauty/socket_ipproto.sh > before

Apply this patch:

  $ tools/perf/trace/beauty/socket_ipproto.sh > after
  $ diff -u before after
  --- before	2020-03-19 11:48:36.876673819 -0300
  +++ after	2020-03-19 11:49:00.148541377 -0300
  @@ -6,6 +6,7 @@
   	[132] = "SCTP",
   	[136] = "UDPLITE",
   	[137] = "MPLS",
  +	[143] = "ETHERNET",
   	[17] = "UDP",
   	[1] = "ICMP",
   	[22] = "IDP",
  $

Addresses this tools/perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
  diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h

Cc: David S. Miller <davem@davemloft.net>
Cc: Paolo Lungaroni <paolo.lungaroni@cnit.it>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-19 12:31:49 -03:00
Tobias Klauser
bcd66b10b5 tools/bpf: Move linux/types.h for selftests and bpftool
Commit fe4eb069ed ("bpftool: Use linux/types.h from source tree for
profiler build") added a build dependency on tools/testing/selftests/bpf
to tools/bpf/bpftool. This is suboptimal with respect to a possible
stand-alone build of bpftool.

Fix this by moving tools/testing/selftests/bpf/include/uapi/linux/types.h
to tools/include/uapi/linux/types.h.

This requires an adjustment in the include search path order for the
tests in tools/testing/selftests/bpf so that tools/include/linux/types.h
is selected when building host binaries and
tools/include/uapi/linux/types.h is selected when building bpf binaries.

Verified by compiling bpftool and the bpf selftests on x86_64 with this
change.

Fixes: fe4eb069ed ("bpftool: Use linux/types.h from source tree for profiler build")
Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200313113105.6918-1-tklauser@distanz.ch
2020-03-13 20:56:34 +01:00
Eelco Chaudron
d831ee84bf bpf: Add bpf_xdp_output() helper
Introduce new helper that reuses existing xdp perf_event output
implementation, but can be called from raw_tracepoint programs
that receive 'struct xdp_buff *' as a tracepoint argument.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158348514556.2239.11050972434793741444.stgit@xdp-tutorial
2020-03-12 17:47:38 -07:00
Carlos Neira
b4490c5c4e bpf: Added new helper bpf_get_ns_current_pid_tgid
New bpf helper bpf_get_ns_current_pid_tgid,
This helper will return pid and tgid from current task
which namespace matches dev_t and inode number provided,
this will allows us to instrument a process inside a container.

Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200304204157.58695-3-cneirabustos@gmail.com
2020-03-12 17:33:11 -07:00
Ian Rogers
441b62acd9 tools: Fix off-by 1 relative directory includes
This is currently working due to extra include paths in the build.

Committer testing:

  $ cd tools/include/uapi/asm/

Before this patch:

  $ ls -la ../../arch/x86/include/uapi/asm/errno.h
  ls: cannot access '../../arch/x86/include/uapi/asm/errno.h': No such file or directory
  $

After this patch;

  $ ls -la ../../../arch/x86/include/uapi/asm/errno.h
  -rw-rw-r--. 1 acme acme 31 Feb 20 12:42 ../../../arch/x86/include/uapi/asm/errno.h
  $

Check that that is still under tools/, i.e. hasn't escaped into the main
kernel sources:

  $ cd ../../../arch/x86/include/uapi/asm/
  $ pwd
  /home/acme/git/perf/tools/arch/x86/include/uapi/asm
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexios Zavras <alexios.zavras@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Li <liwei391@huawei.com>
Link: http://lore.kernel.org/lkml/20200306071110.130202-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-06 08:36:46 -03:00
KP Singh
ae24082331 bpf: Introduce BPF_MODIFY_RETURN
When multiple programs are attached, each program receives the return
value from the previous program on the stack and the last program
provides the return value to the attached function.

The fmod_ret bpf programs are run after the fentry programs and before
the fexit programs. The original function is only called if all the
fmod_ret programs return 0 to avoid any unintended side-effects. The
success value, i.e. 0 is not currently configurable but can be made so
where user-space can specify it at load time.

For example:

int func_to_be_attached(int a, int b)
{  <--- do_fentry

do_fmod_ret:
   <update ret by calling fmod_ret>
   if (ret != 0)
        goto do_fexit;

original_function:

    <side_effects_happen_here>

}  <--- do_fexit

The fmod_ret program attached to this function can be defined as:

SEC("fmod_ret/func_to_be_attached")
int BPF_PROG(func_name, int a, int b, int ret)
{
        // This will skip the original function logic.
        return 1;
}

The first fmod_ret program is passed 0 in its return argument.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200304191853.1529-4-kpsingh@chromium.org
2020-03-04 13:41:05 -08:00
Andrii Nakryiko
1aae4bdd78 bpf: Switch BPF UAPI #define constants used from BPF program side to enums
Switch BPF UAPI constants, previously defined as #define macro, to anonymous
enum values. This preserves constants values and behavior in expressions, but
has added advantaged of being captured as part of DWARF and, subsequently, BTF
type info. Which, in turn, greatly improves usefulness of generated vmlinux.h
for BPF applications, as it will not require BPF users to copy/paste various
flags and constants, which are frequently used with BPF helpers. Only those
constants that are used/useful from BPF program side are converted.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200303003233.3496043-2-andriin@fb.com
2020-03-04 17:00:05 +01:00
Willem de Bruijn
b0ac4941aa bpf: Sync uapi bpf.h to tools/
sync tools/include/uapi/linux/bpf.h to match include/uapi/linux/bpf.h

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200303200503.226217-3-willemdebruijn.kernel@gmail.com
2020-03-03 16:23:59 -08:00
David S. Miller
b105e8e281 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-02-21

The following pull-request contains BPF updates for your *net-next* tree.

We've added 25 non-merge commits during the last 4 day(s) which contain
a total of 33 files changed, 2433 insertions(+), 161 deletions(-).

The main changes are:

1) Allow for adding TCP listen sockets into sock_map/hash so they can be used
   with reuseport BPF programs, from Jakub Sitnicki.

2) Add a new bpf_program__set_attach_target() helper for adding libbpf support
   to specify the tracepoint/function dynamically, from Eelco Chaudron.

3) Add bpf_read_branch_records() BPF helper which helps use cases like profile
   guided optimizations, from Daniel Xu.

4) Enable bpf_perf_event_read_value() in all tracing programs, from Song Liu.

5) Relax BTF mandatory check if only used for libbpf itself e.g. to process
   BTF defined maps, from Andrii Nakryiko.

6) Move BPF selftests -mcpu compilation attribute from 'probe' to 'v3' as it has
   been observed that former fails in envs with low memlock, from Yonghong Song.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-21 15:22:45 -08:00
Linus Torvalds
3dc55dba67 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Limit xt_hashlimit hash table size to avoid OOM or hung tasks, from
    Cong Wang.

 2) Fix deadlock in xsk by publishing global consumer pointers when NAPI
    is finished, from Magnus Karlsson.

 3) Set table field properly to RT_TABLE_COMPAT when necessary, from
    Jethro Beekman.

 4) NLA_STRING attributes are not necessary NULL terminated, deal wiht
    that in IFLA_ALT_IFNAME. From Eric Dumazet.

 5) Fix checksum handling in atlantic driver, from Dmitry Bezrukov.

 6) Handle mtu==0 devices properly in wireguard, from Jason A.
    Donenfeld.

 7) Fix several lockdep warnings in bonding, from Taehee Yoo.

 8) Fix cls_flower port blocking, from Jason Baron.

 9) Sanitize internal map names in libbpf, from Toke Høiland-Jørgensen.

10) Fix RDMA race in qede driver, from Michal Kalderon.

11) Fix several false lockdep warnings by adding conditions to
    list_for_each_entry_rcu(), from Madhuparna Bhowmik.

12) Fix sleep in atomic in mlx5 driver, from Huy Nguyen.

13) Fix potential deadlock in bpf_map_do_batch(), from Yonghong Song.

14) Hey, variables declared in switch statement before any case
    statements are not initialized. I learn something every day. Get
    rids of this stuff in several parts of the networking, from Kees
    Cook.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (99 commits)
  bnxt_en: Issue PCIe FLR in kdump kernel to cleanup pending DMAs.
  bnxt_en: Improve device shutdown method.
  net: netlink: cap max groups which will be considered in netlink_bind()
  net: thunderx: workaround BGX TX Underflow issue
  ionic: fix fw_status read
  net: disable BRIDGE_NETFILTER by default
  net: macb: Properly handle phylink on at91rm9200
  s390/qeth: fix off-by-one in RX copybreak check
  s390/qeth: don't warn for napi with 0 budget
  s390/qeth: vnicc Fix EOPNOTSUPP precedence
  openvswitch: Distribute switch variables for initialization
  net: ip6_gre: Distribute switch variables for initialization
  net: core: Distribute switch variables for initialization
  udp: rehash on disconnect
  net/tls: Fix to avoid gettig invalid tls record
  bpf: Fix a potential deadlock with bpf_map_do_batch
  bpf: Do not grab the bucket spinlock by default on htab batch ops
  ice: Wait for VF to be reset/ready before configuration
  ice: Don't tell the OS that link is going down
  ice: Don't reject odd values of usecs set by user
  ...
2020-02-21 11:59:51 -08:00
Daniel Xu
67306f84ca selftests/bpf: Add bpf_read_branch_records() selftest
Add a selftest to test:

* default bpf_read_branch_records() behavior
* BPF_F_GET_BRANCH_RECORDS_SIZE flag behavior
* error path on non branch record perf events
* using helper to write to stack
* using helper to write to global

On host with hardware counter support:

    # ./test_progs -t perf_branches
    #27/1 perf_branches_hw:OK
    #27/2 perf_branches_no_hw:OK
    #27 perf_branches:OK
    Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED

On host without hardware counter support (VM):

    # ./test_progs -t perf_branches
    #27/1 perf_branches_hw:OK
    #27/2 perf_branches_no_hw:OK
    #27 perf_branches:OK
    Summary: 1/2 PASSED, 1 SKIPPED, 0 FAILED

Also sync tools/include/uapi/linux/bpf.h.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200218030432.4600-3-dxu@dxuuu.xyz
2020-02-19 15:01:07 -08:00
Toke Høiland-Jørgensen
f25975f42f bpf, uapi: Remove text about bpf_redirect_map() giving higher performance
The performance of bpf_redirect() is now roughly the same as that of
bpf_redirect_map(). However, David Ahern pointed out that the header file
has not been updated to reflect this, and still says that a significant
performance increase is possible when using bpf_redirect_map(). Remove this
text from the bpf_redirect_map() description, and reword the description in
bpf_redirect() slightly. Also fix the 'Return' section of the
bpf_redirect_map() documentation.

Fixes: 1d233886dd ("xdp: Use bulking for non-map XDP_REDIRECT and consolidate code paths")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200218130334.29889-1-toke@redhat.com
2020-02-18 15:31:31 +01:00
Arnaldo Carvalho de Melo
2a8d017d46 tools headers kvm: Sync linux/kvm.h with the kernel sources
To pick up the changes from:

  7de3f1423f ("KVM: s390: Add new reset vcpu API")

So far we're ignoring those arch specific ioctls, we need to revisit
this at some time to have arch specific tables, etc:

  $ grep S390 tools/perf/trace/beauty/kvm_ioctl.sh
  	egrep -v " ((ARM|PPC|S390)_|[GS]ET_(DEBUGREGS|PIT2|XSAVE|TSC_KHZ)|CREATE_SPAPR_TCE_64)" | \
  $

This addresses these tools/perf build warnings:

  Warning: Kernel ABI header at 'tools/arch/arm/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm/include/uapi/asm/kvm.h'
  diff -u tools/arch/arm/include/uapi/asm/kvm.h arch/arm/include/uapi/asm/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-12 12:45:24 -03:00
Arnaldo Carvalho de Melo
8c65582f82 tools include UAPI: Sync sound/asound.h copy
Picking the changes from:

  46b770f720 ("ALSA: uapi: Fix sparse warning")
  a103a39899 ("ALSA: control: Fix incompatible protocol error")
  bd3eb4e87e ("ALSA: ctl: bump protocol version up to v2.1.0")
  ff16351e3f ("ALSA: ctl: remove dimen member from elem_info structure")
  5422835666 ("ALSA: ctl: remove unused macro for timestamping of elem_value")
  7fd7d6c504 ("ALSA: uapi: Fix typos and header inclusion in asound.h")
  1cfaef9617 ("ALSA: bump uapi version numbers")
  80fe7430c7 ("ALSA: add new 32-bit layout for snd_pcm_mmap_status/control")
  07094ae6f9 ("ALSA: Avoid using timespec for struct snd_timer_tread")
  d9e5582c4b ("ALSA: Avoid using timespec for struct snd_rawmidi_status")
  3ddee7f88a ("ALSA: Avoid using timespec for struct snd_pcm_status")
  a4e7dd35b9 ("ALSA: Avoid using timespec for struct snd_ctl_elem_value")
  a07804cc74 ("ALSA: Avoid using timespec for struct snd_timer_status")

Which entails no changes in the tooling side.

To silence this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
  diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baolin Wang <baolin.wang@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-12 11:04:23 -03:00
Arnaldo Carvalho de Melo
47f8d94ac5 tools headers UAPI: Sync asm-generic/mman-common.h with the kernel
To pick the changes from:

  d41938d2cb ("mm: Reserve asm-generic prot flags 0x10 and 0x20 for arch use")

No changes in tooling, just a rebuild as files needed got touched.

This addresses the following perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/mman-common.h' differs from latest version at 'include/uapi/asm-generic/mman-common.h'
  diff -u tools/include/uapi/asm-generic/mman-common.h include/uapi/asm-generic/mman-common.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-12 10:53:06 -03:00
Arnaldo Carvalho de Melo
365f9cc195 tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
To pick the change in:

  cc662126b4 ("drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET")

That don't result in any changes in tooling, just silences this perf
build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
  diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h

Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-12 10:25:27 -03:00
Arnaldo Carvalho de Melo
f65b9dba57 tools headers uapi: Sync linux/fscrypt.h with the kernel sources
To pick the changes from:

  e933adde6f ("fscrypt: include <linux/ioctl.h> in UAPI header")
  93edd392ca ("fscrypt: support passing a keyring key to FS_IOC_ADD_ENCRYPTION_KEY")

That don't trigger any changes in tooling.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/fscrypt.h' differs from latest version at 'include/uapi/linux/fscrypt.h'
  diff -u tools/include/uapi/linux/fscrypt.h include/uapi/linux/fscrypt.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-12 10:13:25 -03:00
Arnaldo Carvalho de Melo
d6d829d92c tools headers UAPI: Sync sched.h with the kernel
To get the changes in:

  769071ac9f ("ns: Introduce Time Namespace")

Silencing this tools/perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/sched.h' differs from latest version at 'include/uapi/linux/sched.h'
  diff -u tools/include/uapi/linux/sched.h include/uapi/linux/sched.h

Which enables 'perf trace' to decode the CLONE_NEWTIME bit in the
'flags' argument to the clone syscalls.

Example of clone flags being decoded:

  [root@quaco ~]# perf trace -e clone*
       0.000 qemu-system-x8/23923 clone(clone_flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, newsp: 0x7f0dad7f9870, parent_tidptr: 0x7f0dad7fa9d0, child_tidptr: 0x7f0dad7fa9d0, tls: 0x7f0dad7fa700) = 6806 (qemu-system-x86)
           ? qemu-system-x8/6806  ... [continued]: clone())              = 0
  ^C[root@quaco ~]#

At some point this should enable things like:

  # perf trace -e 'clone*/clone_flags&NEWTIME/'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrei Vagin <avagin@openvz.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-11 16:41:50 -03:00
Arnaldo Carvalho de Melo
fc9199d46e tools headers UAPI: Sync prctl.h with the kernel sources
To get the changes in:

  8d19f1c8e1 ("prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim")

Which ends up having this effect in tooling, i.e. the addition of the
support to those prctl's options:

  $ tools/perf/trace/beauty/prctl_option.sh > before
  $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
  $ git diff
  diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h
  index 7da1b37b27aa..07b4f8131e36 100644
  --- a/tools/include/uapi/linux/prctl.h
  +++ b/tools/include/uapi/linux/prctl.h
  @@ -234,4 +234,8 @@ struct prctl_mm_map {
   #define PR_GET_TAGGED_ADDR_CTRL                56
   # define PR_TAGGED_ADDR_ENABLE         (1UL << 0)

  +/* Control reclaim behavior when allocating memory */
  +#define PR_SET_IO_FLUSHER              57
  +#define PR_GET_IO_FLUSHER              58
  +
   #endif /* _LINUX_PRCTL_H */
  $ tools/perf/trace/beauty/prctl_option.sh > after
  $ diff -u before after
  --- before	2020-02-11 15:24:35.339289912 -0300
  +++ after	2020-02-11 15:24:56.319711315 -0300
  @@ -51,6 +51,8 @@
   	[54] = "PAC_RESET_KEYS",
   	[55] = "SET_TAGGED_ADDR_CTRL",
   	[56] = "GET_TAGGED_ADDR_CTRL",
  +	[57] = "SET_IO_FLUSHER",
  +	[58] = "GET_IO_FLUSHER",
   };
   static const char *prctl_set_mm_options[] = {
   	[1] = "START_CODE",
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-11 16:41:50 -03:00
Arnaldo Carvalho de Melo
c452833387 tools include UAPI: Sync x86's syscalls_64.tbl, generic unistd.h and fcntl.h to pick up openat2 and pidfd_getfd
fddb5d430a ("open: introduce openat2(2) syscall")
  9a2cef09c8 ("arch: wire up pidfd_getfd syscall")

We also need to grab a copy of uapi/linux/openat2.h since it is now
needed by fcntl.h, add it to tools/perf/check_headers.h.

  $ diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
  --- tools/perf/arch/x86/entry/syscalls/syscall_64.tbl	2019-12-20 16:43:57.662429958 -0300
  +++ arch/x86/entry/syscalls/syscall_64.tbl	2020-02-10 16:36:22.070012468 -0300
  @@ -357,6 +357,8 @@
   433	common	fspick			__x64_sys_fspick
   434	common	pidfd_open		__x64_sys_pidfd_open
   435	common	clone3			__x64_sys_clone3/ptregs
  +437	common	openat2			__x64_sys_openat2
  +438	common	pidfd_getfd		__x64_sys_pidfd_getfd

   #
   # x32-specific system call numbers start at 512 to avoid cache impact
  $

Update tools/'s copy of that file:

  $ cp arch/x86/entry/syscalls/syscall_64.tbl tools/perf/arch/x86/entry/syscalls/syscall_64.tbl

See the result:

  $ diff -u /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c
  --- /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before	2020-02-10 16:42:59.010636041 -0300
  +++ /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c	2020-02-10 16:43:24.149958337 -0300
  @@ -346,5 +346,7 @@
   	[433] = "fspick",
   	[434] = "pidfd_open",
   	[435] = "clone3",
  +	[437] = "openat2",
  +	[438] = "pidfd_getfd",
   };
  -#define SYSCALLTBL_x86_64_MAX_ID 435
  +#define SYSCALLTBL_x86_64_MAX_ID 438
  $

Now one can use:

  perf trace -e openat2,pidfd_getfd

To get just those syscalls or use in things like:

  perf trace -e open*

To get all the open variant (open, openat, openat2, etc) or:

  perf trace pidfd*

To get the pidfd syscalls.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-02-11 16:41:06 -03:00
Yury Norov
0bddc1bd05 bitops: more BITS_TO_* macros
Introduce BITS_TO_U64, BITS_TO_U32 and BITS_TO_BYTES as they are handy in
the following patches (BITS_TO_U32 specifically).  Reimplement tools/
version of the macros according to the kernel implementation.

Also fix indentation for BITS_PER_TYPE definition.

Link: http://lkml.kernel.org/r/20200102043031.30357-3-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Amritha Nambiar <amritha.nambiar@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: "Tobin C . Harding" <tobin@kernel.org>
Cc: Vineet Gupta <vineet.gupta1@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-04 03:05:26 +00:00
Linus Torvalds
bd2463ac7d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:

 1) Add WireGuard

 2) Add HE and TWT support to ath11k driver, from John Crispin.

 3) Add ESP in TCP encapsulation support, from Sabrina Dubroca.

 4) Add variable window congestion control to TIPC, from Jon Maloy.

 5) Add BCM84881 PHY driver, from Russell King.

 6) Start adding netlink support for ethtool operations, from Michal
    Kubecek.

 7) Add XDP drop and TX action support to ena driver, from Sameeh
    Jubran.

 8) Add new ipv4 route notifications so that mlxsw driver does not have
    to handle identical routes itself. From Ido Schimmel.

 9) Add BPF dynamic program extensions, from Alexei Starovoitov.

10) Support RX and TX timestamping in igc, from Vinicius Costa Gomes.

11) Add support for macsec HW offloading, from Antoine Tenart.

12) Add initial support for MPTCP protocol, from Christoph Paasch,
    Matthieu Baerts, Florian Westphal, Peter Krystad, and many others.

13) Add Octeontx2 PF support, from Sunil Goutham, Geetha sowjanya, Linu
    Cherian, and others.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1469 commits)
  net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROC
  udp: segment looped gso packets correctly
  netem: change mailing list
  qed: FW 8.42.2.0 debug features
  qed: rt init valid initialization changed
  qed: Debug feature: ilt and mdump
  qed: FW 8.42.2.0 Add fw overlay feature
  qed: FW 8.42.2.0 HSI changes
  qed: FW 8.42.2.0 iscsi/fcoe changes
  qed: Add abstraction for different hsi values per chip
  qed: FW 8.42.2.0 Additional ll2 type
  qed: Use dmae to write to widebus registers in fw_funcs
  qed: FW 8.42.2.0 Parser offsets modified
  qed: FW 8.42.2.0 Queue Manager changes
  qed: FW 8.42.2.0 Expose new registers and change windows
  qed: FW 8.42.2.0 Internal ram offsets modifications
  MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driver
  Documentation: net: octeontx2: Add RVU HW and drivers overview
  octeontx2-pf: ethtool RSS config support
  octeontx2-pf: Add basic ethtool support
  ...
2020-01-28 16:02:33 -08:00
David S. Miller
954b3c4397 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-01-22

The following pull-request contains BPF updates for your *net-next* tree.

We've added 92 non-merge commits during the last 16 day(s) which contain
a total of 320 files changed, 7532 insertions(+), 1448 deletions(-).

The main changes are:

1) function by function verification and program extensions from Alexei.

2) massive cleanup of selftests/bpf from Toke and Andrii.

3) batched bpf map operations from Brian and Yonghong.

4) tcp congestion control in bpf from Martin.

5) bulking for non-map xdp_redirect form Toke.

6) bpf_send_signal_thread helper from Yonghong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-23 08:10:16 +01:00
Martin KaFai Lau
0a49c1a8e2 bpf: Sync uapi bpf.h to tools/
This patch sync uapi bpf.h to tools/.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200122233652.903348-1-kafai@fb.com
2020-01-22 16:30:10 -08:00
Alexei Starovoitov
2db6eab18b libbpf: Add support for program extensions
Add minimal support for program extensions. bpf_object_open_opts() needs to be
called with attach_prog_fd = target_prog_fd and BPF program extension needs to
have in .c file section definition like SEC("freplace/func_to_be_replaced").
libbpf will search for "func_to_be_replaced" in the target_prog_fd's BTF and
will pass it in attach_btf_id to the kernel. This approach works for tests, but
more compex use case may need to request function name (and attach_btf_id that
kernel sees) to be more dynamic. Such API will be added in future patches.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200121005348.2769920-3-ast@kernel.org
2020-01-22 23:04:53 +01:00
Yonghong Song
a1e3a3b8ba tools/bpf: Sync uapi header bpf.h
sync uapi header include/uapi/linux/bpf.h to
tools/include/uapi/linux/bpf.h

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-7-brianvv@google.com
2020-01-15 14:00:35 -08:00
Yonghong Song
8482941f09 bpf: Add bpf_send_signal_thread() helper
Commit 8b401f9ed2 ("bpf: implement bpf_send_signal() helper")
added helper bpf_send_signal() which permits bpf program to
send a signal to the current process. The signal may be
delivered to any threads in the process.

We found a use case where sending the signal to the current
thread is more preferable.
  - A bpf program will collect the stack trace and then
    send signal to the user application.
  - The user application will add some thread specific
    information to the just collected stack trace for
    later analysis.

If bpf_send_signal() is used, user application will need
to check whether the thread receiving the signal matches
the thread collecting the stack by checking thread id.
If not, it will need to send signal to another thread
through pthread_kill().

This patch proposed a new helper bpf_send_signal_thread(),
which sends the signal to the thread corresponding to
the current kernel task. This way, user space is guaranteed that
bpf_program execution context and user space signal handling
context are the same thread.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115035002.602336-1-yhs@fb.com
2020-01-15 11:44:51 -08:00
Antoine Tenart
76564261a7 net: macsec: introduce the macsec_context structure
This patch introduces the macsec_context structure. It will be used
in the kernel to exchange information between the common MACsec
implementation (macsec.c) and the MACsec hardware offloading
implementations. This structure contains pointers to MACsec specific
structures which contain the actual MACsec configuration, and to the
underlying device (phydev for now).

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-14 11:31:41 -08:00
Andrii Nakryiko
533420a415 tools: Sync uapi/linux/if_link.h
Sync uapi/linux/if_link.h into tools to avoid out of sync warnings during
libbpf build.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200113073143.1779940-2-andriin@fb.com
2020-01-13 17:48:12 -08:00
Alexei Starovoitov
2d3eb67f64 libbpf: Sanitize global functions
In case the kernel doesn't support BTF_FUNC_GLOBAL sanitize BTF produced by the
compiler for global functions.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200110064124.1760511-2-ast@kernel.org
2020-01-10 17:20:07 +01:00
Mat Martineau
faf391c382 tcp: Define IPPROTO_MPTCP
To open a MPTCP socket with socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP),
IPPROTO_MPTCP needs a value that differs from IPPROTO_TCP. The existing
IPPROTO numbers mostly map directly to IANA-specified protocol numbers.
MPTCP does not have a protocol number allocated because MPTCP packets
use the TCP protocol number. Use private number not used OTA.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09 18:41:41 -08:00
Andrey Ignatov
f5bfcd953d bpf: Document BPF_F_QUERY_EFFECTIVE flag
Document BPF_F_QUERY_EFFECTIVE flag, mostly to clarify how it affects
attach_flags what may not be obvious and what may lead to confision.

Specifically attach_flags is returned only for target_fd but if programs
are inherited from an ancestor cgroup then returned attach_flags for
current cgroup may be confusing. For example, two effective programs of
same attach_type can be returned but w/o BPF_F_ALLOW_MULTI in
attach_flags.

Simple repro:
  # bpftool c s /sys/fs/cgroup/path/to/task
  ID       AttachType      AttachFlags     Name
  # bpftool c s /sys/fs/cgroup/path/to/task effective
  ID       AttachType      AttachFlags     Name
  95043    ingress                         tw_ipt_ingress
  95048    ingress                         tw_ingress

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200108014006.938363-1-rdna@fb.com
2020-01-09 09:40:06 -08:00
Martin KaFai Lau
17328d618c bpf: Synch uapi bpf.h to tools/
This patch sync uapi bpf.h to tools/

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200109003512.3856559-1-kafai@fb.com
2020-01-09 08:46:18 -08:00
Vitaly Chikunov
6c4798d3f0 tools lib: Fix builds when glibc contains strlcpy()
Disable a couple of compilation warnings (which are treated as errors)
on strlcpy() definition and declaration, allowing users to compile perf
and kernel (objtool) when:

1. glibc have strlcpy() (such as in ALT Linux since 2004) objtool and
   perf build fails with this (in gcc):

  In file included from exec-cmd.c:3:
  tools/include/linux/string.h:20:15: error: redundant redeclaration of ‘strlcpy’ [-Werror=redundant-decls]
     20 | extern size_t strlcpy(char *dest, const char *src, size_t size);

2. clang ignores `-Wredundant-decls', but produces another warning when
   building perf:

    CC       util/string.o
  ../lib/string.c:99:8: error: attribute declaration must precede definition [-Werror,-Wignored-attributes]
  size_t __weak strlcpy(char *dest, const char *src, size_t size)
  ../../tools/include/linux/compiler.h:66:34: note: expanded from macro '__weak'
  # define __weak                 __attribute__((weak))
  /usr/include/bits/string_fortified.h:151:8: note: previous definition is here
  __NTH (strlcpy (char *__restrict __dest, const char *__restrict __src,

Committer notes:

The

 #pragma GCC diagnostic

directive was introduced in gcc 4.6, so check for that as well.

Fixes: ce99091 ("perf tools: Move strlcpy() from perf to tools/lib/string.c")
Fixes: 0215d59 ("tools lib: Reinstate strlcpy() header guard with __UCLIBC__")
Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=118481
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Reviewed-by: Dmitry Levin <ldv@altlinux.org>
Cc: Dmitry Levin <ldv@altlinux.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: kbuild test robot <lkp@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Cc: Vineet Gupta <vineet.gupta1@synopsys.com>
Link: http://lore.kernel.org/lkml/20191224172029.19690-1-vt@altlinux.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 11:46:10 -03:00
Alexey Budankov
8812ad412f tools bitmap: Implement bitmap_equal() operation at bitmap API
Extend tools bitmap API with bitmap_equal() implementation.

The implementation has been derived from the kernel.

Extend tools bitmap API with bitmap_free() implementation for symmetry
with bitmap_alloc() function.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/43757993-0b28-d8af-a6c7-ede12e3a6877@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-01-06 11:46:04 -03:00
David S. Miller
2bbc078f81 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-12-27

The following pull-request contains BPF updates for your *net-next* tree.

We've added 127 non-merge commits during the last 17 day(s) which contain
a total of 110 files changed, 6901 insertions(+), 2721 deletions(-).

There are three merge conflicts. Conflicts and resolution looks as follows:

1) Merge conflict in net/bpf/test_run.c:

There was a tree-wide cleanup c593642c8b ("treewide: Use sizeof_field() macro")
which gets in the way with b590cb5f80 ("bpf: Switch to offsetofend in
BPF_PROG_TEST_RUN"):

  <<<<<<< HEAD
          if (!range_is_zero(__skb, offsetof(struct __sk_buff, priority) +
                             sizeof_field(struct __sk_buff, priority),
  =======
          if (!range_is_zero(__skb, offsetofend(struct __sk_buff, priority),
  >>>>>>> 7c8dce4b16

There are a few occasions that look similar to this. Always take the chunk with
offsetofend(). Note that there is one where the fields differ in here:

  <<<<<<< HEAD
          if (!range_is_zero(__skb, offsetof(struct __sk_buff, tstamp) +
                             sizeof_field(struct __sk_buff, tstamp),
  =======
          if (!range_is_zero(__skb, offsetofend(struct __sk_buff, gso_segs),
  >>>>>>> 7c8dce4b16

Just take the one with offsetofend() /and/ gso_segs. Latter is correct due to
850a88cc40 ("bpf: Expose __sk_buff wire_len/gso_segs to BPF_PROG_TEST_RUN").

2) Merge conflict in arch/riscv/net/bpf_jit_comp.c:

(I'm keeping Bjorn in Cc here for a double-check in case I got it wrong.)

  <<<<<<< HEAD
          if (is_13b_check(off, insn))
                  return -1;
          emit(rv_blt(tcc, RV_REG_ZERO, off >> 1), ctx);
  =======
          emit_branch(BPF_JSLT, RV_REG_T1, RV_REG_ZERO, off, ctx);
  >>>>>>> 7c8dce4b16

Result should look like:

          emit_branch(BPF_JSLT, tcc, RV_REG_ZERO, off, ctx);

3) Merge conflict in arch/riscv/include/asm/pgtable.h:

  <<<<<<< HEAD
  =======
  #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
  #define VMALLOC_END      (PAGE_OFFSET - 1)
  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)

  #define BPF_JIT_REGION_SIZE     (SZ_128M)
  #define BPF_JIT_REGION_START    (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
  #define BPF_JIT_REGION_END      (VMALLOC_END)

  /*
   * Roughly size the vmemmap space to be large enough to fit enough
   * struct pages to map half the virtual address space. Then
   * position vmemmap directly below the VMALLOC region.
   */
  #define VMEMMAP_SHIFT \
          (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
  #define VMEMMAP_SIZE    BIT(VMEMMAP_SHIFT)
  #define VMEMMAP_END     (VMALLOC_START - 1)
  #define VMEMMAP_START   (VMALLOC_START - VMEMMAP_SIZE)

  #define vmemmap         ((struct page *)VMEMMAP_START)

  >>>>>>> 7c8dce4b16

Only take the BPF_* defines from there and move them higher up in the
same file. Remove the rest from the chunk. The VMALLOC_* etc defines
got moved via 01f52e16b8 ("riscv: define vmemmap before pfn_to_page
calls"). Result:

  [...]
  #define __S101  PAGE_READ_EXEC
  #define __S110  PAGE_SHARED_EXEC
  #define __S111  PAGE_SHARED_EXEC

  #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
  #define VMALLOC_END      (PAGE_OFFSET - 1)
  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)

  #define BPF_JIT_REGION_SIZE     (SZ_128M)
  #define BPF_JIT_REGION_START    (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
  #define BPF_JIT_REGION_END      (VMALLOC_END)

  /*
   * Roughly size the vmemmap space to be large enough to fit enough
   * struct pages to map half the virtual address space. Then
   * position vmemmap directly below the VMALLOC region.
   */
  #define VMEMMAP_SHIFT \
          (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
  #define VMEMMAP_SIZE    BIT(VMEMMAP_SHIFT)
  #define VMEMMAP_END     (VMALLOC_START - 1)
  #define VMEMMAP_START   (VMALLOC_START - VMEMMAP_SIZE)

  [...]

Let me know if there are any other issues.

Anyway, the main changes are:

1) Extend bpftool to produce a struct (aka "skeleton") tailored and specific
   to a provided BPF object file. This provides an alternative, simplified API
   compared to standard libbpf interaction. Also, add libbpf extern variable
   resolution for .kconfig section to import Kconfig data, from Andrii Nakryiko.

2) Add BPF dispatcher for XDP which is a mechanism to avoid indirect calls by
   generating a branch funnel as discussed back in bpfconf'19 at LSF/MM. Also,
   add various BPF riscv JIT improvements, from Björn Töpel.

3) Extend bpftool to allow matching BPF programs and maps by name,
   from Paul Chaignon.

4) Support for replacing cgroup BPF programs attached with BPF_F_ALLOW_MULTI
   flag for allowing updates without service interruption, from Andrey Ignatov.

5) Cleanup and simplification of ring access functions for AF_XDP with a
   bonus of 0-5% performance improvement, from Magnus Karlsson.

6) Enable BPF JITs for x86-64 and arm64 by default. Also, final version of
   audit support for BPF, from Daniel Borkmann and latter with Jiri Olsa.

7) Move and extend test_select_reuseport into BPF program tests under
   BPF selftests, from Jakub Sitnicki.

8) Various BPF sample improvements for xdpsock for customizing parameters
   to set up and benchmark AF_XDP, from Jay Jayatheerthan.

9) Improve libbpf to provide a ulimit hint on permission denied errors.
   Also change XDP sample programs to attach in driver mode by default,
   from Toke Høiland-Jørgensen.

10) Extend BPF test infrastructure to allow changing skb mark from tc BPF
    programs, from Nikita V. Shirokov.

11) Optimize prologue code sequence in BPF arm32 JIT, from Russell King.

12) Fix xdp_redirect_cpu BPF sample to manually attach to tracepoints after
    libbpf conversion, from Jesper Dangaard Brouer.

13) Minor misc improvements from various others.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-27 14:20:10 -08:00
Andrey Ignatov
7dd68b3279 bpf: Support replacing cgroup-bpf program in MULTI mode
The common use-case in production is to have multiple cgroup-bpf
programs per attach type that cover multiple use-cases. Such programs
are attached with BPF_F_ALLOW_MULTI and can be maintained by different
people.

Order of programs usually matters, for example imagine two egress
programs: the first one drops packets and the second one counts packets.
If they're swapped the result of counting program will be different.

It brings operational challenges with updating cgroup-bpf program(s)
attached with BPF_F_ALLOW_MULTI since there is no way to replace a
program:

* One way to update is to detach all programs first and then attach the
  new version(s) again in the right order. This introduces an
  interruption in the work a program is doing and may not be acceptable
  (e.g. if it's egress firewall);

* Another way is attach the new version of a program first and only then
  detach the old version. This introduces the time interval when two
  versions of same program are working, what may not be acceptable if a
  program is not idempotent. It also imposes additional burden on
  program developers to make sure that two versions of their program can
  co-exist.

Solve the problem by introducing a "replace" mode in BPF_PROG_ATTACH
command for cgroup-bpf programs being attached with BPF_F_ALLOW_MULTI
flag. This mode is enabled by newly introduced BPF_F_REPLACE attach flag
and bpf_attr.replace_bpf_fd attribute to pass fd of the old program to
replace

That way user can replace any program among those attached with
BPF_F_ALLOW_MULTI flag without the problems described above.

Details of the new API:

* If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid
  descriptor of BPF program, BPF_PROG_ATTACH will return corresponding
  error (EINVAL or EBADF).

* If replace_bpf_fd has valid descriptor of BPF program but such a
  program is not attached to specified cgroup, BPF_PROG_ATTACH will
  return ENOENT.

BPF_F_REPLACE is introduced to make the user intent clear, since
replace_bpf_fd alone can't be used for this (its default value, 0, is a
valid fd). BPF_F_REPLACE also makes it possible to extend the API in the
future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed).

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Narkyiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com
2019-12-19 21:22:25 -08:00
Björn Töpel
eb9928bed0 riscv, bpf: Add missing uapi header for BPF_PROG_TYPE_PERF_EVENT programs
Add missing uapi header the BPF_PROG_TYPE_PERF_EVENT programs by
exporting struct user_regs_struct instead of struct pt_regs which is
in-kernel only.

Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191216091343.23260-9-bjorn.topel@gmail.com
2019-12-19 16:03:31 +01:00
Andrii Nakryiko
166750bc1d libbpf: Support libbpf-provided extern variables
Add support for extern variables, provided to BPF program by libbpf. Currently
the following extern variables are supported:
  - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
    executing, follows KERNEL_VERSION() macro convention, can be 4- and 8-byte
    long;
  - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
    boolean, strings, and integer values are supported.

Set of possible values is determined by declared type of extern variable.
Supported types of variables are:
- Tristate values. Are represented as `enum libbpf_tristate`. Accepted values
  are **strictly** 'y', 'n', or 'm', which are represented as TRI_YES, TRI_NO,
  or TRI_MODULE, respectively.
- Boolean values. Are represented as bool (_Bool) types. Accepted values are
  'y' and 'n' only, turning into true/false values, respectively.
- Single-character values. Can be used both as a substritute for
  bool/tristate, or as a small-range integer:
  - 'y'/'n'/'m' are represented as is, as characters 'y', 'n', or 'm';
  - integers in a range [-128, 127] or [0, 255] (depending on signedness of
    char in target architecture) are recognized and represented with
    respective values of char type.
- Strings. String values are declared as fixed-length char arrays. String of
  up to that length will be accepted and put in first N bytes of char array,
  with the rest of bytes zeroed out. If config string value is longer than
  space alloted, it will be truncated and warning message emitted. Char array
  is always zero terminated. String literals in config have to be enclosed in
  double quotes, just like C-style string literals.
- Integers. 8-, 16-, 32-, and 64-bit integers are supported, both signed and
  unsigned variants. Libbpf enforces parsed config value to be in the
  supported range of corresponding integer type. Integers values in config can
  be:
  - decimal integers, with optional + and - signs;
  - hexadecimal integers, prefixed with 0x or 0X;
  - octal integers, starting with 0.

Config file itself is searched in /boot/config-$(uname -r) location with
fallback to /proc/config.gz, unless config path is specified explicitly
through bpf_object_open_opts' kernel_config_path option. Both gzipped and
plain text formats are supported. Libbpf adds explicit dependency on zlib
because of this, but this shouldn't be a problem, given libelf already depends
on zlib.

All detected extern variables, are put into a separate .extern internal map.
It, similarly to .rodata map, is marked as read-only from BPF program side, as
well as is frozen on load. This allows BPF verifier to track extern values as
constants and perform enhanced branch prediction and dead code elimination.
This can be relied upon for doing kernel version/feature detection and using
potentially unsupported field relocations or BPF helpers in a CO-RE-based BPF
program, while still having a single version of BPF program running on old and
new kernels. Selftests are validating this explicitly for unexisting BPF
helper.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-3-andriin@fb.com
2019-12-15 16:41:12 -08:00
Arnaldo Carvalho de Melo
b444268801 tools headers kvm: Sync linux/kvm.h with the kernel sources
To pick up the changes from:

  22945688ac ("KVM: PPC: Book3S HV: Support reset of secure guest")

No tools changes are caused by this, as the only defines so far used
from these files are for syscall arg pretty printing are:

  $ grep KVM tools/perf/trace/beauty/*.sh
  tools/perf/trace/beauty/kvm_ioctl.sh:regex='^#[[:space:]]*define[[:space:]]+KVM_(\w+)[[:space:]]+_IO[RW]*\([[:space:]]*KVMIO[[:space:]]*,[[:space:]]*(0x[[:xdigit:]]+).*'
  $

This addresses these tools/perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Link: https://lkml.kernel.org/n/tip-bdbe4x02johhul05a03o27zj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-11 10:08:35 -03:00
Arnaldo Carvalho de Melo
fd9bee5e24 tools headers UAPI: Update tools's copy of drm.h headers
Picking the changes from:

  2093dea3de ("drm/syncobj: extend syncobj query ability v3")

Which doesn't affect tooling, just silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h'
  diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Chunming Zhou <david1.zhou@amd.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-t1xqmjffo4rxdw395dsnu34j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-04 16:22:38 -03:00
Arnaldo Carvalho de Melo
0b3fca6ad3 tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
To pick the change in:

  a0e047156c ("drm/i915/gem: Make context persistence optional")
  9cd20ef780 ("drm/i915/perf: allow holding preemption on filtered ctx")
  7831e9a965 ("drm/i915/perf: Allow dynamic reconfiguration of the OA stream")
  4f6ccc74a8 ("drm/i915: add support for perf configuration queries")
  b8d49f28aa ("drm/i915/perf: introduce a versioning of the i915-perf uapi")
  601734f7aa ("drm/i915/tgl: s/ss/eu fuse reading support")

That don't result in any changes in tooling, just silences this perf
build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
  diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-qwzjrgwj55y3g6rjdf9spkpr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-04 16:22:28 -03:00
Arnaldo Carvalho de Melo
2603a4903b tools headers UAPI: Sync sched.h with the kernel
To get the changes in:

  0acefef584 ("Merge tag 'threads-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux")
  49cb2fc42c ("fork: extend clone3() to support setting a PID")
  fa729c4df5 ("clone3: validate stack arguments")
  b612e5df45 ("clone3: add CLONE_CLEAR_SIGHAND")

This file gets rebuilt, but no changes ensues:

   CC       /tmp/build/perf/trace/beauty/clone.o

This addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/sched.h' differs from latest version at 'include/uapi/linux/sched.h'
  diff -u tools/include/uapi/linux/sched.h include/uapi/linux/sched.

The CLONE_CLEAR_SIGHAND one will be used in tools/perf/trace/beauty/clone.c
in a followup patch to show that string when this bit is set in the
syscall arg. Keeping a copy of this file allows us to build this in
older systems and have the binary support printing that flag whenever
that system gets its kernel updated to one where this feature is
present.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Reber <areber@redhat.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-nprqsvvzbhzoy64cbvos6c5b@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-02 12:56:39 -03:00
Arnaldo Carvalho de Melo
1fc3d0ee24 tools headers kvm: Sync kvm headers with the kernel sources
To pick up the changes from:

  14edff8831 Merge tag 'kvmarm-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
  a4b28f5c67 Merge remote-tracking branch 'kvmarm/kvm-arm64/stolen-time' into kvmarm-master/next
  58772e9a3d ("KVM: arm64: Provide VCPU attributes for stolen time")
  da345174ce ("KVM: arm/arm64: Allow user injection of external data aborts")
  c726200dd1 ("KVM: arm/arm64: Allow reporting non-ISV data aborts to userspace")
  efe5ddcae4 ("KVM: PPC: Book3S HV: XIVE: Allow userspace to set the # of VPs")

No tools changes are caused by this, as the only defines so far used
from these files are for syscall arg pretty printing are:

  $ grep KVM tools/perf/trace/beauty/*.sh
  tools/perf/trace/beauty/kvm_ioctl.sh:regex='^#[[:space:]]*define[[:space:]]+KVM_(\w+)[[:space:]]+_IO[RW]*\([[:space:]]*KVMIO[[:space:]]*,[[:space:]]*(0x[[:xdigit:]]+).*'
  $

Some are also include by:

  tools/perf/arch/x86/util/kvm-stat.c
  tools/perf/arch/powerpc/util/kvm-stat.c

This addresses these tools/perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
  Warning: Kernel ABI header at 'tools/arch/powerpc/include/uapi/asm/kvm.h' differs from latest version at 'arch/powerpc/include/uapi/asm/kvm.h'
  diff -u tools/arch/powerpc/include/uapi/asm/kvm.h arch/powerpc/include/uapi/asm/kvm.h
  Warning: Kernel ABI header at 'tools/arch/arm/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm/include/uapi/asm/kvm.h'
  diff -u tools/arch/arm/include/uapi/asm/kvm.h arch/arm/include/uapi/asm/kvm.h
  Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
  diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Fabiano Rosas <farosas@linux.ibm.com>
Cc: Greg Kurz <groug@kaod.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Steven Price <steven.price@arm.com>
Link: https://lkml.kernel.org/n/tip-qrjdudhq25mk5bfnhveofbm4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-02 12:54:13 -03:00
Arnaldo Carvalho de Melo
c66f2566db tools headers uapi: Sync linux/stat.h with the kernel sources
To pick the changes from:

  3ad2522c64 ("statx: define STATX_ATTR_VERITY")

That don't trigger any changes in tooling.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/stat.h' differs from latest version at 'include/uapi/linux/stat.h'
  diff -u tools/include/uapi/linux/stat.h include/uapi/linux/stat.h

At some point we wi'll beautify structs passed in pointers to syscalls
and then we'll need to have tables for these defines, for now update the
file to silence the warning as this file is used for doing this type of
number -> string translations for other defines found in these file.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-thcy60dpry5qrpn7nmc58bwg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-02 12:24:52 -03:00
Arnaldo Carvalho de Melo
ad46f35cca tools headers uapi: Sync linux/fscrypt.h with the kernel sources
To pick the changes from:

  b103fb7653 ("fscrypt: add support for IV_INO_LBLK_64 policies")

That don't trigger any changes in tooling.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/fscrypt.h' differs from latest version at 'include/uapi/linux/fscrypt.h'
  diff -u tools/include/uapi/linux/fscrypt.h include/uapi/linux/fscrypt.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-cgfz3ffe07pw2m8hmstvkudl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-12-02 12:19:24 -03:00
Linus Torvalds
3f59dbcace Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "The main kernel side changes in this cycle were:

   - Various Intel-PT updates and optimizations (Alexander Shishkin)

   - Prohibit kprobes on Xen/KVM emulate prefixes (Masami Hiramatsu)

   - Add support for LSM and SELinux checks to control access to the
     perf syscall (Joel Fernandes)

   - Misc other changes, optimizations, fixes and cleanups - see the
     shortlog for details.

  There were numerous tooling changes as well - 254 non-merge commits.
  Here are the main changes - too many to list in detail:

   - Enhancements to core tooling infrastructure, perf.data, libperf,
     libtraceevent, event parsing, vendor events, Intel PT, callchains,
     BPF support and instruction decoding.

   - There were updates to the following tools:

        perf annotate
        perf diff
        perf inject
        perf kvm
        perf list
        perf maps
        perf parse
        perf probe
        perf record
        perf report
        perf script
        perf stat
        perf test
        perf trace

   - And a lot of other changes: please see the shortlog and Git log for
     more details"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (279 commits)
  perf parse: Fix potential memory leak when handling tracepoint errors
  perf probe: Fix spelling mistake "addrees" -> "address"
  libtraceevent: Fix memory leakage in copy_filter_type
  libtraceevent: Fix header installation
  perf intel-bts: Does not support AUX area sampling
  perf intel-pt: Add support for decoding AUX area samples
  perf intel-pt: Add support for recording AUX area samples
  perf pmu: When using default config, record which bits of config were changed by the user
  perf auxtrace: Add support for queuing AUX area samples
  perf session: Add facility to peek at all events
  perf auxtrace: Add support for dumping AUX area samples
  perf inject: Cut AUX area samples
  perf record: Add aux-sample-size config term
  perf record: Add support for AUX area sampling
  perf auxtrace: Add support for AUX area sample recording
  perf auxtrace: Move perf_evsel__find_pmu()
  perf record: Add a function to test for kernel support for AUX area sampling
  perf tools: Add kernel AUX area sampling definitions
  perf/core: Make the mlock accounting simple again
  perf report: Jump to symbol source view from total cycles view
  ...
2019-11-26 15:04:47 -08:00