Commit Graph

15630 Commits

Author SHA1 Message Date
Alexei Starovoitov
4f7b3e8258 bpf: improve verifier branch analysis
pathological bpf programs may try to force verifier to explode in
the number of branch states:
  20: (d5) if r1 s<= 0x24000028 goto pc+0
  21: (b5) if r0 <= 0xe1fa20 goto pc+2
  22: (d5) if r1 s<= 0x7e goto pc+0
  23: (b5) if r0 <= 0xe880e000 goto pc+0
  24: (c5) if r0 s< 0x2100ecf4 goto pc+0
  25: (d5) if r1 s<= 0xe880e000 goto pc+1
  26: (c5) if r0 s< 0xf4041810 goto pc+0
  27: (d5) if r1 s<= 0x1e007e goto pc+0
  28: (b5) if r0 <= 0xe86be000 goto pc+0
  29: (07) r0 += 16614
  30: (c5) if r0 s< 0x6d0020da goto pc+0
  31: (35) if r0 >= 0x2100ecf4 goto pc+0

Teach verifier to recognize always taken and always not taken branches.
This analysis is already done for == and != comparison.
Expand it to all other branches.

It also helps real bpf programs to be verified faster:
                       before  after
bpf_lb-DLB_L3.o         2003    1940
bpf_lb-DLB_L4.o         3173    3089
bpf_lb-DUNKNOWN.o       1080    1065
bpf_lxc-DDROP_ALL.o     29584   28052
bpf_lxc-DUNKNOWN.o      36916   35487
bpf_netdev.o            11188   10864
bpf_overlay.o           6679    6643
bpf_lcx_jit.o           39555   38437

Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-04 17:22:02 +01:00
Joe Stringer
d74286d2c2 bpf: Improve socket lookup reuseport documentation
Improve the wording around socket lookup for reuseport sockets, and
ensure that both bpf.h headers are in sync.

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30 17:17:38 -08:00
Joe Stringer
f71c6143c2 bpf: Support sk lookup in netns with id 0
David Ahern and Nicolas Dichtel report that the handling of the netns id
0 is incorrect for the BPF socket lookup helpers: rather than finding
the netns with id 0, it is resolving to the current netns. This renders
the netns_id 0 inaccessible.

To fix this, adjust the API for the netns to treat all negative s32
values as a lookup in the current netns (including u64 values which when
truncated to s32 become negative), while any values with a positive
value in the signed 32-bit integer space would result in a lookup for a
socket in the netns corresponding to that id. As before, if the netns
with that ID does not exist, no socket will be found. Any netns outside
of these ranges will fail to find a corresponding socket, as those
values are reserved for future usage.

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Joey Pabalinas <joeypabalinas@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30 17:17:38 -08:00
Daniel Borkmann
b7df9ada9a bpf: fix pointer offsets in context for 32 bit
Currently, pointer offsets in three BPF context structures are
broken in two scenarios: i) 32 bit compiled applications running
on 64 bit kernels, and ii) LLVM compiled BPF programs running
on 32 bit kernels. The latter is due to BPF target machine being
strictly 64 bit. So in each of the cases the offsets will mismatch
in verifier when checking / rewriting context access. Fix this by
providing a helper macro __bpf_md_ptr() that will enforce padding
up to 64 bit and proper alignment, and for context access a macro
bpf_ctx_range_ptr() which will cover full 64 bit member range on
32 bit archs. For flow_keys, we additionally need to force the
size check to sizeof(__u64) as with other pointer types.

Fixes: d58e468b11 ("flow_dissector: implements flow dissector BPF hook")
Fixes: 4f738adba3 ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
Fixes: 2dbb9b9e6d ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT")
Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David S. Miller <davem@davemloft.net>
Tested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30 17:04:35 -08:00
David Miller
c01ac66b38 bpf: Fix verifier log string check for bad alignment.
The message got changed a lot time ago.

This was responsible for 36 test case failures on sparc64.

Fixes: f1174f77b5 ("bpf/verifier: rework value tracking")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30 14:43:45 -08:00
Yonghong Song
528bff0cdb tools: bpftool: fix a bitfield pretty print issue
Commit b12d6ec097 ("bpf: btf: add btf print functionality")
added btf pretty print functionality to bpftool.
There is a problem though in printing a bitfield whose type
has modifiers.

For example, for a type like
  typedef int ___int;
  struct tmp_t {
          int a:3;
          ___int b:3;
  };
Suppose we have a map
  struct bpf_map_def SEC("maps") tmpmap = {
          .type = BPF_MAP_TYPE_HASH,
          .key_size = sizeof(__u32),
          .value_size = sizeof(struct tmp_t),
          .max_entries = 1,
  };
and the hash table is populated with one element with
key 0 and value (.a = 1 and .b = 2).

In BTF, the struct member "b" will have a type "typedef" which
points to an int type. The current implementation does not
pass the bit offset during transition from typedef to int type,
hence incorrectly print the value as
  $ bpftool m d id 79
  [{
          "key": 0,
          "value": {
              "a": 0x1,
              "b": 0x1
          }
      }
  ]

This patch fixed the issue by carrying bit_offset along the type
chain during bit_field print. The correct result can be printed as
  $ bpftool m d id 76
  [{
          "key": 0,
          "value": {
              "a": 0x1,
              "b": 0x2
          }
      }
  ]

The kernel pretty print is implemented correctly and does not
have this issue.

Fixes: b12d6ec097 ("bpf: btf: add btf print functionality")
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-28 16:40:02 -08:00
Yonghong Song
d08489125e tools/bpf: add addition type tests to test_btf
The following additional unit testcases are added to test_btf:
...
BTF raw test[42] (typedef (invalid name, name_off = 0)): OK
BTF raw test[43] (typedef (invalid name, invalid identifier)): OK
BTF raw test[44] (ptr type (invalid name, name_off <> 0)): OK
BTF raw test[45] (volatile type (invalid name, name_off <> 0)): OK
BTF raw test[46] (const type (invalid name, name_off <> 0)): OK
BTF raw test[47] (restrict type (invalid name, name_off <> 0)): OK
BTF raw test[48] (fwd type (invalid name, name_off = 0)): OK
BTF raw test[49] (fwd type (invalid name, invalid identifier)): OK
BTF raw test[50] (array type (invalid name, name_off <> 0)): OK
BTF raw test[51] (struct type (name_off = 0)): OK
BTF raw test[52] (struct type (invalid name, invalid identifier)): OK
BTF raw test[53] (struct member (name_off = 0)): OK
BTF raw test[54] (struct member (invalid name, invalid identifier)): OK
BTF raw test[55] (enum type (name_off = 0)): OK
BTF raw test[56] (enum type (invalid name, invalid identifier)): OK
BTF raw test[57] (enum member (invalid name, name_off = 0)): OK
BTF raw test[58] (enum member (invalid name, invalid identifier)): OK
...

Fixes: c0fa1b6c3e ("bpf: btf: Add BTF tests")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-28 16:03:05 -08:00
Martin KaFai Lau
8800cd031a tools/bpf: fix two test_btf unit test cases
There are two unit test cases, which should encode
TYPEDEF type, but instead encode PTR type.
The error is flagged out after enforcing name
checking in the previous patch.

Fixes: c0fa1b6c3e ("bpf: btf: Add BTF tests")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-28 16:03:05 -08:00
David S. Miller
e9d8faf93d Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Disable BH while holding list spinlock in nf_conncount, from
   Taehee Yoo.

2) List corruption in nf_conncount, also from Taehee.

3) Fix race that results in leaving around an empty list node in
   nf_conncount, from Taehee Yoo.

4) Proper chain handling for inactive chains from the commit path,
   from Florian Westphal. This includes a selftest for this.

5) Do duplicate rule handles when replacing rules, also from Florian.

6) Remove net_exit path in xt_RATEEST that results in splat, from Taehee.

7) Possible use-after-free in nft_compat when releasing extensions.
   From Florian.

8) Memory leak in xt_hashlimit, from Taehee.

9) Call ip_vs_dst_notifier after ipv6_dev_notf, from Xin Long.

10) Fix cttimeout with udplite and gre, from Florian.

11) Preserve oif for IPv6 link-local generated traffic from mangle
    table, from Alin Nastac.

12) Missing error handling in masquerade notifiers, from Taehee Yoo.

13) Use mutex to protect registration/unregistration of masquerade
    extensions in order to prevent a race, from Taehee.

14) Incorrect condition check in tree_nodes_free(), also from Taehee.

15) Fix chain counter leak in rule replacement path, from Taehee.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-28 11:02:45 -08:00
David S. Miller
6950012742 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2018-11-25

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix an off-by-one bug when adjusting subprog start offsets after
   patching, from Edward.

2) Fix several bugs such as overflow in size allocation in queue /
   stack map creation, from Alexei.

3) Fix wrong IPv6 destination port byte order in bpf_sk_lookup_udp
   helper, from Andrey.

4) Fix several bugs in bpftool such as preventing an infinite loop
   in get_fdinfo, error handling and man page references, from Quentin.

5) Fix a warning in bpf_trace_printk() that wasn't catching an
   invalid format string, from Martynas.

6) Fix a bug in BPF cgroup local storage where non-atomic allocation
   was used in atomic context, from Roman.

7) Fix a NULL pointer dereference bug in bpftool from reallocarray()
   error handling, from Jakub and Wen.

8) Add a copy of pkt_cls.h and tc_bpf.h uapi headers to the tools
   include infrastructure so that bpftool compiles on older RHEL7-like
   user space which does not ship these headers, from Yonghong.

9) Fix BPF kselftests for user space where to get ping test working
   with ping6 and ping -6, from Li.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-25 20:04:58 -08:00
Linus Torvalds
b88af99487 Power management fixes for 4.20-rc4
- Fix tasks freezer deadlock in de_thread() that occurs if one
    of its sub-threads has been frozen already (Chanho Min).
 
  - Avoid registering a platform device by the ti-cpufreq driver
    on platforms that cannot use it (Dave Gerlach).
 
  - Fix a mistake in the ti-opp-supply operating performance points
    (OPP) driver that caused an incorrect reference voltage to be
    used and make it adjust the minimum voltage dynamically to avoid
    hangs or crashes in some cases (Keerthy).
 
  - Fix issues related to compiler flags in the cpupower utility
    and correct a linking problem in it by renaming a file with
    a duplicate name (Jiri Olsa, Konstantin Khlebnikov).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJb99ANAAoJEILEb/54YlRxUk4QAIkpXmLdEm0zvQXJZKJxuxu/
 8eiglPcoCAAVgR3uRjezTyQeFCDjqGIqmpiipFOkxgAaeb+aiFta2B88B4oZ3OAf
 G5yJG4CMs5+Tp/oH2+McX4PMo2WfoEIDoOGVOdtlk4iGAIh9tT5KvNfCUyxHLP7c
 cFjdeUQuc8PRPzutESxYHB23FmECCEprVmEFVckQ7vSCnWX6dm1mFteCiNZADH8/
 YNgRWRYtyWXlPRNCPJuCvYmVHLgm0Tw6CVhG4ttXT9wIkYyiLK9Flx9X6gsBWdoS
 ej1jbJM7R/EAUzHvCjUCSfMKDNWvQySR3gC2m0vG5Goext2UXWieSHaI6BfqjPP2
 wTBX9lAKu3iIGISoorjJZAMe4v26Tpbs0v5ApsOGr50WNZKZtRkBwaJ0EFT2hEa8
 MsDtUjr+d7CIQ+ZDHmAbOzghpDlSuhGypfkD7IPtA/AY1dy1wJ3BUYya8ucqSdr3
 iKkQcndN3ASR1+Fyb3c1cflh9fWe84aeSmYPihcX2m+/D43keYXbj2ANdCH1dF3E
 cwAXw9+VteUQ1tihqgJapFogK3VphMbRErPSGXryuyiMxr38g7tgHAd2rId0JG4g
 QftqpZa19aEed8tJNaQWWjBaBFgYeAT9BQqcP6CjFOO9klPH//NSiqzQUFMdzsEc
 Qql1V/aoOIsflb7gJH3/
 =JQxr
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix two issues in the Operating Performance Points (OPP)
  framework, one cpufreq driver issue, one problem related to the tasks
  freezer and a few build-related issues in the cpupower utility.

  Specifics:

   - Fix tasks freezer deadlock in de_thread() that occurs if one of its
     sub-threads has been frozen already (Chanho Min).

   - Avoid registering a platform device by the ti-cpufreq driver on
     platforms that cannot use it (Dave Gerlach).

   - Fix a mistake in the ti-opp-supply operating performance points
     (OPP) driver that caused an incorrect reference voltage to be used
     and make it adjust the minimum voltage dynamically to avoid hangs
     or crashes in some cases (Keerthy).

   - Fix issues related to compiler flags in the cpupower utility and
     correct a linking problem in it by renaming a file with a duplicate
     name (Jiri Olsa, Konstantin Khlebnikov)"

* tag 'pm-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  exec: make de_thread() freezable
  cpufreq: ti-cpufreq: Only register platform_device when supported
  opp: ti-opp-supply: Correct the supply in _get_optimal_vdd_voltage call
  opp: ti-opp-supply: Dynamically update u_volt_min
  tools cpupower: Override CFLAGS assignments
  tools cpupower debug: Allow to use outside build flags
  tools/power/cpupower: fix compilation with STATIC=true
2018-11-23 10:52:57 -08:00
Jakub Kicinski
dde7011a82 tools: bpftool: fix potential NULL pointer dereference in do_load
This patch fixes a possible null pointer dereference in
do_load, detected by the semantic patch deref_null.cocci,
with the following warning:

./tools/bpf/bpftool/prog.c:1021:23-25: ERROR: map_replace is NULL but dereferenced.

The following code has potential null pointer references:
881             map_replace = reallocarray(map_replace, old_map_fds + 1,
882                                        sizeof(*map_replace));
883             if (!map_replace) {
884                     p_err("mem alloc failed");
885                     goto err_free_reuse_maps;
886             }

...
1019 err_free_reuse_maps:
1020         for (i = 0; i < old_map_fds; i++)
1021                 close(map_replace[i].fd);
1022         free(map_replace);

Fixes: 3ff5a4dc5d ("tools: bpftool: allow reuse of maps with bpftool prog load")
Co-developed-by: Wen Yang <wen.yang99@zte.com.cn>
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-22 00:45:51 +01:00
Rafael J. Wysocki
0db699f747 linux-cpupower-4.20-rc4
This cpupower update for Linux 4.20-rc4 consists of compile fixes to allow
 use of outside build flags and override of CFLAGS from Jiri Olsa, and fix
 to compilation with STATIC=true from Konstantin Khlebnikov.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAlv0cXUACgkQCwJExA0N
 Qxwjxg/8C9QnM5XrjEqlNz4ESwZ17RWEEZaGcteaUG3yFYtHMYnqzjzDJxW8yjMj
 XKI/v1IcMNxODoW61mKJ8QPCT4wOfL4Wi7uBHk48SzNCRXCNevFxOENXfDRCHYyH
 0dKakVLRclR79n/TbH64WNRVUg4rkaZrpQGxNenDA2LJA4UW/ReUU2Dd8Qbyc6+p
 1NA2wKONpLjoxSeyVjxu4Zz8mSucxLrTEzE7kElmqN0ZB6G5HE2yijCwBoTi9p95
 mWqxRLDWKxnTZ5MDlS661RBw3tshTa2rtkv2QwijI23Pned5Z9imXDfZ8aadvntJ
 8YJdScmhN53yJjRLf8idOchN24qI5RgHHyjvDjNJBrG85oFYuv2bDJSqjZUOX62V
 oXhRp9bjcw9Frhe70/+yKN/EfXKEaKqXlpMiuBraPh8e+UQwxZU/iOHbGogQyo5Z
 ot7fRJqiU8Cx7HplGwf3LO6sEXO7eOImBTVtYB3y4ctI5++ce/JACSDj7+RRE1/u
 VW2x40uzE0lwVDNxJslhNKVPqu/kG6HhVXURxsWpqiw22OI/DQXrt17nbTcFPt1j
 SDbxx2T+FgNPOjN69YwFzL7HjAqf6fdVwTQbfOiTA3ODA/i7nY9x/Y9FKqhctPj/
 0afbO0s2plqQ0dgYhJdu6tDEqdHycvXSsW6zcBqw0mNIyw+QkOI=
 =eU6f
 -----END PGP SIGNATURE-----

Merge tag 'linux-cpupower-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux

Pull cpupower utility updates for 4.20-rc4 from Shuah Khan:

"This cpupower update for Linux 4.20-rc4 consists of compile fixes to allow
 use of outside build flags and override of CFLAGS from Jiri Olsa, and fix
 to compilation with STATIC=true from Konstantin Khlebnikov."

* tag 'linux-cpupower-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
  tools cpupower: Override CFLAGS assignments
  tools cpupower debug: Allow to use outside build flags
  tools/power/cpupower: fix compilation with STATIC=true
2018-11-21 13:33:06 +01:00
Linus Torvalds
f2ce1065e7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix some potentially uninitialized variables and use-after-free in
    kvaser_usb can drier, from Jimmy Assarsson.

 2) Fix leaks in qed driver, from Denis Bolotin.

 3) Socket leak in l2tp, from Xin Long.

 4) RSS context allocation fix in bnxt_en from Michael Chan.

 5) Fix cxgb4 build errors, from Ganesh Goudar.

 6) Route leaks in ipv6 when removing exceptions, from Xin Long.

 7) Memory leak in IDR allocation handling of act_pedit, from Davide
    Caratti.

 8) Use-after-free of bridge vlan stats, from Nikolay Aleksandrov.

 9) When MTU is locked, do not force DF bit on ipv4 tunnels. From
    Sabrina Dubroca.

10) When NAPI cached skb is reused, we must set it to the proper initial
    state which includes skb->pkt_type. From Eric Dumazet.

11) Lockdep and non-linear SKB handling fix in tipc from Jon Maloy.

12) Set RX queue properly in various tuntap receive paths, from Matthew
    Cover.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
  tuntap: fix multiqueue rx
  ipv6: Fix PMTU updates for UDP/raw sockets in presence of VRF
  tipc: don't assume linear buffer when reading ancillary data
  tipc: fix lockdep warning when reinitilaizing sockets
  net-gro: reset skb->pkt_type in napi_reuse_skb()
  tc-testing: tdc.py: Guard against lack of returncode in executed command
  tc-testing: tdc.py: ignore errors when decoding stdout/stderr
  ip_tunnel: don't force DF when MTU is locked
  MAINTAINERS: Add entry for CAKE qdisc
  net: bridge: fix vlan stats use-after-free on destruction
  socket: do a generic_file_splice_read when proto_ops has no splice_read
  net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs
  Revert "net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs"
  net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs
  net/sched: act_pedit: fix memory leak when IDR allocation fails
  net: lantiq: Fix returned value in case of error in 'xrx200_probe()'
  ipv6: fix a dst leak when removing its exception
  net: mvneta: Don't advertise 2.5G modes
  drivers/net/ethernet/qlogic/qed/qed_rdma.h: fix typo
  net/mlx4: Fix UBSAN warning of signed integer overflow
  ...
2018-11-19 09:24:04 -08:00
Linus Torvalds
25e19c1fe4 libnvdimm 4.20-rc3
- Address Range Scrub overflow continuation handling has been broken
   since it was initially merged. It was only recently that error injection
   and platform-BIOS support enabled this corner case to be exercised.
 
 - The recent attempt to provide more isolation for the kernel Address
   Range Scrub state machine from userapace initiated sessions triggers a
   lockdep report. Revert and try again at the next merge window.
 
 - Fix a kasan reported buffer overflow in libnvdimm unit test
   infrastrucutre (nfit_test)
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJb8MdUAAoJEB7SkWpmfYgCifYP/A+OQ19HybqcY2nfvqXUdQum
 Q5x3qcKmGmEbKbnCUMOHZJpEjW4c/Cpm6OKhuFDQJ4tijn1XG3/ATSi7PXrZxs/o
 CRK8MIg5Wz/mMvYRvypIkCHHHr9+Y1NjqmQynM4LLzNG24GMXaeHHuZUTnrZCDmu
 0+jBTylNgVYdykoIxgHDYDB+cd6w4NtAP5OD9D46pdsmzX9ac+OQyZMyNB3glUhd
 /ZFAoywVNfvfJVWEci9RoHiKttWxgVoCuNbSlCs2Y6ymepA44ApR9AgLHtaC9pFO
 DrPkfCzPSmf4PVSxLJd79+/sw9YOcBD7LZ5IxzozxRMuRn5pIofdZIsBg9PlwT5B
 NL9jQK87XPiG0vNxhJu3wzP+FlyCXxGxkWfApp7w4rlWBV7RgugOZHyH051rdKzQ
 44JAPzLLCfA5Mj4o2tIbSx42f2JNX93XDEX8fkUB+qs3GzyOcMtlcmz9UjmnrT0R
 o9KHKhDn81Vivxh33Ts2G0iHktO83XSUBDWApSd6erjEUXMsCLY0D8y+nDGTOMUh
 kVcY8q93sgZGLVbcxt0eGc8Q7osZYawQGRGucflTETFcxNwMyLL4F9lWgPirGeYF
 i1JDWeTrhcImYufNj8o78LsbT5xh6YjbZZ8Q1obIgPXpDtxHNIXO6COId49Zp2cK
 obftWyVp+7kYe79NWzmD
 =sfNx
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:
 "A small batch of fixes for v4.20-rc3.

  The overflow continuation fix addresses something that has been broken
  for several releases. Arguably it could wait even longer, but it's a
  one line fix and this finishes the last of the known address range
  scrub bug reports. The revert addresses a lockdep regression. The unit
  tests are not critical to fix, but no reason to hold this fix back.

  Summary:

   - Address Range Scrub overflow continuation handling has been broken
     since it was initially merged. It was only recently that error
     injection and platform-BIOS support enabled this corner case to be
     exercised.

   - The recent attempt to provide more isolation for the kernel Address
     Range Scrub state machine from userapace initiated sessions
     triggers a lockdep report. Revert and try again at the next merge
     window.

   - Fix a kasan reported buffer overflow in libnvdimm unit test
     infrastrucutre (nfit_test)"

* tag 'libnvdimm-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  Revert "acpi, nfit: Further restrict userspace ARS start requests"
  acpi, nfit: Fix ARS overflow continuation
  tools/testing/nvdimm: Fix the array size for dimm devices.
2018-11-18 12:21:09 -08:00
Brenda J. Butler
c6cecf4ae4 tc-testing: tdc.py: Guard against lack of returncode in executed command
Add some defensive coding in case one of the subprocesses created by tdc
returns nothing. If no object is returned from exec_cmd, then tdc will
halt with an unhandled exception.

Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Signed-off-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17 21:54:53 -08:00
Lucas Bates
5aaf642852 tc-testing: tdc.py: ignore errors when decoding stdout/stderr
Prevent exceptions from being raised while decoding output
from an executed command. There is no impact on tdc's
execution and the verify command phase would fail the pattern
match.

Signed-off-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17 21:54:53 -08:00
Edward Cree
afd5942408 bpf: fix off-by-one error in adjust_subprog_starts
When patching in a new sequence for the first insn of a subprog, the start
 of that subprog does not change (it's the first insn of the sequence), so
 adjust_subprog_starts should check start <= off (rather than < off).
Also added a test to test_verifier.c (it's essentially the syz reproducer).

Fixes: cc8b0b92a1 ("bpf: introduce function calls (function boundaries)")
Reported-by: syzbot+4fc427c7af994b0948be@syzkaller.appspotmail.com
Signed-off-by: Edward Cree <ecree@solarflare.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-16 21:10:00 -08:00
Linus Torvalds
ef268de197 powerpc fixes for 4.20 #3
Two weeks worth of fixes since rc1.
 
  - I broke 16-byte alignment of the stack when we moved PPR into pt_regs.
    Despite being required by the ABI this broke almost nothing, we eventually
    hit it in code where GCC does arithmetic on the stack pointer assuming the
    bottom 4 bits are clear. Fix it by padding the in-kernel pt_regs by 8 bytes.
 
  - A couple of commits fixing minor bugs in the recent SLB rewrite.
 
  - A build fix related to tracepoints in KVM in some configurations.
 
  - Our old "IO workarounds" code written for Cell couldn't coexist in a kernel
    that runs on Power9 with the Radix MMU, fix that.
 
  - Remove the NPU DMA ops, these just printed a warning and should never have
    been called.
 
  - Suppress an overly chatty message triggered by CPU hotplug in some configs.
 
  - Two small selftest fixes.
 
 Thanks to:
   Alistair Popple, Gustavo Romero, Nicholas Piggin, Satheesh Rajendran, Scott Wood.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJb7qyjAAoJEFHr6jzI4aWA73kP/3x+vghoHthxrazbN+9Z0bWQ
 +c5fHobHLuREjzLhy73lbCE9NOhZTlmdgfAB/MRWW9aVIzHViuhUjRLpLqw0+LBA
 mWIrVloJeMcbupE+zFnc6qh1WY/YZ8lsPZxmb5YSqDSxtcdh8JzDK+RgWn9XkiFa
 sjppaZoLLf/Wxz4VT4v75o8WXEFavpbEaS2PLdWhwT1//H4QpKYWY80tPCijdRhp
 0susCzObBfdwxS4qlwmLBmCxbGhqLzBg1vnPPGq6GypRELIeqR+jHWOjzYmmvQRh
 hLffVTaHIVFgO9c4ruCFmMsJCA1hf186w/62IHXLgOfp7eQJYPQYCn7uVVTWoVDC
 5hpPR71xOkovLJbipk07lshj/kVJQVapCbyGtOz8DKgQnAWcSq23HMuJBHwSEnKH
 xtIk6iupik+7gWjDDY0Yz0xiCmZLRS5heWNAJgXpPFNMSR47EkmU4bZpMrWwbetf
 CashFhbYcFzpPaP7bHgxLk79fhUitkwCFFlSQK3Hj4bog+U2KmKnKgLphn8If8jy
 iHuggxvxCwDvfn+d9X29VJu7Bxz4M5l0ouEp0+hWAockGCsaDxXNuzJNMk1V5GyO
 Ytg6UtGBxLwKBKLqNeXmuY0/CD1lALU5mD/KSIKDjm7skM7T0U3s8T+7/GpFqQ7J
 uQ9aA49aWy4kUy+xCay5
 =fEqA
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.20-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Two weeks worth of fixes since rc1.

   - I broke 16-byte alignment of the stack when we moved PPR into
     pt_regs. Despite being required by the ABI this broke almost
     nothing, we eventually hit it in code where GCC does arithmetic on
     the stack pointer assuming the bottom 4 bits are clear. Fix it by
     padding the in-kernel pt_regs by 8 bytes.

   - A couple of commits fixing minor bugs in the recent SLB rewrite.

   - A build fix related to tracepoints in KVM in some configurations.

   - Our old "IO workarounds" code written for Cell couldn't coexist in
     a kernel that runs on Power9 with the Radix MMU, fix that.

   - Remove the NPU DMA ops, these just printed a warning and should
     never have been called.

   - Suppress an overly chatty message triggered by CPU hotplug in some
     configs.

   - Two small selftest fixes.

  Thanks to: Alistair Popple, Gustavo Romero, Nicholas Piggin, Satheesh
  Rajendran, Scott Wood"

* tag 'powerpc-4.20-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  selftests/powerpc: Adjust wild_bctr to build with old binutils
  powerpc/64: Fix kernel stack 16-byte alignment
  powerpc/numa: Suppress "VPHN is not supported" messages
  selftests/powerpc: Fix wild_bctr test to work on ppc64
  powerpc/io: Fix the IO workarounds code to work with Radix
  powerpc/mm/64s: Fix preempt warning in slb_allocate_kernel()
  KVM: PPC: Move and undef TRACE_INCLUDE_PATH/FILE
  powerpc/mm/64s: Only use slbfee on CPUs that support it
  powerpc/mm/64s: Use PPC_SLBFEE macro
  powerpc/mm/64s: Consolidate SLB assertions
  powerpc/powernv/npu: Remove NPU DMA ops
2018-11-16 10:14:54 -06:00
Gustavo Romero
b2fed34a62 selftests/powerpc: Adjust wild_bctr to build with old binutils
Currently the selftest wild_bctr can fail to build when an old gcc is
used, notably on gcc using a binutils version <= 2.27, because the
assembler does not support the integer suffix UL.

This patch adjusts the wild_bctr test so the REG_POISON value is still
treated as an unsigned long for the shifts on compilation but the UL
suffix is absent on the stringification, so the inline asm code
generated has no UL suffixes.

Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
[mpe: Wrap long line]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-15 23:05:17 +11:00
Florian Westphal
25d8bcedbf selftests: add script to stress-test nft packet path vs. control plane
Start flood ping for each cpu while loading/flushing rulesets to make
sure we do not access already-free'd rules from nf_tables evaluation loop.

Also add this to TARGETS so 'make run_tests' in selftest dir runs it
automatically.

This would have caught the bug fixed in previous change
("netfilter: nf_tables: do not skip inactive chains during generation update")
sooner.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-11-12 16:13:35 +01:00
Michael Ellerman
2c7645b0f7 selftests/powerpc: Fix wild_bctr test to work on ppc64
The selftest I recently added to test branching to an out-of-bounds
NIP doesn't work on 64-bit big endian. It does fail but not in the
right way. That is it SEGVs trying to load from the opd at BAD_NIP,
but it never gets as far as branching to BAD_NIP.

To fix it we need to create an opd which is reachable but which holds
the bad address.

Fixes: b7683fc66e ("selftests/powerpc: Add a test of wild bctr")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-12 14:47:54 +11:00
Li Zhijian
da85d8bfd1 kselftests/bpf: use ping6 as the default ipv6 ping binary when it exists
At commit deee2cae27 ("kselftests/bpf: use ping6 as the default ipv6 ping
binary if it exists"), it fixed similar issues for shell script, but it
missed a same issue in the C code.

Fixes: 371e4fcc9d ("selftests/bpf: cgroup local storage-based network counters")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
CC: Philip Li <philip.li@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09 10:55:09 +01:00
Quentin Monnet
f98e46a251 tools: bpftool: update references to other man pages in documentation
Update references to other bpftool man pages at the bottom of each
manual page. Also reference the "bpf(2)" and "bpf-helpers(7)" man pages.

References are sorted by number of man section, then by
"prog-and-map-go-first", the other pages in alphabetical order.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09 08:20:52 +01:00
Quentin Monnet
f120919f99 tools: bpftool: pass an argument to silence open_obj_pinned()
Function open_obj_pinned() prints error messages when it fails to open a
link in the BPF virtual file system. However, in some occasions it is
not desirable to print an error, for example when we parse all links
under the bpffs root, and the error is due to some paths actually being
symbolic links.

Example output:

    # ls -l /sys/fs/bpf/
    lrwxrwxrwx 1 root root 0 Oct 18 19:00 ip -> /sys/fs/bpf/tc/
    drwx------ 3 root root 0 Oct 18 19:00 tc
    lrwxrwxrwx 1 root root 0 Oct 18 19:00 xdp -> /sys/fs/bpf/tc/

    # bpftool --bpffs prog show
    Error: bpf obj get (/sys/fs/bpf): Permission denied
    Error: bpf obj get (/sys/fs/bpf): Permission denied

    # strace -e bpf bpftool --bpffs prog show
    bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/ip", bpf_fd=0}, 72) = -1 EACCES (Permission denied)
    Error: bpf obj get (/sys/fs/bpf): Permission denied
    bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/xdp", bpf_fd=0}, 72) = -1 EACCES (Permission denied)
    Error: bpf obj get (/sys/fs/bpf): Permission denied
    ...

To fix it, pass a bool as a second argument to the function, and prevent
it from printing an error when the argument is set to true.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09 08:20:52 +01:00
Quentin Monnet
a8bfd2bc29 tools: bpftool: fix plain output and doc for --bpffs option
Edit the documentation of the -f|--bpffs option to make it explicit that
it dumps paths of pinned programs when bpftool is used to list the
programs only, so that users do not believe they will see the name of
the newly pinned program with "bpftool prog pin" or "bpftool prog load".

Also fix the plain output: do not add a blank line after each program
block, in order to remain consistent with what bpftool does when the
option is not passed.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09 08:20:52 +01:00
Quentin Monnet
53909030aa tools: bpftool: prevent infinite loop in get_fdinfo()
Function getline() returns -1 on failure to read a line, thus creating
an infinite loop in get_fdinfo() if the key is not found. Fix it by
calling the function only as long as we get a strictly positive return
value.

Found by copying the code for a key which is not always present...

Fixes: 71bb428fe2 ("tools: bpf: add bpftool")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09 08:20:52 +01:00
Yonghong Song
49a249c387 tools/bpftool: copy a few net uapi headers to tools directory
Commit f6f3bac08f ("tools/bpf: bpftool: add net support")
added certain networking support to bpftool.
The implementation relies on a relatively recent uapi header file
linux/tc_act/tc_bpf.h on the host which contains the marco
definition of TCA_ACT_BPF_ID.

Unfortunately, this is not the case for all distributions.
See the email message below where rhel-7.2 does not have
an up-to-date linux/tc_act/tc_bpf.h.
  https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1799211.html
Further investigation found that linux/pkt_cls.h is also needed for macro
TCA_BPF_TAG.

This patch fixed the issue by copying linux/tc_act/tc_bpf.h
and linux/pkt_cls.h from kernel include/uapi directory to
tools/include/uapi directory so building the bpftool does not depend
on host system for these files.

Fixes: f6f3bac08f ("tools/bpf: bpftool: add net support")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Cc: Li Zhijian <zhijianx.li@intel.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09 08:18:31 +01:00
Ingo Molnar
45fd808091 perf/urgent improvements and fixes:
Intel PT sql viewer: (Adrian Hunter)
 
 - Fall back to /usr/local/lib/libxed.so
 - Add Selected branches report
 - Add help window
 - Fix table find when table re-ordered
 
 Intel PT debug log (Adrian Hunter)
 
 - Add more event information
 - Add MTC and CYC timestamps
 
 perf record: (Andi Kleen)
 
 - Support weak groups, just like with 'perf stat'
 
 perf trace: (Arnaldo Carvalho de Melo)
 
 - Start augmenting raw_syscalls:{sys_enter,sys_exit}: goal is to have a
   generic, arch independent eBPF kernel component that is programmed with
   syscall table details, what to copy, how many bytes, pid, arg filters from the
   userspace via eBPF maps by the 'perf trace' tool that continues to use all its
   argument beautifiers, just taking advantage of the extra pointer contents.
 
 JVMTI: (Gustavo Romero)
 
 - Fix undefined symbol scnprintf in libperf-jvmti.so
 
 perf top: (Jin Yao)
 
 - Display the LBR stats in callchain entries
 
 perf stat: (Thomas Richter)
 
 - Handle different PMU names with common prefix
 
 arm64: Will (Deacon)
 
 - Fix arm64 tools build failure wrt smp_load_{acquire,release}.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCW+GBMAAKCRCyPKLppCJ+
 J5hwAP9+7F2HKvjwHj4g6YeAvCp2WzXbO9UzakfTNtkAwWDZHwD/aN8T8RdgiaCm
 FqlDoftwvSQSpbKvaiN7M1GSk14a+AQ=
 =gWMp
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo-4.20-20181106' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent improvements and fixes from Arnaldo Carvalho de Melo:

Intel PT SQL viewer: (Adrian Hunter)

- Fall back to /usr/local/lib/libxed.so
- Add Selected branches report
- Add help window
- Fix table find when table re-ordered

Intel PT debug log (Adrian Hunter)

- Add more event information
- Add MTC and CYC timestamps

perf record: (Andi Kleen)

- Support weak groups, just like with 'perf stat'

perf trace: (Arnaldo Carvalho de Melo)

- Start augmenting raw_syscalls:{sys_enter,sys_exit}: goal is to have a
  generic, arch independent eBPF kernel component that is programmed with
  syscall table details, what to copy, how many bytes, pid, arg filters from the
  userspace via eBPF maps by the 'perf trace' tool that continues to use all its
  argument beautifiers, just taking advantage of the extra pointer contents.

JVMTI: (Gustavo Romero)

- Fix undefined symbol scnprintf in libperf-jvmti.so

perf top: (Jin Yao)

- Display the LBR stats in callchain entries

perf stat: (Thomas Richter)

- Handle different PMU names with common prefix

arm64: Will (Deacon)

- Fix arm64 tools build failure wrt smp_load_{acquire,release}.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-06 20:03:11 +01:00
Jiri Olsa
dbc4ca339c tools cpupower: Override CFLAGS assignments
So user could specify outside CFLAGS values.

Cc: Thomas Renninger <trenn@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-11-06 08:54:16 -07:00
Jiri Olsa
4bf3bd0f15 tools cpupower debug: Allow to use outside build flags
Adding CFLAGS and LDFLAGS to be used during the build.

Cc: Thomas Renninger <trenn@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-11-06 08:54:16 -07:00
Konstantin Khlebnikov
9de9aa45e9 tools/power/cpupower: fix compilation with STATIC=true
Rename duplicate sysfs_read_file into cpupower_read_sysfs and fix linking.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Thomas Renninger <trenn@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
2018-11-06 08:54:16 -07:00
Jiri Olsa
8e88c29b35 perf tools: Do not zero sample_id_all for group members
Andi reported following malfunction:

  # perf record -e '{ref-cycles,cycles}:S' -a sleep 1
  # perf script
  non matching sample_id_all

That's because we disable sample_id_all bit for non-sampling group
members. We can't do that, because it needs to be the same over the
whole event list. This patch keeps it untouched again.

Reported-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180923150420.27327-1-jolsa@kernel.org
Fixes: e9add8bac6 ("perf evsel: Disable write_backward for leader sampling group events")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-06 08:29:56 -03:00
Masayoshi Mizuma
af31b04b67 tools/testing/nvdimm: Fix the array size for dimm devices.
KASAN reports following global out of bounds access while
nfit_test is being loaded. The out of bound access happens
the following reference to dimm_fail_cmd_flags[dimm]. 'dimm' is
over than the index value, NUM_DCR (==5).

  static int override_return_code(int dimm, unsigned int func, int rc)
  {
          if ((1 << func) & dimm_fail_cmd_flags[dimm]) {

dimm_fail_cmd_flags[] definition:
  static unsigned long dimm_fail_cmd_flags[NUM_DCR];

'dimm' is the return value of get_dimm(), and get_dimm() returns
the index of handle[] array. The handle[] has 7 index. Let's use
ARRAY_SIZE(handle) as the array size.

KASAN report:

==================================================================
BUG: KASAN: global-out-of-bounds in nfit_test_ctl+0x47bb/0x55b0 [nfit_test]
Read of size 8 at addr ffffffffc10cbbe8 by task kworker/u41:0/8
...
Call Trace:
 dump_stack+0xea/0x1b0
 ? dump_stack_print_info.cold.0+0x1b/0x1b
 ? kmsg_dump_rewind_nolock+0xd9/0xd9
 print_address_description+0x65/0x22e
 ? nfit_test_ctl+0x47bb/0x55b0 [nfit_test]
 kasan_report.cold.6+0x92/0x1a6
 nfit_test_ctl+0x47bb/0x55b0 [nfit_test]
...
The buggy address belongs to the variable:
 dimm_fail_cmd_flags+0x28/0xffffffffffffa440 [nfit_test]
==================================================================

Fixes: 39611e83a2 ("tools/testing/nvdimm: Make DSM failure code injection...")
Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-11-05 11:34:08 -08:00
Gustavo Romero
6ac2226229 perf tools: Fix undefined symbol scnprintf in libperf-jvmti.so
Currently jvmti agent can not be used because function scnprintf is not
present in the agent libperf-jvmti.so. As a result the JVM when using
such agent to record JITed code profiling information will fail on
looking up scnprintf:

  java: symbol lookup error: lib/libperf-jvmti.so: undefined symbol: scnprintf

This commit fixes that by reverting to the use of snprintf, that can be
looked up, instead of scnprintf, adding a proper check for the returned
value in order to print a better error message when the jitdump file
pathname is too long. Checking the returned value also helps to comply
with some recent gcc versions, like gcc8, which will fail due to
truncated writing checks related to the -Werror=format-truncation= flag.

Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
LPU-Reference: 1541117601-18937-2-git-send-email-gromero@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-mvpxxxy7wnzaj74cq75muw3f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 16:28:00 -03:00
Arnaldo Carvalho de Melo
e2c39f36c3 perf beauty: Use SRCARCH, ARCH=x86_64 must map to "x86" to find the headers
Guenter reported that using ARCH=x86_64 to build perf has regressed:

  $ make -C tools/perf O=/tmp/build/perf ARCH=x86_64
  make: Entering directory '/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j4' parallel build
    HOSTCC   /tmp/build/perf/fixdep.o
    HOSTLD   /tmp/build/perf/fixdep-in.o
    LINK     /tmp/build/perf/fixdep

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  <SNIP>
  ...                           bpf: [ on  ]

    GEN      /tmp/build/perf/common-cmds.h
  make[2]: *** No rule to make target '/home/acme/git/perf/tools/arch/x86_64/include/uapi/asm//mman.h', needed by '/tmp/build/perf/trace/beauty/generated/mmap_flags_array.c'.  Stop.
  make[2]: *** Waiting for unfinished jobs....
    PERF_VERSION = 4.19.gf6c23e3
  make[1]: *** [Makefile.perf:207: sub-make] Error 2
  make: *** [Makefile:70: all] Error 2
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $

This is because we must use $(SRCARCH) where we were using $(ARCH), so
that, just like the top level Makefile, we get this done:

  # Additional ARCH settings for x86
  ifeq ($(ARCH),i386)
          SRCARCH := x86
  endif
  ifeq ($(ARCH),x86_64)
          SRCARCH := x86
  endif

Which is done in tools/scripts/Makefile.arch, so switch to use
$(SRCARCH).

Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: fbd7458db7 ("perf beauty: Wire up the mmap flags table generator to the Makefile")
Link: https://lkml.kernel.org/r/20181105184612.GD7077@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 15:46:51 -03:00
Adrian Hunter
f6c23e3b55 perf intel-pt: Add MTC and CYC timestamps to debug log
One cause of decoding errors is un-synchronized side-band data.
Timestamps are needed to debug such cases. TSC packet timestamps are
logged. Log also MTC and CYC timestamps.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/20181105073505.8129-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 14:53:54 -03:00
Adrian Hunter
93f8be2799 perf intel-pt: Add more event information to debug log
More event information is useful for debugging, especially MMAP events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/20181105073505.8129-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 14:53:37 -03:00
Adrian Hunter
35fa1cee21 perf scripts python: exported-sql-viewer.py: Fix table find when table re-ordered
Table rows can be re-ordered by selecting a column to sort by. After
re-ordering, the "find" operation was highlighting the wrong row, fix
it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20181104151238.15947-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 14:53:00 -03:00
Adrian Hunter
65b24292e8 perf scripts python: exported-sql-viewer.py: Add help window
Add a window to display help. It is also possible to display the help
only, by using the option "--help-only" instead of a database name.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20181104151238.15947-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 14:52:45 -03:00
Adrian Hunter
210cf1f961 perf scripts python: exported-sql-viewer.py: Add Selected branches report
Fetching data from the database can be slow. Add a report that provides
the ability to select a subset of branches.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20181104151238.15947-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 14:51:55 -03:00
Adrian Hunter
5ed4419d47 perf scripts python: exported-sql-viewer.py: Fall back to /usr/local/lib/libxed.so
Fall back to /usr/local/lib/libxed.so to cater for distributions that do
not have /usr/local/lib in the library path by default.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20181104151238.15947-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 14:51:31 -03:00
Jin Yao
590ac60d8a perf top: Display the LBR stats in callchain entry
'perf report' has supported the displaying of LBR stats (such as cycles,
predicted%) in callchain entry.

For example:

  $ perf report --branch-history --stdio

  --1.01%--intel_idle mwait.h:29
            intel_idle cpufeature.h:164 (cycles:5)
            intel_idle cpufeature.h:164 (predicted:76.4%)
            intel_idle mwait.h:102 (cycles:41)
            intel_idle current.h:15

While 'perf top' doesn't support that.

For example:

  $ perf top -a -b --call-graph branch

  -   13.86%     0.23%  [kernel]		[k] __x86_indirect_thunk_rax
     - 13.65% __x86_indirect_thunk_rax
        + 1.69% do_syscall_64
        + 1.68% do_select
        + 1.41% ktime_get
        + 0.70% __schedule
        + 0.62% do_sys_poll
          0.58% __x86_indirect_thunk_rax

Actually it's very easy to enable this feature in 'perf top'.

With this patch, the result is:

  $ perf top -a -b --call-graph branch

  $ -   13.58%     0.00%  [kernel]		[k] __x86_indirect_thunk_rax
     $ - 13.57% __x86_indirect_thunk_rax (predicted:93.9%)
        $ + 1.78% do_select (cycles:2)
        $ + 1.68% perf_pmu_disable.part.99 (cycles:1)
        $ + 1.45% ___sys_recvmsg (cycles:25)
        $ + 0.81% unix_stream_sendmsg (cycles:18)
        $ + 0.80% ktime_get (cycles:400)
          $ 0.58% pick_next_task_fair (cycles:47)
        $ + 0.56% i915_request_retire (cycles:2)
        $ + 0.52% do_sys_poll (cycles:4)

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1540983995-20462-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 14:37:11 -03:00
Thomas Richter
ea1fa48c05 perf stat: Handle different PMU names with common prefix
On s390 the CPU Measurement Facility for counters now supports
2 PMUs named cpum_cf (CPU Measurement Facility for counters) and
cpum_cf_diag (CPU Measurement Facility for diagnostic counters)
for one and the same CPU.

Running command

 [root@s35lp76 perf]# ./perf stat -e tx_c_tend \
	 -- ~/mytests/cf-tx-events 1

 Measuring transactions
 TX_C_TABORT_NO_SPECIAL: 0 expected:0
 TX_C_TABORT_SPECIAL: 0 expected:0
 TX_C_TEND: 1 expected:1
 TX_NC_TABORT: 11 expected:11
 TX_NC_TEND: 1 expected:1

 Performance counter stats for '/root/mytests/cf-tx-events 1':

  2      tx_c_tend

      0.002120091 seconds time elapsed

      0.000121000 seconds user
      0.002127000 seconds sys

 [root@s35lp76 perf]#

displays output which is unexpected (and wrong):

  2      tx_c_tend

The test program definitely triggers only one transaction, as shown
in line 'TX_C_TEND: 1 expected:1'.

This is caused by the following call sequence:

pmu_lookup() scans and installs a PMU.
+--> pmu_aliases() parses all aliases in directory
		.../<pmu-name>/events/* which are file names.
     +--> pmu_aliases_parse() Read each file in directory and create
                      an new alias entry. This is done with
          +--> perf_pmu__new_alias() and
	       +--> __perf_pmu__new_alias() which also check for
	                   identical alias names.

After pmu_aliases() returns, a complete list of event names
for this pmu has been created. Now function

pmu_add_cpu_aliases()   is called to add the events listed in the json
|                       files to the alias list of the cpu.
+--> perf_pmu__find_map()  Returns a pointer to the json events.

Now function pmu_add_cpu_aliases() scans through all events listed
in the JSON files for this CPU.
Each json event pmu name is compared with the current PMU being
built up and if they mismatch, the json event is added to the
current PMUs alias list.
To avoid duplicate entries the following comparison is done:

	if (!is_arm_pmu_core(name)) {
	     pname = pe->pmu ? pe->pmu : "cpu";
	     if (strncmp(pname, name, strlen(pname)))
		     continue;
     }

The culprit is the strncmp() function.

Using current s390 PMU naming, the first PMU is 'cpum_cf'
and a long list of events is added, among them 'tx_c_tend'

When the second PMU named 'cpum_cf_diag' is added, only one event
named 'CF_DIAG' is added by the pmu_aliases()  function.

Now function pmu_add_cpu_aliases() is invoked for PMU 'cpum_cf_diag'.
Since the CPUID string is the same for both PMUs, json file events
for PMU named 'cpum_cf' are added to the PMU 'cpm_cf_diag'

This happens because the strncmp() actually compares:

     strncmp("cpum_cf", "cpum_cf_diag", 6);

The first parameter is the pmu name taken from the event in
the json file. The second parameter is the pmu name of the PMU
currently being built.
They are different, but the length of the compare only tests the
common prefix and this returns 0(true) when it should return false.

Now all events for PMU cpum_cf are added to the alias list for pmu
cpum_cf_diag.

Later on in function parse_events_add_pmu() the event 'tx_c_end' is
searched in all available PMUs and found twice, adding it two
times to the evsel_list global variable which is the root
of all events. This results in a counter value of 2 instead
of 1.

Output with this patch:

 [root@s35lp76 perf]# ./perf stat -e tx_c_tend \
			-- ~/mytests/cf-tx-events 1
 Measuring transactions
 TX_C_TABORT_NO_SPECIAL: 0 expected:0
 TX_C_TABORT_SPECIAL: 0 expected:0
 TX_C_TEND: 1 expected:1
 TX_NC_TABORT: 11 expected:11
 TX_NC_TEND: 1 expected:1

 Performance counter stats for '/root/mytests/cf-tx-events 1':

                  1      tx_c_tend

      0.001815365 seconds time elapsed

      0.000123000 seconds user
      0.001756000 seconds sys

 [root@s35lp76 perf]#

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Reviewed-by: Sebastien Boisvert <sboisvert@gydle.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 292c34c102 ("perf pmu: Fix core PMU alias list for X86 platform")
Link: http://lkml.kernel.org/r/20181023151616.78193-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 14:37:10 -03:00
Andi Kleen
cf99ad1424 perf record: Support weak groups
Implement a weak group fallback for 'perf record', similar to the
existing 'perf stat' support.  This allows to use groups that might be
longer than the available counters without failing.

Before:

  $ perf record  -e '{cycles,cache-misses,cache-references,cpu_clk_unhalted.thread,cycles,cycles,cycles}' -a sleep 1
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles).
  /bin/dmesg | grep -i perf may provide additional information.

After:

  $ ./perf record  -e '{cycles,cache-misses,cache-references,cpu_clk_unhalted.thread,cycles,cycles,cycles}:W' -a sleep 1
  WARNING: No sample_id_all support, falling back to unordered processing
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 8.136 MB perf.data (134069 samples) ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181001195927.14211-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 14:37:10 -03:00
Andi Kleen
c3537fc251 perf evlist: Move perf_evsel__reset_weak_group into evlist
- Move the function from builtin-stat to evlist for reuse
- Rename to evlist to match purpose better
- Pass the evlist as first argument.
- No functional changes

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181001195927.14211-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 14:37:09 -03:00
Arnaldo Carvalho de Melo
79ef68c7e1 perf augmented_syscalls: Start collecting pathnames in the BPF program
This is the start of having the raw_syscalls:sys_enter BPF handler
collecting pointer arguments, namely pathnames, and with two syscalls
that have that pointer in different arguments, "open" as it as its first
argument, "openat" as the second.

With this in place the existing beautifiers in 'perf trace' works, those
args are shown instead of just the pointer that comes with the syscalls
tracepoints.

This also serves to show and document pitfalls in the process of using
just that place in the kernel (raw_syscalls:sys_enter) plus tables
provided by userspace to collect syscall pointer arguments.

One is the need to use a barrier, as suggested by Edward, to avoid clang
optimizations that make the kernel BPF verifier to refuse loading our
pointer contents collector.

The end result should be a generic eBPF program that works in all
architectures, with the differences amongst archs resolved by the
userspace component, 'perf trace', that should get all its tables
created automatically from the kernel components where they are defined,
via string table constructors for things not expressed in BTF/DWARF
(enums, structs, etc), and otherwise using those observability files
(BTF).

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-37dz54pmotgpnwg9tb6zuk9j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-11-05 12:41:10 -03:00
Linus Torvalds
601a88077c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "A number of fixes and some late updates:

   - make in_compat_syscall() behavior on x86-32 similar to other
     platforms, this touches a number of generic files but is not
     intended to impact non-x86 platforms.

   - objtool fixes

   - PAT preemption fix

   - paravirt fixes/cleanups

   - cpufeatures updates for new instructions

   - earlyprintk quirk

   - make microcode version in sysfs world-readable (it is already
     world-readable in procfs)

   - minor cleanups and fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  compat: Cleanup in_compat_syscall() callers
  x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT
  objtool: Support GCC 9 cold subfunction naming scheme
  x86/numa_emulation: Fix uniform-split numa emulation
  x86/paravirt: Remove unused _paravirt_ident_32
  x86/mm/pat: Disable preemption around __flush_tlb_all()
  x86/paravirt: Remove GPL from pv_ops export
  x86/traps: Use format string with panic() call
  x86: Clean up 'sizeof x' => 'sizeof(x)'
  x86/cpufeatures: Enumerate MOVDIR64B instruction
  x86/cpufeatures: Enumerate MOVDIRI instruction
  x86/earlyprintk: Add a force option for pciserial device
  objtool: Support per-function rodata sections
  x86/microcode: Make revision and processor flags world-readable
2018-11-03 18:25:17 -07:00
Linus Torvalds
01897f3e05 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates and fixes from Ingo Molnar:
 "These are almost all tooling updates: 'perf top', 'perf trace' and
  'perf script' fixes and updates, an UAPI header sync with the merge
  window versions, license marker updates, much improved Sparc support
  from David Miller, and a number of fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (66 commits)
  perf intel-pt/bts: Calculate cpumode for synthesized samples
  perf intel-pt: Insert callchain context into synthesized callchains
  perf tools: Don't clone maps from parent when synthesizing forks
  perf top: Start display thread earlier
  tools headers uapi: Update linux/if_link.h header copy
  tools headers uapi: Update linux/netlink.h header copy
  tools headers: Sync the various kvm.h header copies
  tools include uapi: Update linux/mmap.h copy
  perf trace beauty: Use the mmap flags table generated from headers
  perf beauty: Wire up the mmap flags table generator to the Makefile
  perf beauty: Add a generator for MAP_ mmap's flag constants
  tools include uapi: Update asound.h copy
  tools arch uapi: Update asm-generic/unistd.h and arm64 unistd.h copies
  tools include uapi: Update linux/fs.h copy
  perf callchain: Honour the ordering of PERF_CONTEXT_{USER,KERNEL,etc}
  perf cs-etm: Correct CPU mode for samples
  perf unwind: Take pgoff into account when reporting elf to libdwfl
  perf top: Do not use overwrite mode by default
  perf top: Allow disabling the overwrite mode
  perf trace: Beautify mount's first pathname arg
  ...
2018-11-03 18:13:43 -07:00
Ingo Molnar
23a12ddee1 Merge branch 'core/urgent' into x86/urgent, to pick up objtool fix
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-03 23:42:16 +01:00