Commit Graph

952976 Commits

Author SHA1 Message Date
David S. Miller
3f53094ded Merge branch 'net-ipa-miscellaneous-cleanups'
Alex Elder says:

====================
net: ipa: miscellaneous cleanups

This series contains some minor cleanups I've been meaning to get
around to for a while.  The first few remove the definitions of some
currently-unused symbols.  Several fix some warnings that are reported
when the build is done with "W=2".  All are simple and have no effect
on the operation of the code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:34:56 -07:00
Alex Elder
bf8fd8d326 net: ipa: fix two comments
In ipa_uc_response_hdlr() a comment uses the wrong function name
when it describes where a clock reference is taken.  Fix this.

Also fix the comment in ipa_uc_response_hdlr() to correctly refer to
ipa_uc_setup(), which is where the clock reference described here is
taken.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:34:56 -07:00
Alex Elder
84cec844b8 net: ipa: rename a phandle variable
When "W=2" is supplied to the build command, we get a warning about
shadowing a global declaration (of a typedef) for a variable defined
in ipa_probe().  Rename the variable to get rid of the warning.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:34:56 -07:00
Alex Elder
993cac15d1 net: ipa: fix two mild warnings
Fix two spots where a variable "channel_id" is unnecessarily
redefined inside loops in "gsi.c".  This is warned about if
"W=2" is added to the build command.

Note that this problem is harmless, so there's no need to backport
it as a bugfix.

Remove a comment in gsi_init() about waking the system; the GSI
interrupt does not wake the system any more.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:34:56 -07:00
Alex Elder
fb980ef741 net: ipa: share field mask values for GSI general interrupt
The GSI general interrupt is managed by three registers: enable;
status; and clear.  The three registers have same set of field bits
at the same locations.  Use a common set of field masks for all
three registers to avoid duplication.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:34:56 -07:00
Alex Elder
d61bb7166d net: ipa: share field mask values for GSI global interrupt
The GSI global interrupt is managed by three registers: enable;
status; and clear.  The three registers have same set of field bits
at the same locations.  Use a common set of field masks for all
three registers to avoid duplication.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:34:56 -07:00
Alex Elder
e6580d0eb7 net: ipa: share field mask values for GSI interrupt type
The GSI interrupt type register and interrupt type mask register
have the same field bits at the same locations.  Use a common set of
field masks for both registers rather than essentially duplicating
them.  The only place the interrupt mask register uses any of these
is in gsi_irq_enable().

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:34:56 -07:00
Alex Elder
e0ebe20491 net: ipa: remove unused status structure field masks
Most of the field masks used for fields in a status structure are
unused.  Remove their definitions; we can add them back again when
we actually use them to handle arriving status messages.  These are
warned about if "W=2" is added to the build command.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:34:55 -07:00
Alex Elder
ab4f71e519 net: ipa: kill unused status exceptions
Only the deaggregation status exception type is ever actually used.
If any other status exception type is reported we basically ignore
it, and consume the packet.  Remove the unused definitions of status
exception type symbols; they can be added back when we actually
handle them.

Separately, two consecutive if statements test the same condition
near the top of ipa_endpoint_suspend_one().  Instead, use a single
test with a block that combines the previously-separate lines of
code.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:34:55 -07:00
Alex Elder
4592139028 net: ipa: kill unused status opcodes
Three status opcodes are not currently supported.  Symbols
representing their numeric values are defined but never used.
Remove those unused definitions; they can be defined again
when they actually get used.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:34:55 -07:00
Alex Elder
901594589f net: ipa: kill definition of TRE_FLAGS_IEOB_FMASK
In "gsi_trans.c", the field mask TRE_FLAGS_IEOB_FMASK is defined but
never used.  Although there's no harm in defining this, remove it
for now and redefine it at some future date if it becomes needed.
This is warned about if "W=2" is added to the build command.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 18:34:55 -07:00
Alexei Starovoitov
98b972d20a Merge branch 'bpf: add helpers to support BTF-based kernel'
Alan Maguire says:

====================
This series attempts to provide a simple way for BPF programs (and in
future other consumers) to utilize BPF Type Format (BTF) information
to display kernel data structures in-kernel.  The use case this
functionality is applied to here is to support a snprintf()-like
helper to copy a BTF representation of kernel data to a string,
and a BPF seq file helper to display BTF data for an iterator.

There is already support in kernel/bpf/btf.c for "show" functionality;
the changes here generalize that support from seq-file specific
verifier display to the more generic case and add another specific
use case; rather than seq_printf()ing the show data, it is copied
to a supplied string using a snprintf()-like function.  Other future
consumers of the show functionality could include a bpf_printk_btf()
function which printk()ed the data instead.  Oops messaging in
particular would be an interesting application for such functionality.

The above potential use case hints at a potential reply to
a reasonable objection that such typed display should be
solved by tracing programs, where the in-kernel tracing records
data and the userspace program prints it out.  While this
is certainly the recommended approach for most cases, I
believe having an in-kernel mechanism would be valuable
also.  Critically in BPF programs it greatly simplifies
debugging and tracing of such data to invoking a simple
helper.

One challenge raised in an earlier iteration of this work -
where the BTF printing was implemented as a printk() format
specifier - was that the amount of data printed per
printk() was large, and other format specifiers were far
simpler.  Here we sidestep that concern by printing
components of the BTF representation as we go for the
seq file case, and in the string case the snprintf()-like
operation is intended to be a basis for perf event or
ringbuf output.  The reasons for avoiding bpf_trace_printk
are that

1. bpf_trace_printk() strings are restricted in size and
cannot display anything beyond trivial data structures; and
2. bpf_trace_printk() is for debugging purposes only.

As Alexei suggested, a bpf_trace_puts() helper could solve
this in the future but it still would be limited by the
1000 byte limit for traced strings.

Default output for an sk_buff looks like this (zeroed fields
are omitted):

(struct sk_buff){
 .transport_header = (__u16)65535,
 .mac_header = (__u16)65535,
 .end = (sk_buff_data_t)192,
 .head = (unsigned char *)0x000000007524fd8b,
 .data = (unsigned char *)0x000000007524fd8b,
 .truesize = (unsigned int)768,
 .users = (refcount_t){
  .refs = (atomic_t){
   .counter = (int)1,
  },
 },
}

Flags can modify aspects of output format; see patch 3
for more details.

Changes since v6:

- Updated safe data size to 32, object name size to 80.
  This increases the number of safe copies done, but performance is
  not a key goal here. WRT name size the largest type name length
  in bpf-next according to "pahole -s" is 64 bytes, so that still gives
  room for additional type qualifiers, parens etc within the name limit
  (Alexei, patch 2)
- Remove inlines and converted as many #defines to functions as was
  possible.  In a few cases - btf_show_type_value[s]() specifically -
  I left these as macros as btf_show_type_value[s]() prepends and
  appends format strings to the format specifier (in order to include
  indentation, delimiters etc so a macro makes that simpler (Alexei,
  patch 2)
- Handle btf_resolve_size() error in btf_show_obj_safe() (Alexei, patch 2)
- Removed clang loop unroll in BTF snprintf test (Alexei)
- switched to using bpf_core_type_id_kernel(type) as suggested by Andrii,
  and Alexei noted that __builtin_btf_type_id(,1) should be used (patch 4)
- Added skip logic if __builtin_btf_type_id is not available (patches 4,8)
- Bumped limits on bpf iters to support printing larger structures (Alexei,
  patch 5)
- Updated overflow bpf_iter tests to reflect new iter max size (patch 6)
- Updated seq helper to use type id only (Alexei, patch 7)
- Updated BTF task iter test to use task struct instead of struct fs_struct
  since new limits allow a task_struct to be displayed (patch 8)
- Fixed E2BIG handling in iter task (Alexei, patch 8)

Changes since v5:

- Moved btf print prepare into patch 3, type show seq
  with flags into patch 2 (Alexei, patches 2,3)
- Fixed build bot warnings around static declarations
  and printf attributes
- Renamed functions to snprintf_btf/seq_printf_btf
  (Alexei, patches 3-6)

Changes since v4:

- Changed approach from a BPF trace event-centric design to one
  utilizing a snprintf()-like helper and an iter helper (Alexei,
  patches 3,5)
- Added tests to verify BTF output (patch 4)
- Added support to tests for verifying BTF type_id-based display
  as well as type name via __builtin_btf_type_id (Andrii, patch 4).
- Augmented task iter tests to cover the BTF-based seq helper.
  Because a task_struct's BTF-based representation would overflow
  the PAGE_SIZE limit on iterator data, the "struct fs_struct"
  (task->fs) is displayed for each task instead (Alexei, patch 6).

Changes since v3:

- Moved to RFC since the approach is different (and bpf-next is
  closed)
- Rather than using a printk() format specifier as the means
  of invoking BTF-enabled display, a dedicated BPF helper is
  used.  This solves the issue of printk() having to output
  large amounts of data using a complex mechanism such as
  BTF traversal, but still provides a way for the display of
  such data to be achieved via BPF programs.  Future work could
  include a bpf_printk_btf() function to invoke display via
  printk() where the elements of a data structure are printk()ed
 one at a time.  Thanks to Petr Mladek, Andy Shevchenko and
  Rasmus Villemoes who took time to look at the earlier printk()
  format-specifier-focused version of this and provided feedback
  clarifying the problems with that approach.
- Added trace id to the bpf_trace_printk events as a means of
  separating output from standard bpf_trace_printk() events,
  ensuring it can be easily parsed by the reader.
- Added bpf_trace_btf() helper tests which do simple verification
  of the various display options.

Changes since v2:

- Alexei and Yonghong suggested it would be good to use
  probe_kernel_read() on to-be-shown data to ensure safety
  during operation.  Safe copy via probe_kernel_read() to a
  buffer object in "struct btf_show" is used to support
  this.  A few different approaches were explored
  including dynamic allocation and per-cpu buffers. The
  downside of dynamic allocation is that it would be done
  during BPF program execution for bpf_trace_printk()s using
  %pT format specifiers. The problem with per-cpu buffers
  is we'd have to manage preemption and since the display
  of an object occurs over an extended period and in printk
  context where we'd rather not change preemption status,
  it seemed tricky to manage buffer safety while considering
  preemption.  The approach of utilizing stack buffer space
  via the "struct btf_show" seemed like the simplest approach.
  The stack size of the associated functions which have a
  "struct btf_show" on their stack to support show operation
  (btf_type_snprintf_show() and btf_type_seq_show()) stays
  under 500 bytes. The compromise here is the safe buffer we
  use is small - 256 bytes - and as a result multiple
  probe_kernel_read()s are needed for larger objects. Most
  objects of interest are smaller than this (e.g.
  "struct sk_buff" is 224 bytes), and while task_struct is a
  notable exception at ~8K, performance is not the priority for
  BTF-based display. (Alexei and Yonghong, patch 2).
- safe buffer use is the default behaviour (and is mandatory
  for BPF) but unsafe display - meaning no safe copy is done
  and we operate on the object itself - is supported via a
  'u' option.
- pointers are prefixed with 0x for clarity (Alexei, patch 2)
- added additional comments and explanations around BTF show
  code, especially around determining whether objects such
  zeroed. Also tried to comment safe object scheme used. (Yonghong,
  patch 2)
- added late_initcall() to initialize vmlinux BTF so that it would
  not have to be initialized during printk operation (Alexei,
  patch 5)
- removed CONFIG_BTF_PRINTF config option as it is not needed;
  CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour and
  determining behaviour of type-based printk can be done via
  retrieval of BTF data; if it's not there BTF was unavailable
  or broken (Alexei, patches 4,6)
- fix bpf_trace_printk test to use vmlinux.h and globals via
  skeleton infrastructure, removing need for perf events
  (Andrii, patch 8)

Changes since v1:

- changed format to be more drgn-like, rendering indented type info
  along with type names by default (Alexei)
- zeroed values are omitted (Arnaldo) by default unless the '0'
  modifier is specified (Alexei)
- added an option to print pointer values without obfuscation.
  The reason to do this is the sysctls controlling pointer display
  are likely to be irrelevant in many if not most tracing contexts.
  Some questions on this in the outstanding questions section below...
- reworked printk format specifer so that we no longer rely on format
  %pT<type> but instead use a struct * which contains type information
  (Rasmus). This simplifies the printk parsing, makes use more dynamic
  and also allows specification by BTF id as well as name.
- removed incorrect patch which tried to fix dereferencing of resolved
  BTF info for vmlinux; instead we skip modifiers for the relevant
  case (array element type determination) (Alexei).
- fixed issues with negative snprintf format length (Rasmus)
- added test cases for various data structure formats; base types,
  typedefs, structs, etc.
- tests now iterate through all typedef, enum, struct and unions
  defined for vmlinux BTF and render a version of the target dummy
  value which is either all zeros or all 0xff values; the idea is this
  exercises the "skip if zero" and "print everything" cases.
- added support in BPF for using the %pT format specifier in
  bpf_trace_printk()
- added BPF tests which ensure %pT format specifier use works (Alexei).
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28 18:26:59 -07:00
Alan Maguire
b72091bd4e selftests/bpf: Add test for bpf_seq_printf_btf helper
Add a test verifying iterating over tasks and displaying BTF
representation of task_struct succeeds.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-9-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Alan Maguire
eb411377ae bpf: Add bpf_seq_printf_btf helper
A helper is added to allow seq file writing of kernel data
structures using vmlinux BTF.  Its signature is

long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr,
                        u32 btf_ptr_size, u64 flags);

Flags and struct btf_ptr definitions/use are identical to the
bpf_snprintf_btf helper, and the helper returns 0 on success
or a negative error value.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-8-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Alan Maguire
eb58bbf2e5 selftests/bpf: Fix overflow tests to reflect iter size increase
bpf iter size increase to PAGE_SIZE << 3 means overflow tests assuming
page size need to be bumped also.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-7-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Alan Maguire
af65320948 bpf: Bump iter seq size to support BTF representation of large data structures
BPF iter size is limited to PAGE_SIZE; if we wish to display BTF-based
representations of larger kernel data structures such as task_struct,
this will be insufficient.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-6-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Alan Maguire
076a95f5af selftests/bpf: Add bpf_snprintf_btf helper tests
Tests verifying snprintf()ing of various data structures,
flags combinations using a tp_btf program. Tests are skipped
if __builtin_btf_type_id is not available to retrieve BTF
type ids.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-5-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Alan Maguire
c4d0bfb450 bpf: Add bpf_snprintf_btf helper
A helper is added to support tracing kernel type information in BPF
using the BPF Type Format (BTF).  Its signature is

long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr,
		      u32 btf_ptr_size, u64 flags);

struct btf_ptr * specifies

- a pointer to the data to be traced
- the BTF id of the type of data pointed to
- a flags field is provided for future use; these flags
  are not to be confused with the BTF_F_* flags
  below that control how the btf_ptr is displayed; the
  flags member of the struct btf_ptr may be used to
  disambiguate types in kernel versus module BTF, etc;
  the main distinction is the flags relate to the type
  and information needed in identifying it; not how it
  is displayed.

For example a BPF program with a struct sk_buff *skb
could do the following:

	static struct btf_ptr b = { };

	b.ptr = skb;
	b.type_id = __builtin_btf_type_id(struct sk_buff, 1);
	bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0, 0);

Default output looks like this:

(struct sk_buff){
 .transport_header = (__u16)65535,
 .mac_header = (__u16)65535,
 .end = (sk_buff_data_t)192,
 .head = (unsigned char *)0x000000007524fd8b,
 .data = (unsigned char *)0x000000007524fd8b,
 .truesize = (unsigned int)768,
 .users = (refcount_t){
  .refs = (atomic_t){
   .counter = (int)1,
  },
 },
}

Flags modifying display are as follows:

- BTF_F_COMPACT:	no formatting around type information
- BTF_F_NONAME:		no struct/union member names/types
- BTF_F_PTR_RAW:	show raw (unobfuscated) pointer values;
			equivalent to %px.
- BTF_F_ZERO:		show zero-valued struct/union members;
			they are not displayed by default

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-4-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Alan Maguire
31d0bc8163 bpf: Move to generic BTF show support, apply it to seq files/strings
generalize the "seq_show" seq file support in btf.c to support
a generic show callback of which we support two instances; the
current seq file show, and a show with snprintf() behaviour which
instead writes the type data to a supplied string.

Both classes of show function call btf_type_show() with different
targets; the seq file or the string to be written.  In the string
case we need to track additional data - length left in string to write
and length to return that we would have written (a la snprintf).

By default show will display type information, field members and
their types and values etc, and the information is indented
based upon structure depth. Zeroed fields are omitted.

Show however supports flags which modify its behaviour:

BTF_SHOW_COMPACT - suppress newline/indent.
BTF_SHOW_NONAME - suppress show of type and member names.
BTF_SHOW_PTR_RAW - do not obfuscate pointer values.
BTF_SHOW_UNSAFE - do not copy data to safe buffer before display.
BTF_SHOW_ZERO - show zeroed values (by default they are not shown).

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-3-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Alan Maguire
76654e67f3 bpf: Provide function to get vmlinux BTF information
It will be used later for BPF structure display support

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1601292670-1616-2-git-send-email-alan.maguire@oracle.com
2020-09-28 18:26:58 -07:00
Andrii Nakryiko
a871b04310 libbpf: Add btf__new_empty() to create an empty BTF object
Add an ability to create an empty BTF object from scratch. This is going to be
used by pahole for BTF encoding. And also by selftest for convenient creation
of BTF objects.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200926011357.2366158-7-andriin@fb.com
2020-09-28 17:27:32 -07:00
Andrii Nakryiko
919d2b1dbb libbpf: Allow modification of BTF and add btf__add_str API
Allow internal BTF representation to switch from default read-only mode, in
which raw BTF data is a single non-modifiable block of memory with BTF header,
types, and strings layed out sequentially and contiguously in memory, into
a writable representation with types and strings data split out into separate
memory regions, that can be dynamically expanded.

Such writable internal representation is transparent to users of libbpf APIs,
but allows to append new types and strings at the end of BTF, which is
a typical use case when generating BTF programmatically. All the basic
guarantees of BTF types and strings layout is preserved, i.e., user can get
`struct btf_type *` pointer and read it directly. Such btf_type pointers might
be invalidated if BTF is modified, so some care is required in such mixed
read/write scenarios.

Switch from read-only to writable configuration happens automatically the
first time when user attempts to modify BTF by either adding a new type or new
string. It is still possible to get raw BTF data, which is a single piece of
memory that can be persisted in ELF section or into a file as raw BTF. Such
raw data memory is also still owned by BTF and will be freed either when BTF
object is freed or if another modification to BTF happens, as any modification
invalidates BTF raw representation.

This patch adds the first two BTF manipulation APIs: btf__add_str(), which
allows to add arbitrary strings to BTF string section, and btf__find_str()
which allows to find existing string offset, but not add it if it's missing.
All the added strings are automatically deduplicated. This is achieved by
maintaining an additional string lookup index for all unique strings. Such
index is built when BTF is switched to modifiable mode. If at that time BTF
strings section contained duplicate strings, they are not de-duplicated. This
is done specifically to not modify the existing content of BTF (types, their
string offsets, etc), which can cause confusion and is especially important
property if there is struct btf_ext associated with struct btf. By following
this "imperfect deduplication" process, btf_ext is kept consitent and correct.
If deduplication of strings is necessary, it can be forced by doing BTF
deduplication, at which point all the strings will be eagerly deduplicated and
all string offsets both in struct btf and struct btf_ext will be updated.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200926011357.2366158-6-andriin@fb.com
2020-09-28 17:27:31 -07:00
Andrii Nakryiko
7d9c71e10b libbpf: Extract generic string hashing function for reuse
Calculating a hash of zero-terminated string is a common need when using
hashmap, so extract it for reuse.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200926011357.2366158-5-andriin@fb.com
2020-09-28 17:27:31 -07:00
Andrii Nakryiko
192f5a1fe6 libbpf: Generalize common logic for managing dynamically-sized arrays
Managing dynamically-sized array is a common, but not trivial functionality,
which significant amount of logic and code to implement properly. So instead
of re-implementing it all the time, extract it into a helper function ans
reuse.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200926011357.2366158-4-andriin@fb.com
2020-09-28 17:27:31 -07:00
Andrii Nakryiko
b86042478f libbpf: Remove assumption of single contiguous memory for BTF data
Refactor internals of struct btf to remove assumptions that BTF header, type
data, and string data are layed out contiguously in a memory in a single
memory allocation. Now we have three separate pointers pointing to the start
of each respective are: header, types, strings. In the next patches, these
pointers will be re-assigned to point to independently allocated memory areas,
if BTF needs to be modified.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200926011357.2366158-3-andriin@fb.com
2020-09-28 17:27:31 -07:00
Andrii Nakryiko
740e69c3c5 libbpf: Refactor internals of BTF type index
Refactor implementation of internal BTF type index to not use direct pointers.
Instead it uses offset relative to the start of types data section. This
allows for types data to be reallocatable, enabling implementation of
modifiable BTF.

As now getting type by ID has an extra indirection step, convert all internal
type lookups to a new helper btf_type_id(), that returns non-const pointer to
a type by its ID.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200926011357.2366158-2-andriin@fb.com
2020-09-28 17:27:31 -07:00
Toke Høiland-Jørgensen
b000def2e0 selftests: Remove fmod_ret from test_overhead
The test_overhead prog_test included an fmod_ret program that attached to
__set_task_comm() in the kernel. However, this function was never listed as
allowed for return modification, so this only worked because of the
verifier skipping tests when a trampoline already existed for the attach
point. Now that the verifier checks have been fixed, remove fmod_ret from
the test so it works again.

Fixes: 4eaf0b5c5e ("selftest/bpf: Fmod_ret prog and implement test_overhead as part of bench")
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28 17:20:28 -07:00
Toke Høiland-Jørgensen
f7b12b6fea bpf: verifier: refactor check_attach_btf_id()
The check_attach_btf_id() function really does three things:

1. It performs a bunch of checks on the program to ensure that the
   attachment is valid.

2. It stores a bunch of state about the attachment being requested in
   the verifier environment and struct bpf_prog objects.

3. It allocates a trampoline for the attachment.

This patch splits out (1.) and (3.) into separate functions which will
perform the checks, but return the computed values instead of directly
modifying the environment. This is done in preparation for reusing the
checks when the actual attachment is happening, which will allow tracing
programs to have multiple (compatible) attachments.

This also fixes a bug where a bunch of checks were skipped if a trampoline
already existed for the tracing target.

Fixes: 6ba43b761c ("bpf: Attachment verification for BPF_MODIFY_RETURN")
Fixes: 1e6c62a882 ("bpf: Introduce sleepable BPF programs")
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28 17:10:34 -07:00
Toke Høiland-Jørgensen
efc68158c4 bpf: change logging calls from verbose() to bpf_log() and use log pointer
In preparation for moving code around, change a bunch of references to
env->log (and the verbose() logging helper) to use bpf_log() and a direct
pointer to struct bpf_verifier_log. While we're touching the function
signature, mark the 'prog' argument to bpf_check_type_match() as const.

Also enhance the bpf_verifier_log_needed() check to handle NULL pointers
for the log struct so we can re-use the code with logging disabled.

Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28 17:09:59 -07:00
Toke Høiland-Jørgensen
1af9270e90 bpf: disallow attaching modify_return tracing functions to other BPF programs
From the checks and commit messages for modify_return, it seems it was
never the intention that it should be possible to attach a tracing program
with expected_attach_type == BPF_MODIFY_RETURN to another BPF program.
However, check_attach_modify_return() will only look at the function name,
so if the target function starts with "security_", the attach will be
allowed even for bpf2bpf attachment.

Fix this oversight by also blocking the modification if a target program is
supplied.

Fixes: 18644cec71 ("bpf: Fix use-after-free in fmod_ret check")
Fixes: 6ba43b761c ("bpf: Attachment verification for BPF_MODIFY_RETURN")
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28 17:08:07 -07:00
Alexei Starovoitov
84a20d8e87 Merge branch 'Sockmap copying'
Lorenz Bauer says:

====================
Changes in v2:
- Check sk_fullsock in map_update_elem (Martin)

Enable calling map_update_elem on sockmaps from bpf_iter context. This
in turn allows us to copy a sockmap by iterating its elements.

The change itself is tiny, all thanks to the ground work from Martin,
whose series [1] this patch is based on. I updated the tests to do some
copying, and also included two cleanups.

I'm sending this out now rather than when Martin's series has landed
because I hope this can get in before the merge window (potentially)
closes this weekend.

1: https://lore.kernel.org/bpf/20200925000337.3853598-1-kafai@fb.com/
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28 16:48:02 -07:00
Lorenz Bauer
5b87adc3ce selftest: bpf: Test copying a sockmap and sockhash
Since we can now call map_update_elem(sockmap) from bpf_iter context
it's possible to copy a sockmap or sockhash in the kernel. Add a
selftest which exercises this.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200928090805.23343-5-lmb@cloudflare.com
2020-09-28 16:48:02 -07:00
Lorenz Bauer
2787031733 selftests: bpf: Remove shared header from sockmap iter test
The shared header to define SOCKMAP_MAX_ENTRIES is a bit overkill.
Dynamically allocate the sock_fd array based on bpf_map__max_entries
instead.

Suggested-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200928090805.23343-4-lmb@cloudflare.com
2020-09-28 16:48:02 -07:00
Lorenz Bauer
26c3270ddb selftests: bpf: Add helper to compare socket cookies
We compare socket cookies to ensure that insertion into a sockmap worked.
Pull this out into a helper function for use in other tests.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200928090805.23343-3-lmb@cloudflare.com
2020-09-28 16:47:58 -07:00
Lorenz Bauer
6550f2dddf bpf: sockmap: Enable map_update_elem from bpf_iter
Allow passing a pointer to a BTF struct sock_common* when updating
a sockmap or sockhash. Since BTF pointers can fault and therefore be
NULL at runtime we need to add an additional !sk check to
sock_map_update_elem. Since we may be passed a request or timewait
socket we also need to check sk_fullsock. Doing this allows calling
map_update_elem on sockmap from bpf_iter context, which uses
BTF pointers.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200928090805.23343-2-lmb@cloudflare.com
2020-09-28 16:40:46 -07:00
David S. Miller
280095713c Merge branch 'ibmvnic-refactor-some-send-handle-functions'
Lijun Pan says:

====================
ibmvnic: refactor some send/handle functions

This patch series rename and factor some send crq request functions.
The new naming aligns better with handle* functions such that it make
the code easier to read and search by new contributors.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 16:06:08 -07:00
Lijun Pan
46899bde9f ibmvnic: create send_control_ip_offload
Factor send_control_ip_offload out of handle_query_ip_offload_rsp.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 16:06:08 -07:00
Lijun Pan
16e811fe0f ibmvnic: create send_query_ip_offload
Factor send_query_ip_offload out of handle_request_cap_rsp to
pair with handle_query_ip_offload_rsp.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 16:06:08 -07:00
Lijun Pan
69980d021c ibmvnic: rename send_map_query to send_query_map
The new name send_query_map pairs with handle_query_map_rsp.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 16:06:08 -07:00
Lijun Pan
09081b9d3a ibmvnic: rename ibmvnic_send_req_caps to send_request_cap
The new name send_request_cap pairs with handle_request_cap_rsp.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 16:06:08 -07:00
Lijun Pan
491099ad74 ibmvnic: rename send_cap_queries to send_query_cap
The new name send_query_cap pairs with handle_query_cap_rsp.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 16:06:08 -07:00
Lijun Pan
b9cd795b0e ibmvnic: set up 200GBPS speed
Set up the speed according to crq->query_phys_parms.rsp.speed.
Fix IBMVNIC_10GBPS typo.

Fixes: f8d6ae0d27 ("ibmvnic: Report actual backing device speed and duplex values")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 16:03:51 -07:00
Rikard Falkeborn
7028977965 atm: atmtcp: Constify atmtcp_v_dev_ops
The only usage of atmtcp_v_dev_ops is to pass its address to
atm_dev_register() which takes a pointer to const, and comparing its
address to another address, which does not modify it. Make it const to
allow the compiler to put it in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 16:03:30 -07:00
Davide Caratti
e5f7e211b6 ip6gre: avoid tx_error when sending MLD/DAD on external tunnels
similarly to what has been done with commit 9d149045b3 ("geneve: change
from tx_error to tx_dropped on missing metadata"), avoid reporting errors
to userspace in case the kernel doesn't find any tunnel information for a
skb that is going to be transmitted: an increase of tx_dropped is enough.

tested with the following script:

 # for t in ip6gre ip6gretap ip6erspan; do
 > ip link add dev gre6-test0 type $t external
 > ip address add dev gre6-test0 2001:db8::1/64
 > ip link set dev gre6-test0 up
 > sleep 30
 > ip -s -j link show dev gre6-test0 | jq \
 > '.[0].stats64.tx | {"errors": .errors, "dropped": .dropped}'
 > ip link del dev gre6-test0
 > done

Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 16:01:37 -07:00
David S. Miller
be589d0428 Merge branch 'net-smc-introduce-SMC-Dv2-support'
Karsten Graul says:

====================
net/smc: introduce SMC-Dv2 support

SMC-Dv2 support (see https://www.ibm.com/support/pages/node/6326337)
provides multi-subnet support for SMC-D, eliminating the current
same-subnet restriction. The new version detects if any of the virtual
ISM devices are on the same system and can therefore be used for an
SMC-Dv2 connection. Furthermore, SMC-Dv2 eliminates the need for
PNET IDs on s390.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
e8d726c8e8 net/smc: CLC decline - V2 enhancements
This patch covers the small SMCD version 2 changes for CLC decline.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
b81a5eb789 net/smc: introduce CLC first contact extension
SMC Version 2 defines a first contact extension for CLC accept
and CLC confirm. This patch covers sending and receiving of the
CLC first contact extension.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
a7c9c5f4af net/smc: CLC accept / confirm V2
The new format of SMCD V2 CLC accept and confirm is introduced,
and building and checking of SMCD V2 CLC accepts / confirms is adapted
accordingly.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
5c21c4ccaf net/smc: determine accepted ISM devices
SMCD Version 2 allows to propose up to 8 additional ISM devices
offered to the peer as candidates for SMCD communication.
This patch covers the server side, i.e. selection of an ISM device
matching one of the proposed ISM devices, that will be used for
CLC accept

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
8c3dca341a net/smc: build and send V2 CLC proposal
The new format of an SMCD V2 CLC proposal is introduced, and
building and checking of SMCD V2 CLC proposals is adapted
accordingly.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00