Commit Graph

22574 Commits

Author SHA1 Message Date
Jiri Olsa
9d88a1a170 perf tools: Move clockid_res_ns under clock struct
Move the clockid_res_ns struct member to the clock struct, so we have
the clock related stuff in one place.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Geneviève Bastien <gbastien@versatic.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200805093444.314999-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:42:20 -03:00
Jiri Olsa
d1e325cf40 perf header: Store clock references for -k/--clockid option
Add a new CLOCK_DATA feature that stores reference times when
-k/--clockid option is specified.

It contains the clock id and its reference time together with wall clock
time taken at the 'same time', both values are in nanoseconds.

The format of data is as below:

  struct {
       u32 version;  /* version = 1 */
       u32 clockid;
       u64 wall_clock_ns;
       u64 clockid_time_ns;
  };

This clock reference times will be used in following changes to display
wall clock for perf events.

It's available only for recording with clockid specified, because it's
the only case where we can get reference time to wallclock time. It's
can't do that with perf clock yet.

Committer testing:

  $ perf record -h -k

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -k, --clockid <clockid>
                            clockid to use for events, see clock_gettime()

  $ perf record -k monotonic sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.017 MB perf.data (8 samples) ]
  $ perf report --header-only | grep clockid -A1
  # event : name = cycles:u, , id = { 88815, 88816, 88817, 88818, 88819, 88820, 88821, 88822 }, size = 120, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, read_format = ID, disabled = 1, inherit = 1, exclude_kernel = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, use_clockid = 1, ksymbol = 1, bpf_event = 1, clockid = 1
  # CPU_TOPOLOGY info available, use -I to display
  --
  # clockid frequency: 1000 MHz
  # cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake
  # clockid: monotonic (1)
  # reference time: 2020-08-06 09:40:21.619290 = 1596717621.619290 (TOD) = 21931.077673635 (monotonic)
  $

Original-patch-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Geneviève Bastien <gbastien@versatic.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200805093444.314999-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:35:06 -03:00
Jiri Olsa
cc3365bbd0 perf tools: Add clockid_name function
Add the clockid_name() function to get the clock name based on its
clockid.  It will be used in the following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Geneviève Bastien <gbastien@versatic.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200805093444.314999-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:33:57 -03:00
Jiri Olsa
6953beb4dd perf clockid: Move parse_clockid() to new clockid object
Move parse_clockid and all needed clcckid related stuff into clockid
object. We are going to add clockid_name function in following change,
so it's better it's placed in separated object and not in
builtin-record.c.

No functional change is intended.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Geneviève Bastien <gbastien@versatic.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200805093444.314999-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:30:52 -03:00
Tzvetomir Stoyanov (VMware)
7d65864b3b tools lib traceevent: Handle possible strdup() error in tep_add_plugin_path() API
Free allocated resources and return -1 in case strdup() fails in
tep_add_plugin_path() API.

Link: https://lore.kernel.org/r/CAM9d7chfvJwodpVrHGc5E2J80peRojmYV_fD8x3cpn9HFRUw2g@mail.gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200714103027.2477584-9-tz.stoyanov@gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200716092014.2613403-9-tz.stoyanov@gmail.com

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200722011755.720803193@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:27:50 -03:00
Tzvetomir Stoyanov (VMware)
db885ed481 libtraceevent: Fixed description of tep_add_plugin_path() API
Changed the description of tep_add_plugin_path() API to reflect the
logic of the function. The suffix of plugin files is not hardcoded
to ".so", it depends on the custom plugin loader callback.

Link: https://lore.kernel.org/CAM9d7cgMgqFDvKhs6xwdBSMsaG=3ZG0RtxwgQDCTLGkML1MY4Q@mail.gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200714103027.2477584-8-tz.stoyanov@gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200716092014.2613403-8-tz.stoyanov@gmail.com

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200722011755.581468845@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:27:33 -03:00
Tzvetomir Stoyanov (VMware)
602e29fe07 libtraceevent: Fixed type in PRINT_FMT_STING
PRINT_FMT_STING -> PRINT_FMT_STRING

Link: https://lore.kernel.org/r/CAM9d7cj1LJ=QO8QxhBo_oDM9APpAswX4BbTwge0JhZ3Y4-Bv9w@mail.gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200714103027.2477584-7-tz.stoyanov@gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200716092014.2613403-7-tz.stoyanov@gmail.com

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200722011755.442308322@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:27:09 -03:00
Tzvetomir Stoyanov (VMware)
d339a19a87 libtraceevent: Fixed broken indentation in parse_ip4_print_args()
Fixed the "break" indentation in a switch() inside
parse_ip4_print_args() static function.

Link: https://lore.kernel.org/r/CAM9d7cjboXGg+iMOA4BQo=E01iLGcJNB1MyPJ4doPP1XeGVJRA@mail.gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200714103027.2477584-6-tz.stoyanov@gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200716092014.2613403-6-tz.stoyanov@gmail.com

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200722011755.310486074@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:26:37 -03:00
Tzvetomir Stoyanov (VMware)
b796162bc4 libtraceevent: Improve error handling of tep_plugin_add_option() API
In case of memory error, ensure all allocated resources are freed.
Do not append broken option in trace_plugin_options list.

Link: https://lore.kernel.org/r/CAM9d7cizjF+fbK7YzmsBDgrx__4YAOsmEq67D3sWET8FF+YdFA@mail.gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200714103027.2477584-5-tz.stoyanov@gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200716092014.2613403-5-tz.stoyanov@gmail.com

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200722011755.158091410@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:26:33 -03:00
Tzvetomir Stoyanov (VMware)
7db6330dca libtraceevent: Fix typo in tep_plugin_add_option() description
A typo "optiona" -> "optional" is fixed in description of
tep_plugin_add_option() API.

Link: https://lore.kernel.org/r/CAM9d7cizjF+fbK7YzmsBDgrx__4YAOsmEq67D3sWET8FF+YdFA@mail.gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200714103027.2477584-4-tz.stoyanov@gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200716092014.2613403-4-tz.stoyanov@gmail.com

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200722011755.014613924@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:26:30 -03:00
Tzvetomir Stoyanov (VMware)
058612a6f7 libtraceevent: Handle strdup() error in parse_option_name()
Modified internal function parse_option_name() to return error and
handle that error in function callers in case strdup() fails.

Link: https://lore.kernel.org/r/CAM9d7cizjF+fbK7YzmsBDgrx__4YAOsmEq67D3sWET8FF+YdFA@mail.gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200714103027.2477584-3-tz.stoyanov@gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200716092014.2613403-3-tz.stoyanov@gmail.com

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200722011754.869289038@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:26:28 -03:00
Tzvetomir Stoyanov (VMware)
9dc7dc75b1 libtraceevent: Document tep_load_plugins_hook()
Add description of tep_load_plugins_hook() traceevent API. Updated
library man pages with description of the tep_load_plugins_hook() API.

Link: https://lore.kernel.org/r/CAM9d7cgLBWCrEHwz+Lhv5x5EXGcNWB0QQoeGh3OKh2JfR=dV9Q@mail.gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200714103027.2477584-2-tz.stoyanov@gmail.com
Link: https://lore.kernel.org/linux-trace-devel/20200716092014.2613403-2-tz.stoyanov@gmail.com

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200722011754.720060785@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:26:23 -03:00
Alexey Budankov
4b0297ef8a perf evsel: Extend message to mention CAP_SYS_PTRACE and perf security doc link
Adjust limited access message to mention CAP_SYS_PTRACE capability for
processes of unprivileged users. Add link to perf security document in
the end of the section about capabilities.

The change has been inspired by this discussion:
https://lore.kernel.org/lkml/20200722113007.GI77866@kernel.org/

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/6f8a7425-6e7d-19aa-1605-e59836b9e2a6@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:04:23 -03:00
Adrian Hunter
347a7389a7 perf intel-pt: Add support for decoding PSB+ only
A single q option decodes ip from only FUP/TIP packets. Make it so that
repeating the q option (i.e. qq) decodes only PSB+, getting ip if there
is a FUP packet within PSB+ (i.e. between PSB and PSBEND).

Example:

 $ perf record -e intel_pt//u grep -rI pudding drivers
 [ perf record: Woken up 52 times to write data ]
 [ perf record: Captured and wrote 57.870 MB perf.data ]
 $ time perf script --itrace=bi | wc -l
 58948289

 real    1m23.863s
 user    1m23.251s
 sys     0m7.452s
 $ time perf script --itrace=biq | wc -l
 3385694

 real    0m4.453s
 user    0m4.455s
 sys     0m0.328s
 $ time perf script --itrace=biqq | wc -l
 1883

 real    0m0.047s
 user    0m0.043s
 sys     0m0.009s

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:02:43 -03:00
Adrian Hunter
7c1b16ba0e perf intel-pt: Add support for decoding FUP/TIP only
Use the new itrace 'q' option to add support for a mode of decoding that
ignores TNT, does not walk object code, but gets the ip from FUP and TIP
packets.

Example:

 $ perf record -e intel_pt//u grep -rI pudding drivers
 [ perf record: Woken up 52 times to write data ]
 [ perf record: Captured and wrote 57.870 MB perf.data ]
 $ time perf script --itrace=bi | wc -l
 58948289

 real    1m23.863s
 user    1m23.251s
 sys     0m7.452s
 $ time perf script --itrace=biq | wc -l
 3385694

 real    0m4.453s
 user    0m4.455s
 sys     0m0.328s

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:02:14 -03:00
Adrian Hunter
51971536ef perf auxtrace: Add itrace 'q' option for quicker, less detailed decoding
The 'q' option is for modes of decoding that are quicker because they
skip or omit decoding some aspects of trace data.

If supported, the 'q' option may be repeated to increase the effect.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:24:03 -03:00
Adrian Hunter
d4575f5fce perf intel-pt: Time filter logged perf events
Change the debug logging (when used with the --time option) to time
filter logged perf events, but allow that to be overridden by using
"d+a" instead of plain "d".

That can reduce the size of the log file.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:23:19 -03:00
Adrian Hunter
8b83fccdd2 perf intel-pt: Use itrace debug log flags to suppress some messages
The "d" option may be followed by flags which affect what debug messages
will or will not be logged. Each flag must be preceded by either '+' or
'-'. The flags support by Intel PT are:

		-a	Suppress logging of perf events

Suppressing perf events is useful for decreasing the size of the log.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:23:00 -03:00
Adrian Hunter
935aac2d2d perf auxtrace: Add optional log flags to the itrace 'd' option
Allow the 'd' option to be followed by flags which will affect what debug
messages will or will not be reported. Each flag must be preceded by either
'+' or '-'. The flags are:
	a	all perf events

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:22:38 -03:00
Adrian Hunter
1d846aeb86 perf intel-pt: Use itrace error flags to suppress some errors
The itrace "e" option may be followed by flags which affect what errors
will or will not be reported.  Each flag must be preceded by either '+' or '-'.
The flags supported by Intel PT are:

		-o	Suppress overflow errors
		-l	Suppress trace data lost errors
For example, for errors but not overflow or data lost errors:

	--itrace=e-o-l

Suppressing those errors can be useful for testing and debugging because
they are not due to decoding.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:22:07 -03:00
Adrian Hunter
cb971438b7 perf auxtrace: Add optional error flags to the itrace 'e' option
Allow the 'e' option to be followed by flags which will affect what errors
will or will not be reported. Each flag must be preceded by either '+' or
'-'. The flags are:
	o	overflow
	l	trace data lost

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:21:31 -03:00
Adrian Hunter
1e8f786944 perf auxtrace: Add missing itrace options to help text
Add missing itrace options o, G and L.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:21:06 -03:00
Adrian Hunter
2c9a11af84 perf tools: Improve aux_output not supported error
For example:

 Before:
   $ perf record -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -- ls -l
   Error:
   branch-loads: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'

 After:
   $ perf record -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -- ls -l
   Error:
   branch-loads: PMU Hardware doesn't support 'aux_output' feature

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:20:17 -03:00
Adrian Hunter
a58a057ce6 perf intel-pt: Fix duplicate branch after CBR
CBR events can result in a duplicate branch event, because the state
type defaults to a branch. Fix by clearing the state type.

Example: trace 'sleep' and hope for a frequency change

 Before:

   $ perf record -e intel_pt//u sleep 0.1
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.034 MB perf.data ]
   $ perf script --itrace=bpe > before.txt

 After:

   $ perf script --itrace=bpe > after.txt
   $ diff -u before.txt after.txt
   --- before.txt  2020-07-07 14:42:18.191508098 +0300
   +++ after.txt   2020-07-07 14:42:36.587891753 +0300
   @@ -29673,7 +29673,6 @@
               sleep 93431 [007] 15411.619905:          1  branches:u:                 0 [unknown] ([unknown]) =>     7f0818abb2e0 clock_nanosleep@@GLIBC_2.17+0x0 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
               sleep 93431 [007] 15411.619905:          1  branches:u:      7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so) =>                0 [unknown] ([unknown])
               sleep 93431 [007] 15411.720069:         cbr:  cbr: 15 freq: 1507 MHz ( 56%)         7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
   -           sleep 93431 [007] 15411.720069:          1  branches:u:      7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so) =>                0 [unknown] ([unknown])
               sleep 93431 [007] 15411.720076:          1  branches:u:                 0 [unknown] ([unknown]) =>     7f0818abb30e clock_nanosleep@@GLIBC_2.17+0x2e (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
               sleep 93431 [007] 15411.720077:          1  branches:u:      7f0818abb323 clock_nanosleep@@GLIBC_2.17+0x43 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) =>     7f0818ac0eb7 __nanosleep+0x17 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
               sleep 93431 [007] 15411.720077:          1  branches:u:      7f0818ac0ebf __nanosleep+0x1f (/usr/lib/x86_64-linux-gnu/libc-2.31.so) =>     55cb7e4c2827 rpl_nanosleep+0x97 (/usr/bin/sleep)

Fixes: 91de8684f1 ("perf intel-pt: Cater for CBR change in PSB+")
Fixes: abe5a1d3e4 ("perf intel-pt: Decoder to output CBR changes immediately")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200710151104.15137-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:18:32 -03:00
Adrian Hunter
401136bb08 perf intel-pt: Fix FUP packet state
While walking code towards a FUP ip, the packet state is
INTEL_PT_STATE_FUP or INTEL_PT_STATE_FUP_NO_TIP. That was mishandled
resulting in the state becoming INTEL_PT_STATE_IN_SYNC prematurely.  The
result was an occasional lost EXSTOP event.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200710151104.15137-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:17:24 -03:00
Arnaldo Carvalho de Melo
94fb1afb14 Mgerge remote-tracking branch 'torvalds/master' into perf/core
To sync headers, for instance, in this case tools/perf was ahead of
upstream till Linus merged tip/perf/core to get the
PERF_RECORD_TEXT_POKE changes:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
  diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:15:47 -03:00
Linus Torvalds
47ec5303d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:

 1) Support 6Ghz band in ath11k driver, from Rajkumar Manoharan.

 2) Support UDP segmentation in code TSO code, from Eric Dumazet.

 3) Allow flashing different flash images in cxgb4 driver, from Vishal
    Kulkarni.

 4) Add drop frames counter and flow status to tc flower offloading,
    from Po Liu.

 5) Support n-tuple filters in cxgb4, from Vishal Kulkarni.

 6) Various new indirect call avoidance, from Eric Dumazet and Brian
    Vazquez.

 7) Fix BPF verifier failures on 32-bit pointer arithmetic, from
    Yonghong Song.

 8) Support querying and setting hardware address of a port function via
    devlink, use this in mlx5, from Parav Pandit.

 9) Support hw ipsec offload on bonding slaves, from Jarod Wilson.

10) Switch qca8k driver over to phylink, from Jonathan McDowell.

11) In bpftool, show list of processes holding BPF FD references to
    maps, programs, links, and btf objects. From Andrii Nakryiko.

12) Several conversions over to generic power management, from Vaibhav
    Gupta.

13) Add support for SO_KEEPALIVE et al. to bpf_setsockopt(), from Dmitry
    Yakunin.

14) Various https url conversions, from Alexander A. Klimov.

15) Timestamping and PHC support for mscc PHY driver, from Antoine
    Tenart.

16) Support bpf iterating over tcp and udp sockets, from Yonghong Song.

17) Support 5GBASE-T i40e NICs, from Aleksandr Loktionov.

18) Add kTLS RX HW offload support to mlx5e, from Tariq Toukan.

19) Fix the ->ndo_start_xmit() return type to be netdev_tx_t in several
    drivers. From Luc Van Oostenryck.

20) XDP support for xen-netfront, from Denis Kirjanov.

21) Support receive buffer autotuning in MPTCP, from Florian Westphal.

22) Support EF100 chip in sfc driver, from Edward Cree.

23) Add XDP support to mvpp2 driver, from Matteo Croce.

24) Support MPTCP in sock_diag, from Paolo Abeni.

25) Commonize UDP tunnel offloading code by creating udp_tunnel_nic
    infrastructure, from Jakub Kicinski.

26) Several pci_ --> dma_ API conversions, from Christophe JAILLET.

27) Add FLOW_ACTION_POLICE support to mlxsw, from Ido Schimmel.

28) Add SK_LOOKUP bpf program type, from Jakub Sitnicki.

29) Refactor a lot of networking socket option handling code in order to
    avoid set_fs() calls, from Christoph Hellwig.

30) Add rfc4884 support to icmp code, from Willem de Bruijn.

31) Support TBF offload in dpaa2-eth driver, from Ioana Ciornei.

32) Support XDP_REDIRECT in qede driver, from Alexander Lobakin.

33) Support PCI relaxed ordering in mlx5 driver, from Aya Levin.

34) Support TCP syncookies in MPTCP, from Flowian Westphal.

35) Fix several tricky cases of PMTU handling wrt. briding, from Stefano
    Brivio.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2056 commits)
  net: thunderx: initialize VF's mailbox mutex before first usage
  usb: hso: remove bogus check for EINPROGRESS
  usb: hso: no complaint about kmalloc failure
  hso: fix bailout in error case of probe
  ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM
  selftests/net: relax cpu affinity requirement in msg_zerocopy test
  mptcp: be careful on subflow creation
  selftests: rtnetlink: make kci_test_encap() return sub-test result
  selftests: rtnetlink: correct the final return value for the test
  net: dsa: sja1105: use detected device id instead of DT one on mismatch
  tipc: set ub->ifindex for local ipv6 address
  ipv6: add ipv6_dev_find()
  net: openvswitch: silence suspicious RCU usage warning
  Revert "vxlan: fix tos value before xmit"
  ptp: only allow phase values lower than 1 period
  farsync: switch from 'pci_' to 'dma_' API
  wan: wanxl: switch from 'pci_' to 'dma_' API
  hv_netvsc: do not use VF device if link is down
  dpaa2-eth: Fix passing zero to 'PTR_ERR' warning
  net: macb: Properly handle phylink on at91sam9x
  ...
2020-08-05 20:13:21 -07:00
Linus Torvalds
fffe3ae0ee hmm related patches for 5.9
This series adds reporting of the page table order from hmm_range_fault()
 and some optimization of migrate_vma():
 
 - Report the size of the page table mapping out of hmm_range_fault(). This
   makes it easier to establish a large/huge/etc mapping in the device's
   page table.
 
 - Allow devices to ignore the invalidations during migration in cases
   where the migration is not going to change pages. For instance migrating
   pages to a device does not require the device to invalidate pages
   already in the device.
 
 - Update nouveau and hmm_tests to use the above
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl8oocYACgkQOG33FX4g
 mxqd3Q/+OClUADmrI+EGJAPI7VD3EYfyZdnMCcp39AYNfySQPN9+fCMF5hVD5U7x
 KZVflR/zKUIZJVvdD8yAdrynZ1sHBG/HEzDyoaKcGzfCKq5LEAEnP5FG3xsiDjkO
 QX7w6qIGDz59gaeanQKNzqaR3DMpBwO/0D5/80DWXv+WgmxsAphanJYlo4eWyq4D
 EGq8EndCxairkTLpPlDHvFottL5kAKDXEinSAwWGQeZJkRY93vj+HZAQaeltmB1K
 SDdZr7lsEg2RhtRjzT7CkA2bkCERKL3xEc4VWaCAZw+qm8aeswADVOSo5E5F7DMI
 NUsB/p4GZ2CvIog/y3g/aSGluevdYJHTH8ip1BnNr2qCcXSEqHKsmyKpVNZztSUl
 uljyT17ZzTsdR4xj50tM27fzgDaavWrwFZTsJxUifuvAO9rHvGDVpaN8ZIU9iZei
 PTsGQvfoHDmWBWKX1dkIUGq+UoGwEAYRGk+XU0OYZCK97xmjRnGVoH0FTOk4DNQs
 +A0250oTOrvdSGiv0fNT5qpWpFsQ/84h8Lz6ubAD3okVo1bk9cFMe2argQl+E2qI
 TGM9ZHS8rphJNWwiPm8xrgf9eQ9bNp3ilCsIzBBpqZq8elwaL6a3ySieDPE734Ar
 FZEeEYTvj5Z/gXtyo/gxVKhltCc4U8kPqye9uexTInz4zBUUZOM=
 =omAU
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull hmm updates from Jason Gunthorpe:
 "Ralph has been working on nouveau's use of hmm_range_fault() and
  migrate_vma() which resulted in this small series. It adds reporting
  of the page table order from hmm_range_fault() and some optimization
  of migrate_vma():

   - Report the size of the page table mapping out of hmm_range_fault().

     This makes it easier to establish a large/huge/etc mapping in the
     device's page table.

   - Allow devices to ignore the invalidations during migration in cases
     where the migration is not going to change pages.

     For instance migrating pages to a device does not require the
     device to invalidate pages already in the device.

   - Update nouveau and hmm_tests to use the above"

* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  mm/hmm/test: use the new migration invalidation
  nouveau/svm: use the new migration invalidation
  mm/notifier: add migration invalidation type
  mm/migrate: add a flags parameter to migrate_vma
  nouveau: fix storing invalid ptes
  nouveau/hmm: support mapping large sysmem pages
  nouveau: fix mapping 2MB sysmem pages
  nouveau/hmm: fault one page at a time
  mm/hmm: add tests for hmm_pfn_to_map_order()
  mm/hmm: provide the page mapping order in hmm_range_fault()
2020-08-05 13:28:50 -07:00
Linus Torvalds
1d8ce0e093 This is the bulk of GPIO changes for the v5.9 kernel cycle:
Core changes:
 
 - Introduce the for_each_requested_gpio() macro to help in
   dependent code all over the place. Also patch a few locations
   to use it while we are at it.
 
 - Split out the sysfs code into its own file.
 
 - Split out the character device code into its own file, then
   make a set of refactorings and improvements to this code.
   We are setting the stage to revamp the userspace API a bit
   in the next cycle.
 
 - Fix a whole slew of kerneldoc that was wrong or missing.
 
 New drivers:
 
 - The PCA953x driver now supports the PCAL9535.
 
 Driver improvements:
 
 - A host of incremental modernizations and improvements to the
   PCA953x driver.
 
 - Incremental improvements to the Xilinx Zynq driver.
 
 - Some improvements to the GPIO aggregator driver.
 
 - I ran all over the place switching all threaded and other
   drivers requesting their own IRQ while using the core
   GPIO IRQ helpers to pass the GPIO irq chip as a template
   instead of calling the explicit set-up functions. Next merge
   window we may retire the old code altogether.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAl8p2cEACgkQQRCzN7AZ
 XXMjBBAAs4YwEALYTCvpzbXdYGAgMGIEAD84dPBZTAYj3P/iGuXPk/+wdt4wLHTD
 dQl0/MG6JXf+mCClgsHhC+ILPUS6YM/v/it4gnV4j6o/ugYiNPADYl4pieaq/SVb
 jNrSxs0tzZYTtlq6xSmRMCpA8SgP3ASWNa5dp5mDkuo3xaEJvM3cpsww7Q5GG3Fb
 HQD9+XgBGQx7H+6JMoNHR8/E9A5y4pwIvTlfqzTcq3UupTqlvklekgu0fTiA6D7t
 z/ydQpYtqevuirir7J7oMiIhTSPgTkOcUwNO6hCcwpO/a0q7jwy1DvGIyJDM2yIj
 UYby03SCLknFGxLwKAcavC3mnLIIor61385e5+lXBH0OS7EO8OpBoi9vAxqalcZ8
 ZRrOyYrT3et40rD+ByFFwbUsrD9D4c0Ul+Ocr8upKdnGlRW/u8YahIwn82qnKjF0
 ANJanZu76c8ZU8AJQyM/HD7hrcJ0MJXhECXb/JJx4O52ebi4fmOdtS/Kx2urH4KR
 hX3tspQM/tU89a4jW7NIzzfZKlnPI2rI3NxYgziQVgVnt+iVrkyBiYSoWhK+DEyu
 JJl5zXHvgb1UmejqN2Qtkq9V+3R2VEYg7WUyr6b48q4acSVI1q+NfGE7+GBT/+d4
 Kt1eGz8QtEnxspDhVsQNY4WSP8QpwhmWWM2ujsO6a81vyjB4vSg=
 =OSvz
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
 "This is the bulk of GPIO changes for the v5.9 kernel cycle.

  There is nothing too exciting in it, but a new macro that fixes a
  build failure on a minor ARM32 platform that appeared yesterday is
  part of it so we better merge it.

  Core changes:

   - Introduce the for_each_requested_gpio() macro to help in dependent
     code all over the place. Also patch a few locations to use it while
     we are at it.

   - Split out the sysfs code into its own file.

   - Split out the character device code into its own file, then make a
     set of refactorings and improvements to this code. We are setting
     the stage to revamp the userspace API a bit in the next cycle.

   - Fix a whole slew of kerneldoc that was wrong or missing.

  New drivers:

   - The PCA953x driver now supports the PCAL9535.

  Driver improvements:

   - A host of incremental modernizations and improvements to the
     PCA953x driver.

   - Incremental improvements to the Xilinx Zynq driver.

   - Some improvements to the GPIO aggregator driver.

   - I ran all over the place switching all threaded and other drivers
     requesting their own IRQ while using the core GPIO IRQ helpers to
     pass the GPIO irq chip as a template instead of calling the
     explicit set-up functions. Next merge window we may retire the old
     code altogether"

* tag 'gpio-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (97 commits)
  gpio: wcove: Request IRQ after all initialisation done
  gpio: crystalcove: Free IRQ on error path
  gpio: pca953x: Request IRQ after all initialisation done
  gpio: don't use same lockdep class for all devm_gpiochip_add_data users
  gpio: max732x: Use irqchip template
  gpio: stmpe: Move chip registration
  gpio: rcar: Use irqchip template
  gpio: regmap: fix type clash
  gpio: Correct kernel-doc inconsistency
  gpio: pci-idio-16: Use irqchip template
  gpio: pcie-idio-24: Use irqchip template
  gpio: 104-idio-16: Use irqchip template
  gpio: 104-idi-48: Use irqchip template
  gpio: 104-dio-48e: Use irqchip template
  gpio: ws16c48: Use irqchip template
  gpio: omap: improve coding style for pin config flags
  gpio: dln2: Use irqchip template
  gpio: sch: Add a blank line between declaration and code
  gpio: sch: changed every 'unsigned' to 'unsigned int'
  gpio: ich: changed every 'unsigned' to 'unsigned int'
  ...
2020-08-05 12:56:27 -07:00
Willem de Bruijn
16f6458f24 selftests/net: relax cpu affinity requirement in msg_zerocopy test
The msg_zerocopy test pins the sender and receiver threads to separate
cores to reduce variance between runs.

But it hardcodes the cores and skips core 0, so it fails on machines
with the selected cores offline, or simply fewer cores.

The test mainly gives code coverage in automated runs. The throughput
of zerocopy ('-z') and non-zerocopy runs is logged for manual
inspection.

Continue even when sched_setaffinity fails. Just log to warn anyone
interpreting the data.

Fixes: 07b65c5b31 ("test: add msg_zerocopy test")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-05 12:25:35 -07:00
Po-Hsu Lin
72f70c159b selftests: rtnetlink: make kci_test_encap() return sub-test result
kci_test_encap() is actually composed by two different sub-tests,
kci_test_encap_vxlan() and kci_test_encap_fou()

Therefore we should check the test result of these two in
kci_test_encap() to let the script be aware of the pass / fail status.
Otherwise it will generate false-negative result like below:
    $ sudo ./test.sh
    PASS: policy routing
    PASS: route get
    PASS: preferred_lft addresses have expired
    PASS: promote_secondaries complete
    PASS: tc htb hierarchy
    PASS: gre tunnel endpoint
    PASS: gretap
    PASS: ip6gretap
    PASS: erspan
    PASS: ip6erspan
    PASS: bridge setup
    PASS: ipv6 addrlabel
    PASS: set ifalias 5b193daf-0a08-46d7-af2c-e7aadd422ded for test-dummy0
    PASS: vrf
    PASS: vxlan
    FAIL: can't add fou port 7777, skipping test
    PASS: macsec
    PASS: bridge fdb get
    PASS: neigh get
    $ echo $?
    0

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-05 12:23:29 -07:00
Po-Hsu Lin
c2a4d27479 selftests: rtnetlink: correct the final return value for the test
The return value "ret" will be reset to 0 from the beginning of each
sub-test in rtnetlink.sh, therefore this test will always pass if the
last sub-test has passed:
    $ sudo ./rtnetlink.sh
    PASS: policy routing
    PASS: route get
    PASS: preferred_lft addresses have expired
    PASS: promote_secondaries complete
    PASS: tc htb hierarchy
    PASS: gre tunnel endpoint
    PASS: gretap
    PASS: ip6gretap
    PASS: erspan
    PASS: ip6erspan
    PASS: bridge setup
    PASS: ipv6 addrlabel
    PASS: set ifalias a39ee707-e36b-41d3-802f-63179ed4d580 for test-dummy0
    PASS: vrf
    PASS: vxlan
    FAIL: can't add fou port 7777, skipping test
    PASS: macsec
    PASS: ipsec
    3,7c3,7
    < sa[0]    spi=0x00000009 proto=0x32 salt=0x64636261 crypt=1
    < sa[0]    key=0x31323334 35363738 39303132 33343536
    < sa[1] rx ipaddr=0x00000000 00000000 00000000 c0a87b03
    < sa[1]    spi=0x00000009 proto=0x32 salt=0x64636261 crypt=1
    < sa[1]    key=0x31323334 35363738 39303132 33343536
    ---
    > sa[0]    spi=0x00000009 proto=0x32 salt=0x61626364 crypt=1
    > sa[0]    key=0x34333231 38373635 32313039 36353433
    > sa[1] rx ipaddr=0x00000000 00000000 00000000 037ba8c0
    > sa[1]    spi=0x00000009 proto=0x32 salt=0x61626364 crypt=1
    > sa[1]    key=0x34333231 38373635 32313039 36353433
    FAIL: ipsec_offload incorrect driver data
    FAIL: ipsec_offload
    PASS: bridge fdb get
    PASS: neigh get
    $ echo $?
    0

Make "ret" become a local variable for all sub-tests.
Also, check the sub-test results in kci_test_rtnl() and return the
final result for this test.

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-05 12:23:28 -07:00
Linus Torvalds
ecfd7940b8 USB/Thunderbolt patches for 5.9-rc1
Here is the large set of USB and Thunderbolt patches for 5.9-rc1.
 
 Nothing really magic/major in here, just lots of little changes and
 updates:
 	- clean up language usages in USB core and some drivers
 	- Thunderbolt driver updates and additions
 	- USB Gadget driver updates
 	- dwc3 driver updates (like always...)
 	- build with "W=1" warning fixups
 	- mtu3 driver updates
 	- usb-serial driver updates and device ids
 	- typec additions and updates for new hardware
 	- xhci debug code updates for future platforms
 	- cdns3 driver updates
 	- lots of other minor driver updates and fixes and cleanups
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXymciA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylVVwCfU7JxgjFhAJTzC9K5efVqsrSHzxQAnijHrqUn
 pHgI9M1ZRVGwmv2RNWb3
 =9/8I
 -----END PGP SIGNATURE-----

Merge tag 'usb-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB/Thunderbolt updates from Greg KH:
 "Here is the large set of USB and Thunderbolt patches for 5.9-rc1.

  Nothing really magic/major in here, just lots of little changes and
  updates:

   - clean up language usages in USB core and some drivers

   - Thunderbolt driver updates and additions

   - USB Gadget driver updates

   - dwc3 driver updates (like always...)

   - build with "W=1" warning fixups

   - mtu3 driver updates

   - usb-serial driver updates and device ids

   - typec additions and updates for new hardware

   - xhci debug code updates for future platforms

   - cdns3 driver updates

   - lots of other minor driver updates and fixes and cleanups

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (330 commits)
  usb: common: usb-conn-gpio: Register charger
  usb: mtu3: simplify mtu3_req_complete()
  usb: mtu3: clear dual mode of u3port when disable device
  usb: mtu3: use MTU3_EP_WEDGE flag
  usb: mtu3: remove useless member @busy in mtu3_ep struct
  usb: mtu3: remove repeated error log
  usb: mtu3: add ->udc_set_speed()
  usb: mtu3: introduce a funtion to check maximum speed
  usb: mtu3: clear interrupts status when disable interrupts
  usb: mtu3: reinitialize CSR registers
  usb: mtu3: fix macro for maximum number of packets
  usb: mtu3: remove unnecessary pointer checks
  usb: xhci: Fix ASMedia ASM1142 DMA addressing
  usb: xhci: define IDs for various ASMedia host controllers
  usb: musb: convert to devm_platform_ioremap_resource_byname
  usb: gadget: tegra-xudc: convert to devm_platform_ioremap_resource_byname
  usb: gadget: r8a66597: convert to devm_platform_ioremap_resource_byname
  usb: dwc3: convert to devm_platform_ioremap_resource_byname
  usb: cdns3: convert to devm_platform_ioremap_resource_byname
  usb: phy: am335x: convert to devm_platform_ioremap_resource_byname
  ...
2020-08-05 12:13:10 -07:00
Linus Torvalds
dd27111e32 Driver core changes for 5.9-rc1
Here is the "big" set of changes to the driver core, and some drivers
 using the changes, for 5.9-rc1.
 
 "Biggest" thing in here is the device link exposure in sysfs, to help
 to tame the madness that is SoC device tree representations and driver
 interactions with it.
 
 Other stuff in here that is interesting is:
 	- device probe log helper so that drivers can report problems in
 	  a unified way easier.
 	- devres functions added
 	- DEVICE_ATTR_ADMIN_* macro added to make it harder to write
 	  incorrect sysfs file permissions
 	- documentation cleanups
 	- ability for debugfs to be present in the kernel, yet not
 	  exposed to userspace.  Needed for systems that want it
 	  enabled, but do not trust users, so they can still use some
 	  kernel functions that were otherwise disabled.
 	- other minor fixes and cleanups
 
 The patches outside of drivers/base/ all have acks from the respective
 subsystem maintainers to go through this tree instead of theirs.
 
 All of these have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXylhOQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylGdACeKqxm8IIDZycj0QjLUlPiEwVIROgAnjpf5jAB
 mb4jMvgEGsB6/FwxypPG
 =RUss
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the "big" set of changes to the driver core, and some drivers
  using the changes, for 5.9-rc1.

  "Biggest" thing in here is the device link exposure in sysfs, to help
  to tame the madness that is SoC device tree representations and driver
  interactions with it.

  Other stuff in here that is interesting is:

   - device probe log helper so that drivers can report problems in a
     unified way easier.

   - devres functions added

   - DEVICE_ATTR_ADMIN_* macro added to make it harder to write
     incorrect sysfs file permissions

   - documentation cleanups

   - ability for debugfs to be present in the kernel, yet not exposed to
     userspace. Needed for systems that want it enabled, but do not
     trust users, so they can still use some kernel functions that were
     otherwise disabled.

   - other minor fixes and cleanups

  The patches outside of drivers/base/ all have acks from the respective
  subsystem maintainers to go through this tree instead of theirs.

  All of these have been in linux-next with no reported issues"

* tag 'driver-core-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (39 commits)
  drm/bridge: lvds-codec: simplify error handling
  drm/bridge/sii8620: fix resource acquisition error handling
  driver core: add deferring probe reason to devices_deferred property
  driver core: add device probe log helper
  driver core: Avoid binding drivers to dead devices
  Revert "test_firmware: Test platform fw loading on non-EFI systems"
  firmware_loader: EFI firmware loader must handle pre-allocated buffer
  selftest/firmware: Add selftest timeout in settings
  test_firmware: Test platform fw loading on non-EFI systems
  driver core: Change delimiter in devlink device's name to "--"
  debugfs: Add access restriction option
  tracefs: Remove unnecessary debug_fs checks.
  driver core: Fix probe_count imbalance in really_probe()
  kobject: remove unused KOBJ_MAX action
  driver core: Fix sleeping in invalid context during device link deletion
  driver core: Add waiting_for_supplier sysfs file for devices
  driver core: Add state_synced sysfs file for devices that support it
  driver core: Expose device link details in sysfs
  driver core: Drop mention of obsolete bus rwsem from kernel-doc
  debugfs: file: Remove unnecessary cast in kfree()
  ...
2020-08-05 11:52:17 -07:00
Linus Torvalds
1785d11612 Char/Misc driver patches for 5.9-rc1
Here is the large set of char and misc and other driver subsystem
 patches for 5.9-rc1.  Lots of new driver submissions in here, and
 cleanups and features for existing drivers.
 
 Highlights are:
 	- habanalabs driver updates
 	- coresight driver updates
 	- nvmem driver updates
 	- huge number of "W=1" build warning cleanups from Lee Jones
 	- dyndbg updates
 	- virtbox driver fixes and updates
 	- soundwire driver updates
 	- mei driver updates
 	- phy driver updates
 	- fpga driver updates
 	- lots of smaller individual misc/char driver cleanups and fixes
 
 Full details are in the shortlog.
 
 All of these have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXylccQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymofgCfZ1CxNWd0ZVM0YIn8cY9gO6ON7MsAnRq48hvn
 Vjf4rKM73GC11bVF4Gyy
 =Xq1R
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the large set of char and misc and other driver subsystem
  patches for 5.9-rc1. Lots of new driver submissions in here, and
  cleanups and features for existing drivers.

  Highlights are:
   - habanalabs driver updates
   - coresight driver updates
   - nvmem driver updates
   - huge number of "W=1" build warning cleanups from Lee Jones
   - dyndbg updates
   - virtbox driver fixes and updates
   - soundwire driver updates
   - mei driver updates
   - phy driver updates
   - fpga driver updates
   - lots of smaller individual misc/char driver cleanups and fixes

  Full details are in the shortlog.

  All of these have been in linux-next with no reported issues"

* tag 'char-misc-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (322 commits)
  habanalabs: remove unused but set variable 'ctx_asid'
  nvmem: qcom-spmi-sdam: Enable multiple devices
  dt-bindings: nvmem: SID: add binding for A100's SID controller
  nvmem: update Kconfig description
  nvmem: qfprom: Add fuse blowing support
  dt-bindings: nvmem: Add properties needed for blowing fuses
  dt-bindings: nvmem: qfprom: Convert to yaml
  nvmem: qfprom: use NVMEM_DEVID_AUTO for multiple instances
  nvmem: core: add support to auto devid
  nvmem: core: Add nvmem_cell_read_u8()
  nvmem: core: Grammar fixes for help text
  nvmem: sc27xx: add sc2730 efuse support
  nvmem: Enforce nvmem stride in the sysfs interface
  MAINTAINERS: Add git tree for NVMEM FRAMEWORK
  nvmem: sprd: Fix return value of sprd_efuse_probe()
  drivers: android: Fix the SPDX comment style
  drivers: android: Fix a variable declaration coding style issue
  drivers: android: Remove braces for a single statement if-else block
  drivers: android: Remove the use of else after return
  drivers: android: Fix a variable declaration coding style issue
  ...
2020-08-05 11:43:47 -07:00
Linus Torvalds
4834ce9d8e linux-kselftest-5.9-rc1
This Kselftest update for Linux 5.9-rc1 consists of
 
 - TAP output reporting related fixes from Paolo Bonzini and Kees Cook.
   These fixes make it skip reporting consistent with TAP format.
 - Cleanup fixes to framework run_tests from Yauheni Kaliuta
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl8p1akACgkQCwJExA0N
 QxwqKg/9HzbnpWeb736HAjeA3v0LFuPte8TELSerjKqfas+g9xQxhf+ReHaZXz9i
 KnhBPYyOb57DjnT7Mi7c5qGYzLKCvBF30OR0M1P5lRWBX3SC0VvkgdHzLpksIsHx
 1wIht3gClvdOnHeHWWRG374iCbw86po8Xa5V9KOJOofbEKjctYjiShnd3OXXNLdJ
 h1/Ro+LffdsqO7VWoJruSuBplCXAVIr8IpUnOhtw/JTGlK7csNHBdHb7KTBm+zKU
 i0+f6H5uxRM3BA793OHen9D6kHAVLzhtPc7O0O1IhNRKnDgzY/UIS0qSGpxzD6KG
 +ZZ4FpvyfN2EPqgGegjdTNhQjPrjXpPos46FTNOM/qiQNYvCuvUFjRCAzVuN9cqU
 QzXdNPUOI7YLpOYpqLNqeefTXhZUxCWPF33oHPhU28W59qYpqQDRRDoOu7l9e7KQ
 DMwIKCaSw5qCB7S6x5LqOiQcc4/3xIbJVjO/CQU7G4cuREkAHqUvc2rfwCHcn60e
 rt2h/v2rnHzie4kTEukWKPuyHL64CL0jYTdHtFb/PbavnfbEPe3t6GiGu2DJN680
 mem6+9Q12KTtY4VY6enBE4YHVlxQPksbp/o5G91evNuFul1ZsAFgWdR5t+iiMrCU
 d6u2m5H6dMShZYqTw98MU/PKAh31sQCNZqhruhZMgUOJEV7lP/s=
 =4uja
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest updates form Shuah Khan:

 - TAP output reporting related fixes from Paolo Bonzini and Kees Cook.

   These fixes make it skip reporting consistent with TAP format.

 - Cleanup fixes to framework run_tests from Yauheni Kaliuta

* tag 'linux-kselftest-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (23 commits)
  selftests/harness: Limit step counter reporting
  selftests/seccomp: Check ENOSYS under tracing
  selftests/seccomp: Refactor to use fixture variants
  selftests/harness: Clean up kern-doc for fixtures
  selftests: kmod: Add module address visibility test
  Replace HTTP links with HTTPS ones: KMOD KERNEL MODULE LOADER - USERMODE HELPER
  selftests: fix condition in run_tests
  selftests: do not use .ONESHELL
  selftests: pidfd: skip test if unshare fails with EPERM
  selftests: pidfd: do not use ksft_exit_skip after ksft_set_plan
  selftests/harness: Report skip reason
  selftests/harness: Display signed values correctly
  selftests/harness: Refactor XFAIL into SKIP
  selftests/harness: Switch to TAP output
  selftests: Add header documentation and helpers
  selftests/binderfs: Fix harness API usage
  selftests: Remove unneeded selftest API headers
  selftests/clone3: Reorder reporting output
  selftests: sync_test: do not use ksft_exit_skip after ksft_set_plan
  selftests: sigaltstack: do not use ksft_exit_skip after ksft_set_plan
  ...
2020-08-05 10:28:25 -07:00
Linus Torvalds
53e5504bdb linux-kselftest-kunit-5.9-rc1
This Kunit update for Linux 5.9-rc1 consists of:
 
 - Adds a generic kunit_resource API extending it to support
   resources that are passed in to kunit in addition kunit
   allocated resources. In addition, KUnit resources are now
   refcounted to avoid passed in resources being released while
   in use by kunit.
 
 - Add support for named resources.
 
 - Important bug fixes from Brendan Higgins and Will Chen
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl8pyI4ACgkQCwJExA0N
 QxwQmQ//YXvS1IMt4yLQ7iTrhSK2CavzPi/zX+1bgXj7RfEPrdolF12tcHf7b/Ef
 UwhG/PTEE6VkBWFuprleGk0fvVW9h13K+2RkN6wkx8rmTMunMhqmMIRj4neGQWQH
 LV9bYcc5HBH36jbSnJYMFDJdsATq3mmPMeknzkJf/qZgWxczBLo3pUHPNRuhZWq7
 MaRUutpo1f3oSroxhcY8gouRjjPq+yUCip9EeQfdZc/LV9tqjwhjfo+zRKAY3jGf
 0KL00wv4NMTNBSgLkxXFrTcyP28Y84MZB+wE3ikyE5a5wzzvMXuEtZhJk9a9q2ZJ
 tZI+ICCjnTa+E3pN93F2w7cwyrQysxNH1tpK0qv1hGXLRuxOGCJ993AeUtIWjnyV
 CYh/OGK9o92pRuTmoJnMNk9em1WF45YTtvhzpadPiBGb84nbxYSAlrDmm/4xC89p
 7PPpkkVxFReMiu1fIyXOV++y33YFMDtoWe/Qn3pLBRpDDs/XWdK0eT4dVWx1/ttl
 BdtbAxXafApVI5lGk1qPoGC1MO9XvaAidQP6PULD/rhw2Ww6rtlvwGVPffx/omkn
 zgzPTxH1rBehzKPcEaorQqqR6yGXxb4v+gJi1Tg8qRE51Lw619dfyZlJO7axp2h6
 7RO3BOVrIkwLP9XT2NPlL6t+RqbQXezGoZjigLmBRlCSvs1lbLs=
 =kIIk
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit updates from Shuah Khan:

 - Add a generic kunit_resource API extending it to support resources
   that are passed in to kunit in addition kunit allocated resources. In
   addition, KUnit resources are now refcounted to avoid passed in
   resources being released while in use by kunit.

 - Add support for named resources.

 - Important bug fixes from Brendan Higgins and Will Chen

* tag 'linux-kselftest-kunit-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: tool: fix improper treatment of file location
  kunit: tool: fix broken default args in unit tests
  kunit: capture stderr on all make subprocess calls
  Documentation: kunit: Remove references to --defconfig
  kunit: add support for named resources
  kunit: generalize kunit_resource API beyond allocated resources
2020-08-05 10:07:39 -07:00
Linus Torvalds
2324d50d05 It's been a busy cycle for documentation - hopefully the busiest for a
while to come.  Changes include:
 
  - Some new Chinese translations
 
  - Progress on the battle against double words words and non-HTTPS URLs
 
  - Some block-mq documentation
 
  - More RST conversions from Mauro.  At this point, that task is
    essentially complete, so we shouldn't see this kind of churn again for a
    while.  Unless we decide to switch to asciidoc or something...:)
 
  - Lots of typo fixes, warning fixes, and more.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl8oVkwPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YoW8H/jJ/xnXFn7tkgVPQAlL3k5HCnK7A5nDP9RVR
 cg1pTx1cEFdjzxPlJyExU6/v+AImOvtweHXC+JDK7YcJ6XFUNYXJI3LxL5KwUXbY
 BL/xRFszDSXH2C7SJF5GECcFYp01e/FWSLN3yWAh+g+XwsKiTJ8q9+CoIDkHfPGO
 7oQsHKFu6s36Af0LfSgxk4sVB7EJbo8e4psuPsP5SUrl+oXRO43Put0rXkR4yJoH
 9oOaB51Do5fZp8I4JVAqGXvpXoExyLMO4yw0mASm6YSZ3KyjR8Fae+HD9Cq4ZuwY
 0uzb9K+9NEhqbfwtyBsi99S64/6Zo/MonwKwevZuhtsDTK4l4iU=
 =JQLZ
 -----END PGP SIGNATURE-----

Merge tag 'docs-5.9' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "It's been a busy cycle for documentation - hopefully the busiest for a
  while to come. Changes include:

   - Some new Chinese translations

   - Progress on the battle against double words words and non-HTTPS
     URLs

   - Some block-mq documentation

   - More RST conversions from Mauro. At this point, that task is
     essentially complete, so we shouldn't see this kind of churn again
     for a while. Unless we decide to switch to asciidoc or
     something...:)

   - Lots of typo fixes, warning fixes, and more"

* tag 'docs-5.9' of git://git.lwn.net/linux: (195 commits)
  scripts/kernel-doc: optionally treat warnings as errors
  docs: ia64: correct typo
  mailmap: add entry for <alobakin@marvell.com>
  doc/zh_CN: add cpu-load Chinese version
  Documentation/admin-guide: tainted-kernels: fix spelling mistake
  MAINTAINERS: adjust kprobes.rst entry to new location
  devices.txt: document rfkill allocation
  PCI: correct flag name
  docs: filesystems: vfs: correct flag name
  docs: filesystems: vfs: correct sync_mode flag names
  docs: path-lookup: markup fixes for emphasis
  docs: path-lookup: more markup fixes
  docs: path-lookup: fix HTML entity mojibake
  CREDITS: Replace HTTP links with HTTPS ones
  docs: process: Add an example for creating a fixes tag
  doc/zh_CN: add Chinese translation prefer section
  doc/zh_CN: add clearing-warn-once Chinese version
  doc/zh_CN: add admin-guide index
  doc:it_IT: process: coding-style.rst: Correct __maybe_unused compiler label
  futex: MAINTAINERS: Re-add selftests directory
  ...
2020-08-04 22:47:54 -07:00
Linus Torvalds
4da9f33026 Support for FSGSBASE. Almost 5 years after the first RFC to support it,
this has been brought into a shape which is maintainable and actually
 works.
 
 This final version was done by Sasha Levin who took it up after Intel
 dropped the ball. Sasha discovered that the SGX (sic!) offerings out there
 ship rogue kernel modules enabling FSGSBASE behind the kernels back which
 opens an instantanious unpriviledged root hole.
 
 The FSGSBASE instructions provide a considerable speedup of the context
 switch path and enable user space to write GSBASE without kernel
 interaction. This enablement requires careful handling of the exception
 entries which go through the paranoid entry path as they cannot longer rely
 on the assumption that user GSBASE is positive (as enforced via prctl() on
 non FSGSBASE enabled systemn). All other entries (syscalls, interrupts and
 exceptions) can still just utilize SWAPGS unconditionally when the entry
 comes from user space. Converting these entries to use FSGSBASE has no
 benefit as SWAPGS is only marginally slower than WRGSBASE and locating and
 retrieving the kernel GSBASE value is not a free operation either. The real
 benefit of RD/WRGSBASE is the avoidance of the MSR reads and writes.
 
 The changes come with appropriate selftests and have held up in field
 testing against the (sanitized) Graphene-SGX driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8pGnoTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTYJD/9873GkwvGcc/Vq/dJH1szGTgFftPyZ
 c/Y9gzx7EGBPLo25BS820L+ZlynzXHDxExKfCEaD10TZfe5XIc1vYNR0J74M2NmK
 IBgEDstJeW93ai+rHCFRXIevhpzU4GgGYJ1MeeOgbVMN3aGU1g6HfzMvtF0fPn8Y
 n6fsLZa43wgnoTdjwjjikpDTrzoZbaL1mbODBzBVPAaTbim7IKKTge6r/iCKrOjz
 Uixvm3g9lVzx52zidJ9kWa8esmbOM1j0EPe7/hy3qH9DFo87KxEzjHNH3T6gY5t6
 NJhRAIfY+YyTHpPCUCshj6IkRudE6w/qjEAmKP9kWZxoJrvPCTWOhCzelwsFS9b9
 gxEYfsnaKhsfNhB6fi0PtWlMzPINmEA7SuPza33u5WtQUK7s1iNlgHfvMbjstbwg
 MSETn4SG2/ZyzUrSC06lVwV8kh0RgM3cENc/jpFfIHD0vKGI3qfka/1RY94kcOCG
 AeJd0YRSU2RqL7lmxhHyG8tdb8eexns41IzbPCLXX2sF00eKNkVvMRYT2mKfKLFF
 q8v1x7yuwmODdXfFR6NdCkGm9IU7wtL6wuQ8Nhu9UraFmcXo6X6FLJC18FqcvSb9
 jvcRP4XY/8pNjjf44JB8yWfah0xGQsaMIKQGP4yLv4j6Xk1xAQKH1MqcC7l1D2HN
 5Z24GibFqSK/vA==
 =QaAN
 -----END PGP SIGNATURE-----

Merge tag 'x86-fsgsbase-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fsgsbase from Thomas Gleixner:
 "Support for FSGSBASE. Almost 5 years after the first RFC to support
  it, this has been brought into a shape which is maintainable and
  actually works.

  This final version was done by Sasha Levin who took it up after Intel
  dropped the ball. Sasha discovered that the SGX (sic!) offerings out
  there ship rogue kernel modules enabling FSGSBASE behind the kernels
  back which opens an instantanious unpriviledged root hole.

  The FSGSBASE instructions provide a considerable speedup of the
  context switch path and enable user space to write GSBASE without
  kernel interaction. This enablement requires careful handling of the
  exception entries which go through the paranoid entry path as they
  can no longer rely on the assumption that user GSBASE is positive (as
  enforced via prctl() on non FSGSBASE enabled systemn).

  All other entries (syscalls, interrupts and exceptions) can still just
  utilize SWAPGS unconditionally when the entry comes from user space.
  Converting these entries to use FSGSBASE has no benefit as SWAPGS is
  only marginally slower than WRGSBASE and locating and retrieving the
  kernel GSBASE value is not a free operation either. The real benefit
  of RD/WRGSBASE is the avoidance of the MSR reads and writes.

  The changes come with appropriate selftests and have held up in field
  testing against the (sanitized) Graphene-SGX driver"

* tag 'x86-fsgsbase-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  x86/fsgsbase: Fix Xen PV support
  x86/ptrace: Fix 32-bit PTRACE_SETREGS vs fsbase and gsbase
  selftests/x86/fsgsbase: Add a missing memory constraint
  selftests/x86/fsgsbase: Fix a comment in the ptrace_write_gsbase test
  selftests/x86: Add a syscall_arg_fault_64 test for negative GSBASE
  selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE
  selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write
  Documentation/x86/64: Add documentation for GS/FS addressing mode
  x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2
  x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit
  x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit
  x86/entry/64: Introduce the FIND_PERCPU_BASE macro
  x86/entry/64: Switch CR3 before SWAPGS in paranoid entry
  x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation
  x86/process/64: Use FSGSBASE instructions on thread copy and ptrace
  x86/process/64: Use FSBSBASE in switch_to() if available
  x86/process/64: Make save_fsgs_for_kvm() ready for FSGSBASE
  x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
  x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions
  x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
  ...
2020-08-04 21:16:22 -07:00
Linus Torvalds
4f30a60aa7 close-range-v5.9
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXygcpgAKCRCRxhvAZXjc
 ogPeAQDv1ncqtNroFAC4pJ4tQhH7JSjW0OltiMk/AocY/J2SdQD9GJ15luYJ0/om
 697q/Z68sndRynhdoZlMuf3oYuBlHQw=
 =3ZhE
 -----END PGP SIGNATURE-----

Merge tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull close_range() implementation from Christian Brauner:
 "This adds the close_range() syscall. It allows to efficiently close a
  range of file descriptors up to all file descriptors of a calling
  task.

  This is coordinated with the FreeBSD folks which have copied our
  version of this syscall and in the meantime have already merged it in
  April 2019:

    https://reviews.freebsd.org/D21627
    https://svnweb.freebsd.org/base?view=revision&revision=359836

  The syscall originally came up in a discussion around the new mount
  API and making new file descriptor types cloexec by default. During
  this discussion, Al suggested the close_range() syscall.

  First, it helps to close all file descriptors of an exec()ing task.
  This can be done safely via (quoting Al's example from [1] verbatim):

        /* that exec is sensitive */
        unshare(CLONE_FILES);
        /* we don't want anything past stderr here */
        close_range(3, ~0U);
        execve(....);

  The code snippet above is one way of working around the problem that
  file descriptors are not cloexec by default. This is aggravated by the
  fact that we can't just switch them over without massively regressing
  userspace. For a whole class of programs having an in-kernel method of
  closing all file descriptors is very helpful (e.g. demons, service
  managers, programming language standard libraries, container managers
  etc.).

  Second, it allows userspace to avoid implementing closing all file
  descriptors by parsing through /proc/<pid>/fd/* and calling close() on
  each file descriptor and other hacks. From looking at various
  large(ish) userspace code bases this or similar patterns are very
  common in service managers, container runtimes, and programming
  language runtimes/standard libraries such as Python or Rust.

  In addition, the syscall will also work for tasks that do not have
  procfs mounted and on kernels that do not have procfs support compiled
  in. In such situations the only way to make sure that all file
  descriptors are closed is to call close() on each file descriptor up
  to UINT_MAX or RLIMIT_NOFILE, OPEN_MAX trickery.

  Based on Linus' suggestion close_range() also comes with a new flag
  CLOSE_RANGE_UNSHARE to more elegantly handle file descriptor dropping
  right before exec. This would usually be expressed in the sequence:

        unshare(CLONE_FILES);
        close_range(3, ~0U);

  as pointed out by Linus it might be desirable to have this be a part
  of close_range() itself under a new flag CLOSE_RANGE_UNSHARE which
  gets especially handy when we're closing all file descriptors above a
  certain threshold.

  Test-suite as always included"

* tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  tests: add CLOSE_RANGE_UNSHARE tests
  close_range: add CLOSE_RANGE_UNSHARE
  tests: add close_range() tests
  arch: wire-up close_range()
  open: add close_range()
2020-08-04 15:12:02 -07:00
Linus Torvalds
74858abbb1 cap-checkpoint-restore-v5.9
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXygegQAKCRCRxhvAZXjc
 olWZAQCMPbhI/20LA3OYJ6s+BgBEnm89PymvlHcym6Z4AvTungD+KqZonIYuxWgi
 6Ttlv/fzgFFbXgJgbuass5mwFVoN5wM=
 =oK7d
 -----END PGP SIGNATURE-----

Merge tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull checkpoint-restore updates from Christian Brauner:
 "This enables unprivileged checkpoint/restore of processes.

  Given that this work has been going on for quite some time the first
  sentence in this summary is hopefully more exciting than the actual
  final code changes required. Unprivileged checkpoint/restore has seen
  a frequent increase in interest over the last two years and has thus
  been one of the main topics for the combined containers &
  checkpoint/restore microconference since at least 2018 (cf. [1]).

  Here are just the three most frequent use-cases that were brought forward:

   - The JVM developers are integrating checkpoint/restore into a Java
     VM to significantly decrease the startup time.

   - In high-performance computing environment a resource manager will
     typically be distributing jobs where users are always running as
     non-root. Long-running and "large" processes with significant
     startup times are supposed to be checkpointed and restored with
     CRIU.

   - Container migration as a non-root user.

  In all of these scenarios it is either desirable or required to run
  without CAP_SYS_ADMIN. The userspace implementation of
  checkpoint/restore CRIU already has the pull request for supporting
  unprivileged checkpoint/restore up (cf. [2]).

  To enable unprivileged checkpoint/restore a new dedicated capability
  CAP_CHECKPOINT_RESTORE is introduced. This solution has last been
  discussed in 2019 in a talk by Google at Linux Plumbers (cf. [1]
  "Update on Task Migration at Google Using CRIU") with Adrian and
  Nicolas providing the implementation now over the last months. In
  essence, this allows the CRIU binary to be installed with the
  CAP_CHECKPOINT_RESTORE vfs capability set thereby enabling
  unprivileged users to restore processes.

  To make this possible the following permissions are altered:

   - Selecting a specific PID via clone3() set_tid relaxed from userns
     CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE.

   - Selecting a specific PID via /proc/sys/kernel/ns_last_pid relaxed
     from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE.

   - Accessing /proc/pid/map_files relaxed from init userns
     CAP_SYS_ADMIN to init userns CAP_CHECKPOINT_RESTORE.

   - Changing /proc/self/exe from userns CAP_SYS_ADMIN to userns
     CAP_CHECKPOINT_RESTORE.

  Of these four changes the /proc/self/exe change deserves a few words
  because the reasoning behind even restricting /proc/self/exe changes
  in the first place is just full of historical quirks and tracking this
  down was a questionable version of fun that I'd like to spare others.

  In short, it is trivial to change /proc/self/exe as an unprivileged
  user, i.e. without userns CAP_SYS_ADMIN right now. Either via ptrace()
  or by simply intercepting the elf loader in userspace during exec.
  Nicolas was nice enough to even provide a POC for the latter (cf. [3])
  to illustrate this fact.

  The original patchset which introduced PR_SET_MM_MAP had no
  permissions around changing the exe link. They too argued that it is
  trivial to spoof the exe link already which is true. The argument
  brought up against this was that the Tomoyo LSM uses the exe link in
  tomoyo_manager() to detect whether the calling process is a policy
  manager. This caused changing the exe links to be guarded by userns
  CAP_SYS_ADMIN.

  All in all this rather seems like a "better guard it with something
  rather than nothing" argument which imho doesn't qualify as a great
  security policy. Again, because spoofing the exe link is possible for
  the calling process so even if this were security relevant it was
  broken back then and would be broken today. So technically, dropping
  all permissions around changing the exe link would probably be
  possible and would send a clearer message to any userspace that relies
  on /proc/self/exe for security reasons that they should stop doing
  this but for now we're only relaxing the exe link permissions from
  userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE.

  There's a final uapi change in here. Changing the exe link used to
  accidently return EINVAL when the caller lacked the necessary
  permissions instead of the more correct EPERM. This pr contains a
  commit fixing this. I assume that userspace won't notice or care and
  if they do I will revert this commit. But since we are changing the
  permissions anyway it seems like a good opportunity to try this fix.

  With these changes merged unprivileged checkpoint/restore will be
  possible and has already been tested by various users"

[1] LPC 2018
     1. "Task Migration at Google Using CRIU"
        https://www.youtube.com/watch?v=yI_1cuhoDgA&t=12095
     2. "Securely Migrating Untrusted Workloads with CRIU"
        https://www.youtube.com/watch?v=yI_1cuhoDgA&t=14400
     LPC 2019
     1. "CRIU and the PID dance"
         https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=2m48s
     2. "Update on Task Migration at Google Using CRIU"
        https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=1h2m8s

[2] https://github.com/checkpoint-restore/criu/pull/1155

[3] https://github.com/nviennot/run_as_exe

* tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  selftests: add clone3() CAP_CHECKPOINT_RESTORE test
  prctl: exe link permission error changed from -EINVAL to -EPERM
  prctl: Allow local CAP_CHECKPOINT_RESTORE to change /proc/self/exe
  proc: allow access in init userns for map_files with CAP_CHECKPOINT_RESTORE
  pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid
  pid: use checkpoint_restore_ns_capable() for set_tid
  capabilities: Introduce CAP_CHECKPOINT_RESTORE
2020-08-04 15:02:07 -07:00
Linus Torvalds
0a72761b27 threads-v5.9
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXygcLwAKCRCRxhvAZXjc
 ohajAP4n5E3BmN0jpIviXT4eNhP62jzxJtxlVXtgGT3D8b1mpQEA5n8NSOlQLoAh
 yUGsjtwR9xDcHMcrhXD3yN6eYJSK0A8=
 =tn4R
 -----END PGP SIGNATURE-----

Merge tag 'threads-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull thread updates from Christian Brauner:
 "This contains the changes to add the missing support for attaching to
  time namespaces via pidfds.

  Last cycle setns() was changed to support attaching to multiple
  namespaces atomically. This requires all namespaces to have a point of
  no return where they can't fail anymore.

  Specifically, <namespace-type>_install() is allowed to perform
  permission checks and install the namespace into the new struct nsset
  that it has been given but it is not allowed to make visible changes
  to the affected task. Once <namespace-type>_install() returns,
  anything that the given namespace type additionally requires to be
  setup needs to ideally be done in a function that can't fail or if it
  fails the failure must be non-fatal.

  For time namespaces the relevant functions that fell into this
  category were timens_set_vvar_page() and vdso_join_timens(). The
  latter could still fail although it didn't need to. This function is
  only implemented for vdso_join_timens() in current mainline. As
  discussed on-list (cf. [1]), in order to make setns() support time
  namespaces when attaching to multiple namespaces at once properly we
  changed vdso_join_timens() to always succeed. So vdso_join_timens()
  replaces the mmap_write_lock_killable() with mmap_read_lock().

  Please note that arm is about to grow vdso support for time namespaces
  (possibly this merge window). We've synced on this change and arm64
  also uses mmap_read_lock(), i.e. makes vdso_join_timens() a function
  that can't fail. Once the changes here and the arm64 changes have
  landed, vdso_join_timens() should be turned into a void function so
  it's obvious to callers and implementers on other architectures that
  the expectation is that it can't fail.

  We didn't do this right away because it would've introduced
  unnecessary merge conflicts between the two trees for no major gain.

  As always, tests included"

[1]: https://lore.kernel.org/lkml/20200611110221.pgd3r5qkjrjmfqa2@wittgenstein

* tag 'threads-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  tests: add CLONE_NEWTIME setns tests
  nsproxy: support CLONE_NEWTIME with setns()
  timens: add timens_commit() helper
  timens: make vdso_join_timens() always succeed
2020-08-04 14:40:07 -07:00
Linus Torvalds
9ecc6ea491 seccomp updates for v5.9-rc1
- Improved selftest coverage, timeouts, and reporting
 - Add EPOLLHUP support for SECCOMP_RET_USER_NOTIF (Christian Brauner)
 - Refactor __scm_install_fd() into __receive_fd() and fix buggy callers
 - Introduce "addfd" command for SECCOMP_RET_USER_NOTIF (Sargun Dhillon)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl8oZcQWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJomDD/4x3j7eXREcXDsHOmlgEaHWGx4l
 JldHFQhV5GjmD7gOkPcoZSG7NfG7F6VpwAJg7ZoR3qUkem7K8DFucxqgo1RldCot
 nigleeLX6JeMS0Z+iwjAVZd+5t4xG4J/7GGDHIIMiG5qvwJ0Yf64o1bkjaB2Q/Bv
 tluBg0WF32kFMG/ZwyY/V2QDbbue97CFPflybOh1o2nWbVzmUlFEEum3UUvZsxc8
 smMsattJyuAV7kcEKzKrs8b010NdFZqwdbub5Np9W3XEXGBYMdIPoNsOQGmB9wby
 j2ui0lzboXRG997jM7TCd1l/XZAv8aAwvPplw3FJRybzkOGs9NDyLMoz87yJpR1T
 xp511vnMyMbyKIGdungkt7cIyzaictHwaYzznsmuNdCPEjTaIQJr1ctsa4GEgtqf
 pnkktZ9YbMCcHU0CtZ8GlOVqA9wE+FUm0/u0zgikzJQsB+HcNItiARTTTHRyco7p
 VJCqK8o4Zx4ELV7QNkSH4nhFkVgRopvrvBiPAGro/qwGOofBg8W8wM8O1+V/MDmp
 zSU22v4SncT1Xb7dtmdJqDEeHfDikhaCAb4Je2hsGQWzbdAqwHGlpa7vpk9x3Q5r
 L+XyP+Z+rPHlXYyypJwUvvOQhXOmP0zYxcEHxByqIBfXiwy+3dN4tDDfatWbccwl
 uTlTDM8kmQn6QzSztA==
 =yb55
 -----END PGP SIGNATURE-----

Merge tag 'seccomp-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp updates from Kees Cook:
 "There are a bunch of clean ups and selftest improvements along with
  two major updates to the SECCOMP_RET_USER_NOTIF filter return:
  EPOLLHUP support to more easily detect the death of a monitored
  process, and being able to inject fds when intercepting syscalls that
  expect an fd-opening side-effect (needed by both container folks and
  Chrome). The latter continued the refactoring of __scm_install_fd()
  started by Christoph, and in the process found and fixed a handful of
  bugs in various callers.

   - Improved selftest coverage, timeouts, and reporting

   - Add EPOLLHUP support for SECCOMP_RET_USER_NOTIF (Christian Brauner)

   - Refactor __scm_install_fd() into __receive_fd() and fix buggy
     callers

   - Introduce 'addfd' command for SECCOMP_RET_USER_NOTIF (Sargun
     Dhillon)"

* tag 'seccomp-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits)
  selftests/seccomp: Test SECCOMP_IOCTL_NOTIF_ADDFD
  seccomp: Introduce addfd ioctl to seccomp user notifier
  fs: Expand __receive_fd() to accept existing fd
  pidfd: Replace open-coded receive_fd()
  fs: Add receive_fd() wrapper for __receive_fd()
  fs: Move __scm_install_fd() to __receive_fd()
  net/scm: Regularize compat handling of scm_detach_fds()
  pidfd: Add missing sock updates for pidfd_getfd()
  net/compat: Add missing sock updates for SCM_RIGHTS
  selftests/seccomp: Check ENOSYS under tracing
  selftests/seccomp: Refactor to use fixture variants
  selftests/harness: Clean up kern-doc for fixtures
  seccomp: Use -1 marker for end of mode 1 syscall list
  seccomp: Fix ioctl number for SECCOMP_IOCTL_NOTIF_ID_VALID
  selftests/seccomp: Rename user_trap_syscall() to user_notif_syscall()
  selftests/seccomp: Make kcmp() less required
  seccomp: Use pr_fmt
  selftests/seccomp: Improve calibration loop
  selftests/seccomp: use 90s as timeout
  selftests/seccomp: Expand benchmark to per-filter measurements
  ...
2020-08-04 14:11:08 -07:00
Linus Torvalds
99ea1521a0 Remove uninitialized_var() macro for v5.9-rc1
- Clean up non-trivial uses of uninitialized_var()
 - Update documentation and checkpatch for uninitialized_var() removal
 - Treewide removal of uninitialized_var()
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl8oYLQWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJsfjEACvf0D3WL3H7sLHtZ2HeMwOgAzq
 il08t6vUscINQwiIIK3Be43ok3uQ1Q+bj8sr2gSYTwunV2IYHFferzgzhyMMno3o
 XBIGd1E+v1E4DGBOiRXJvacBivKrfvrdZ7AWiGlVBKfg2E0fL1aQbe9AYJ6eJSbp
 UGqkBkE207dugS5SQcwrlk1tWKUL089lhDAPd7iy/5RK76OsLRCJFzIerLHF2ZK2
 BwvA+NWXVQI6pNZ0aRtEtbbxwEU4X+2J/uaXH5kJDszMwRrgBT2qoedVu5LXFPi8
 +B84IzM2lii1HAFbrFlRyL/EMueVFzieN40EOB6O8wt60Y4iCy5wOUzAdZwFuSTI
 h0xT3JI8BWtpB3W+ryas9cl9GoOHHtPA8dShuV+Y+Q2bWe1Fs6kTl2Z4m4zKq56z
 63wQCdveFOkqiCLZb8s6FhnS11wKtAX4czvXRXaUPgdVQS1Ibyba851CRHIEY+9I
 AbtogoPN8FXzLsJn7pIxHR4ADz+eZ0dQ18f2hhQpP6/co65bYizNP5H3h+t9hGHG
 k3r2k8T+jpFPaddpZMvRvIVD8O2HvJZQTyY6Vvneuv6pnQWtr2DqPFn2YooRnzoa
 dbBMtpon+vYz6OWokC5QNWLqHWqvY9TmMfcVFUXE4AFse8vh4wJ8jJCNOFVp8On+
 drhmmImUr1YylrtVOw==
 =xHmk
 -----END PGP SIGNATURE-----

Merge tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull uninitialized_var() macro removal from Kees Cook:
 "This is long overdue, and has hidden too many bugs over the years. The
  series has several "by hand" fixes, and then a trivial treewide
  replacement.

   - Clean up non-trivial uses of uninitialized_var()

   - Update documentation and checkpatch for uninitialized_var() removal

   - Treewide removal of uninitialized_var()"

* tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  compiler: Remove uninitialized_var() macro
  treewide: Remove uninitialized_var() usage
  checkpatch: Remove awareness of uninitialized_var() macro
  mm/debug_vm_pgtable: Remove uninitialized_var() usage
  f2fs: Eliminate usage of uninitialized_var() macro
  media: sur40: Remove uninitialized_var() usage
  KVM: PPC: Book3S PR: Remove uninitialized_var() usage
  clk: spear: Remove uninitialized_var() usage
  clk: st: Remove uninitialized_var() usage
  spi: davinci: Remove uninitialized_var() usage
  ide: Remove uninitialized_var() usage
  rtlwifi: rtl8192cu: Remove uninitialized_var() usage
  b43: Remove uninitialized_var() usage
  drbd: Remove uninitialized_var() usage
  x86/mm/numa: Remove uninitialized_var() usage
  docs: deprecated.rst: Add uninitialized_var()
2020-08-04 13:49:43 -07:00
David S. Miller
ee895a30ef Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Flush the cleanup xtables worker to make sure destructors
   have completed, from Florian Westphal.

2) iifgroup is matching erroneously, also from Florian.

3) Add selftest for meta interface matching, from Florian Westphal.

4) Move nf_ct_offload_timeout() to header, from Roi Dayan.

5) Call nf_ct_offload_timeout() from flow_offload_add() to
   make sure garbage collection does not evict offloaded flow,
   from Roi Dayan.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04 13:32:39 -07:00
Stefano Brivio
7b53682c94 selftests: pmtu.sh: Add tests for UDP tunnels handled by Open vSwitch
The new tests check that IP and IPv6 packets exceeding the local PMTU
estimate, forwarded by an Open vSwitch instance from another node,
result in the correct route exceptions being created, and that
communication with end-to-end fragmentation, over GENEVE and VXLAN
Open vSwitch ports, is now possible as a result of PMTU discovery.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04 13:01:46 -07:00
Stefano Brivio
df40e39c0d selftests: pmtu.sh: Add tests for bridged UDP tunnels
The new tests check that IP and IPv6 packets exceeding the local PMTU
estimate, both locally generated and forwarded by a bridge from
another node, result in the correct route exceptions being created,
and that communication with end-to-end fragmentation over VXLAN and
GENEVE tunnels is now possible as a result of PMTU discovery.

Part of the existing setup functions aren't generic enough to simply
add a namespace and a bridge to the existing routing setup. This
rework is in progress and we can easily shrink this once more generic
topology functions are available.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04 13:01:45 -07:00
Jin Yao
c4735d9902 perf evsel: Don't set sample_regs_intr/sample_regs_user for dummy event
Since commit 0a892c1c94 ("perf record: Add dummy event during system wide synthesis"),
a dummy event is added to capture mmaps.

But if we run perf-record as,

 # perf record -e cycles:p -IXMM0 -a -- sleep 1
 Error:
 dummy:HG: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'

The issue is, if we enable the extended regs (-IXMM0), but the
pmu->capabilities is not set with PERF_PMU_CAP_EXTENDED_REGS, the kernel
will return -EOPNOTSUPP error.

See following code:

/* in kernel/events/core.c */
static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)

{
        ....
        if (!(pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS) &&
            has_extended_regs(event))
                ret = -EOPNOTSUPP;
        ....
}

For software dummy event, the PMU should not be set with
PERF_PMU_CAP_EXTENDED_REGS. But unfortunately now, the dummy
event has possibility to be set with PERF_REG_EXTENDED_MASK bit.

In evsel__config, /* tools/perf/util/evsel.c */

if (opts->sample_intr_regs) {
        attr->sample_regs_intr = opts->sample_intr_regs;
}

If we use -IXMM0, the attr>sample_regs_intr will be set with
PERF_REG_EXTENDED_MASK bit.

It doesn't make sense to set attr->sample_regs_intr for a
software dummy event.

This patch adds dummy event checking before setting
attr->sample_regs_intr and attr->sample_regs_user.

After:
  # ./perf record -e cycles:p -IXMM0 -a -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.413 MB perf.data (45 samples) ]

Committer notes:

Adrian said this when providing his Acked-by:

"
This is fine.  It will not break PT.

no_aux_samples is useful for evsels that have been added by the code rather
than requested by the user.  For old kernels PT adds sched_switch tracepoint
to track context switches (before the current context switch event was
added) and having auxiliary sample information unnecessarily uses up space
in the perf buffer.
"

Fixes: 0a892c1c94 ("perf record: Add dummy event during system wide synthesis")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200720010013.18238-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-04 09:04:31 -03:00
Alexey Budankov
1d078ccb33 perf record: Introduce --control fd:ctl-fd[,ack-fd] options
Introduce --control fd:ctl-fd[,ack-fd] options to pass open file
descriptors numbers from command line.

Extend perf-record.txt file with --control fd:ctl-fd[,ack-fd] options
description.

Document possible usage model introduced by --control fd:ctl-fd[,ack-fd]
options by providing example bash shell script.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/8dc01e1a-3a80-3f67-5385-4bc7112b0dd3@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-04 08:50:52 -03:00
Alexey Budankov
acce022394 perf record: Implement control commands handling
Implement handling of 'enable' and 'disable' control commands coming
from control file descriptor.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/f0fde590-1320-dca1-39ff-da3322704d3b@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-04 08:50:28 -03:00