New features:
. Allow configuring a 'perf ftrace' default --tracer (Taeung Song)
Infrastructure:
. Sync tools/arch/{powerpc,arm}/include/uapi/asm/kvm.h and
tools/arch/x86/include/asm/cpufeatures.h (Ingo Molnar)
. Add BPF program file system pinning APIs and respective
'perf test' entry (Joe Stringer)
. Make tools tree support 'make -s' (Josh Poimboeuf)
. Reference count maps in callchains, fixing SEGFAULT when
referencing maps after it is freed (Krister Johansen)
. Create for_each_event trace points iterator (Taeung Song)
. Do not consider an error not to have any perfconfig file
(Arnaldo Carvalho de Melo
. Propagate perf_config() errors (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJYkcuWAAoJENZQFvNTUqpAMDgQALp5+x7KkWikCBWe4t3CX0Rq
+VQcanFJ9eqRYsoBBJ7F4A+E+9AKMjzYTfoPyFv1S+W9cCFsEKeOuVsICtM8PpjB
UEUgQGJjAfRK/IcDeeQSzGqWMbiVgPv2fE5W/zPV2V+i25uBVHbZkOmPB7xcCUsY
/ORA5RT93uaHkAv9je8/bGPrQp5y6aDlKh6W3ePD4UwBnFmy8QMOx+24uc4t/uZM
G2cjTcz5HEXlUE0wGCO4C2T2mcEchGDVtBM1Uz0pKmE3eMB3ZluA5+XpHvmLQTQ5
dZQhm3mLnllNz4Zbg2yUkIq3HEaARav1db1rUC0P1ySvYventvGRd181dpugyWdu
3XuPPqb4QjHT7x4eYcXJY4HuwrLChbVKn3lFo8Mhfd6HpMxQHFspTwyIVxLYoq8I
LcZE3waOYYki5yaRTQiHtWypvn6RNBOLvfvU6DoINENHMJ82DgsqgXoIA8bp++YX
miO5NHYheWBi4DmBqh9oSKvq5hriKDS5M2QTyZEkzItzOE2lsq0tl2neLg4SQCNC
ajM3szUYje/cvSoB5K3r28C9vWwTLXvnOE0B6TpDbh3y7xqdNX59oDSNu4dnWwwv
fN8O7cJo2X5W/JjScCuMSBg+V+UgIl2JfKCxfd0AdXL/e1UsTz9dV6CtgMGtN664
YCxiQKnJeZLViXZfVmaM
=OCSZ
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-4.11-20170201' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Allow configuring a 'perf ftrace' default --tracer (Taeung Song)
Infrastructure changes:
- Sync tools/arch/{powerpc,arm}/include/uapi/asm/kvm.h and
tools/arch/x86/include/asm/cpufeatures.h (Ingo Molnar)
- Add BPF program file system pinning APIs and respective
'perf test' entry (Joe Stringer)
- Make tools tree support 'make -s' (Josh Poimboeuf)
- Reference count maps in callchains, fixing SEGFAULT when
referencing maps after it is freed (Krister Johansen)
- Create for_each_event trace points iterator (Taeung Song)
- Do not consider an error not to have any perfconfig file
(Arnaldo Carvalho de Melo
- Propagate perf_config() errors (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit:
8ee83b2ab3 ("perf/x86/intel/pt: Add support for PTWRITE and power event tracing")
forgot to add format strings to the PT driver. So one could enable these features
by setting corresponding bits in the event config, but not by their mnemonic names.
This patch adds the format strings.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Fixes: 8ee83b2ab3 ("perf/x86/intel/pt: Add support for PTWRITE...")
Link: http://lkml.kernel.org/r/20170127151644.8585-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Similar to for_each_subsystem and for_each_event in util/parse-events.c,
add new macro 'for_each_event' for easy iteration over the tracepoints
in order to be more compact and readable.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485862711-20216-2-git-send-email-treeze.taeung@gmail.com
[ Slight change to keep existing style for checking strcmp() return ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allow mounting of the BPF filesystem at /sys/fs/bpf.
Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-6-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
rm_rf() doesn't modify its path argument, and a future caller will pass
a string constant into it to delete.
Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-5-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a new API to pin a BPF object to the filesystem. The user can
specify the path within a BPF filesystem to pin the object.
Programs will be pinned under a subdirectory named the same as the
program, with each instance appearing as a numbered file under that
directory, and maps will be pinned under the path using the name of
the map as the file basename.
For example, with the directory '/sys/fs/bpf/foo' and a BPF object which
contains two instances of a program named 'bar', and a map named 'baz':
/sys/fs/bpf/foo/bar/0
/sys/fs/bpf/foo/bar/1
/sys/fs/bpf/foo/baz
Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-4-joe@ovn.org
[ Check snprintf >= for truncation, as snprintf(bf, size, ...) == size also means truncation ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a new API to pin a BPF map to the filesystem. The user can specify
the path full path within a BPF filesystem to pin the map.
Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-3-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add new APIs to pin a BPF program (or specific instances) to the
filesystem. The user can specify the path full path within a BPF
filesystem to pin the program.
bpf_program__pin_instance(prog, path, n) will pin the nth instance of
'prog' to the specified path.
bpf_program__pin(prog, path) will create the directory 'path' (if it
does not exist) and pin each instance within that directory. For
instance, path/0, path/1, path/2.
Committer notes:
- Add missing headers for mkdir()
- Check strdup() for failure
- Check snprintf >= size, not >, as == also means truncated, see 'man
snprintf', return value.
- Conditionally define BPF_FS_MAGIC, as it isn't in magic.h in older
systems and we're not yet having a tools/include/uapi/linux/magic.h
copy.
- Do not include linux/magic.h, not present in older distros.
Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-2-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If dso__load_kcore frees all of the existing maps, but one has already
been attached to a callchain cursor node, then we can get a SIGSEGV in
any function that happens to try to use this invalid cursor. Use the
existing map refcount mechanism to forestall cleanup of a map until the
cursor iterates past the node.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@kernel.org
Fixes: 84c2cafa28 ("perf tools: Reference count struct map")
Link: http://lkml.kernel.org/r/20170106062331.GB2707@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The following upstream headers were updated:
- The x86 cpufeatures.h file picked up a couple of new feature entries
- The PowerPC and ARM KVM headers picked up new features
None of which requires changes to perf tooling, so refresh the tooling copy.
Solves these build time warnings:
Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170130081131.GA8322@gmail.com
[ resync tools/arch/x86/include/asm/cpufeatures.h ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the AMD pieces from the generic Makefile so that
$ make arch/x86/events/amd/<file>.s
can work too. Otherwise you get:
$ make arch/x86/events/amd/ibs.s
scripts/Makefile.build:44: arch/x86/events/amd/Makefile: No such file or directory
make[1]: *** No rule to make target 'arch/x86/events/amd/Makefile'. Stop.
Makefile:1636: recipe for target 'arch/x86/events/amd/ibs.s' failed
make: *** [arch/x86/events/amd/ibs.s] Error 2
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20170126080819.417-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch updates the sysfs attributes for AMD Family17h processors. In
Family17h, the event bit position is changed for both the NorthBridge
and Last level cache counters.
The sysfs attributes are assigned based on the family and the type of
the counter.
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/617570ed3634e804991f95db62c3cf3856a9d2a7.1484598705.git.Janakarajan.Natarajan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch updates the AMD uncore driver to support AMD Family17h
processors. In Family17h, there are two extra last level cache counters.
The maximum available counters is increased and the number of counters
for each uncore type is now based on the family.
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/799f9c5be8963cc209d9169a08f4a2643b748dc7.1484598705.git.Janakarajan.Natarajan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch renames L2 counters to LLC counters. In AMD Family17h
processors, L3 cache counter is supported.
Since older families have at most L2 counters, last level cache (LLC)
indicates L2/L3 based on the family.
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/5d8cd8736d8d578354597a548e64ff16210c319b.1484598705.git.Janakarajan.Natarajan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
perf has additional overhead when monitoring the task which
frequently generates child tasks.
perf_init_event() is one of the hotspots for the additional overhead:
Currently, to get the PMU, it tries to search the type in pmu_idr at
first. But it is not always successful, especially for the widely used
PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE events. So it has to go to the
slow path which go through the whole PMUs list.
It will be a big performance issue, if the PMUs list is long (e.g. server
with many uncore boxes) and the task frequently generates child tasks.
The child event inherits its parent event. So the child event should
try its parent PMU first.
Here is some data from the overhead test on Broadwell server:
perf record -e $TEST_EVENTS -- ./loop.sh 50000
loop.sh
start=$(date +%s%N)
i=0
while [ "$i" -le "$1" ]
do
date > /dev/null
i=`expr $i + 1`
done
end=$(date +%s%N)
elapsed=`expr $end - $start`
Event# Original elapsed time Elapsed time with patch delta
1 196,573,192,397 189,162,029,998 -3.77%
2 257,567,753,013 241,620,788,683 -6.19%
4 398,730,726,971 370,518,938,714 -7.08%
8 824,983,761,120 740,702,489,329 -10.22%
16 1,883,411,923,498 1,672,027,508,355 -11.22%
... which shows a nice performance improvement.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1484745662-15928-2-git-send-email-kan.liang@intel.com
[ Tidied up the changelog and the code comment. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When new events are added to an active context, we go and reschedule
all cpu groups and all task groups in order to preserve the priority
(cpu pinned, task pinned, cpu flexible, task flexible), but in
reality we only need to reschedule groups of the same priority as
that of the events being added, and below.
This patch changes the behavior so that only groups that need to be
rescheduled are rescheduled.
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20170119164330.22887-3-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In the sched-in path, we first remove a CPU's flexible events in order to
give priority to the task's pinned events. However, this step can be safely
skipped if the task doesn't have its own pinned events.
This patch implements this skipping.
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20170119164330.22887-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
cpuctx->unique_pmu was originally introduced as a way to identify cpuctxs
with shared pmus in order to avoid visiting the same cpuctx more than once
in a for_each_pmu loop.
cpuctx->unique_pmu == cpuctx->pmu in non-software task contexts since they
have only one pmu per cpuctx. Since perf_pmu_sched_task() is only called in
hw contexts, this patch replaces cpuctx->unique_pmu by cpuctx->pmu in it.
The change above, together with the previous patch in this series, removed
the remaining uses of cpuctx->unique_pmu, so we remove it altogether.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Cc: Vince Weaver <vince@deater.net>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20170118192454.58008-3-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch follows from a conversation in CQM/CMT's last series about
speeding up the context switch for cgroup events:
https://patchwork.kernel.org/patch/9478617/
This is a low-hanging fruit optimization. It replaces the iteration over
the "pmus" list in cgroup switch by an iteration over a new list that
contains only cpuctxs with at least one cgroup event.
This is necessary because the number of PMUs have increased over the years
e.g modern x86 server systems have well above 50 PMUs.
The iteration over the full PMU list is unneccessary and can be costly in
heavy cache contention scenarios.
Below are some instrumentation measurements with 10, 50 and 90 percentiles
of the total cost of context switch before and after this optimization for
a simple array read/write microbenchark.
Contention
Level Nr events Before (us) After (us) Median
L2 L3 types (10%, 50%, 90%) (10%, 50%, 90% Speedup
--------------------------------------------------------------------------
Low Low 1 (1.72, 2.42, 5.85) (1.35, 1.64, 5.46) 29%
High Low 1 (2.08, 4.56, 19.8) (1720, 2.20, 13.7) 51%
High High 1 (2.86, 10.4, 12.7) (2.54, 4.32, 12.1) 58%
Low Low 2 (1.98, 3.20, 6.89) (1.68, 2.41, 8.89) 24%
High Low 2 (2.48, 5.28, 22.4) (2150, 3.69, 14.6) 30%
High High 2 (3.32, 8.09, 13.9) (2.80, 5.15, 13.7) 36%
where:
1 event type = cycles
2 event types = cycles,intel_cqm/llc_occupancy/
Contention L2 Low: workset < L2 cache size.
High: " >> L2 " " .
Contention L3 Low: workset of task on all sockets < L3 cache size.
High: " " " " " " >> L3 " " .
Median Speedup is (50%ile Before - 50%ile After) / 50%ile Before
Unsurprisingly, the benefits of this optimization decrease with the number
of cpuctxs with a cgroup events, yet, is never detrimental.
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Cc: Vince Weaver <vince@deater.net>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20170118192454.58008-2-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Andres reported that MMAP2 records for anonymous memory always have
their protection field 0.
Turns out, someone daft put the prot/flags generation code in the file
branch, leaving them unset for anonymous memory.
Reported-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Don Zickus <dzickus@redhat.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: anton@ozlabs.org
Cc: namhyung@kernel.org
Cc: stable@vger.kernel.org # v3.16+
Fixes: f972eb63b1 ("perf: Pass protection and flags bits through mmap2 interface")
Link: http://lkml.kernel.org/r/20170126221508.GF6536@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dmitry reported a KASAN use-after-free on event->group_leader.
It turns out there's a hole in perf_remove_from_context() due to
event_function_call() not calling its function when the task
associated with the event is already dead.
In this case the event will have been detached from the task, but the
grouping will have been retained, such that group operations might
still work properly while there are live child events etc.
This does however mean that we can miss a perf_group_detach() call
when the group decomposes, this in turn can then lead to
use-after-free.
Fix it by explicitly doing the group detach if its still required.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # v4.5+
Cc: syzkaller <syzkaller@googlegroups.com>
Fixes: 63b6da39bb ("perf: Fix perf_event_exit_task() race")
Link: http://lkml.kernel.org/r/20170126153955.GD6515@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I've seen this trigger twice now, where the i915_gem_object_to_ggtt()
call in intel_unpin_fb_obj() returns NULL, resulting in an oops
immediately afterwards as the (inlined) call to i915_vma_unpin_fence()
tries to dereference it.
It seems to be some race condition where the object is going away at
shutdown time, since both times happened when shutting down the X
server. The call chains were different:
- VT ioctl(KDSETMODE, KD_TEXT):
intel_cleanup_plane_fb+0x5b/0xa0 [i915]
drm_atomic_helper_cleanup_planes+0x6f/0x90 [drm_kms_helper]
intel_atomic_commit_tail+0x749/0xfe0 [i915]
intel_atomic_commit+0x3cb/0x4f0 [i915]
drm_atomic_commit+0x4b/0x50 [drm]
restore_fbdev_mode+0x14c/0x2a0 [drm_kms_helper]
drm_fb_helper_restore_fbdev_mode_unlocked+0x34/0x80 [drm_kms_helper]
drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper]
intel_fbdev_set_par+0x18/0x70 [i915]
fb_set_var+0x236/0x460
fbcon_blank+0x30f/0x350
do_unblank_screen+0xd2/0x1a0
vt_ioctl+0x507/0x12a0
tty_ioctl+0x355/0xc30
do_vfs_ioctl+0xa3/0x5e0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x13/0x94
- i915 unpin_work workqueue:
intel_unpin_work_fn+0x58/0x140 [i915]
process_one_work+0x1f1/0x480
worker_thread+0x48/0x4d0
kthread+0x101/0x140
and this patch purely papers over the issue by adding a NULL pointer
check and a WARN_ON_ONCE() to avoid the oops that would then generally
make the machine unresponsive.
Other callers of i915_gem_object_to_ggtt() seem to also check for the
returned pointer being NULL and warn about it, so this clearly has
happened before in other places.
[ Reported it originally to the i915 developers on Jan 8, applying the
ugly workaround on my own now after triggering the problem for the
second time with no feedback.
This is likely to be the same bug reported as
https://bugs.freedesktop.org/show_bug.cgi?id=98829https://bugs.freedesktop.org/show_bug.cgi?id=99134
which has a patch for the underlying problem, but it hasn't gotten to
me, so I'm applying the workaround. ]
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull two parisc fixes from Helge Deller:
"One fix to avoid usage of BITS_PER_LONG in user-space exported swab.h
header which breaks compiling qemu, and one trivial fix for printk
continuation in the parisc parport driver"
* 'parisc-4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Don't use BITS_PER_LONG in userspace-exported swab.h header
parisc, parport_gsc: Fixes for printk continuation lines
Pull i2c fixes from Wolfram Sang:
"Two I2C driver bugfixes.
The 'VLLS mode support' patch should have been entitled 'reconfigure
pinctrl after suspend' to make the bugfix more clear. Sorry, I missed
that, yet didn't want to rebase"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: imx-lpi2c: add VLLS mode support
i2c: i2c-cadence: Initialize configuration before probing devices
In swab.h the "#if BITS_PER_LONG > 32" breaks compiling userspace programs if
BITS_PER_LONG is #defined by userspace with the sizeof() compiler builtin.
Solve this problem by using __BITS_PER_LONG instead. Since we now
#include asm/bitsperlong.h avoid further potential userspace pollution
by moving the #define of SHIFT_PER_LONG to bitops.h which is not
exported to userspace.
This patch unbreaks compiling qemu on hppa/parisc.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org>
Stable patches:
- NFSv4.1: Fix a deadlock in layoutget
- NFSv4 must not bump sequence ids on NFS4ERR_MOVED errors
- NFSv4 Fix a regression with OPEN EXCLUSIVE4 mode
- Fix a memory leak when removing the SUNRPC module
Bugfixes:
- Fix a reference leak in _pnfs_return_layout
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYjM6UAAoJEGcL54qWCgDy+WoP/2NdwqzOa3xxn8piMhLh/TLJ
lxWfVmUZJ8UNVSVl45NDslt2V63P0gpITUiaVo27rPfjKV8pv8XTU7HkxCuXzj0m
uo11jEErsYxbXELNDL2y1fmsPTygYvg4MAV3NK0inKv/ciu6TTMddoga+QQXGJPz
QqQ7znTHyjDSTqZM1D5+f8EhbY7AIAzJbkgVr90WPPjsT6YtdlTQpMb1O9tlDtfj
oYidIOxRphqLOYTdoD8NW4x19dlgTNIYyCgOFomCbpUTFKKK5RFjVy+bpml3NxV1
oUldw8lhiUE8x/5J/CfbCMt1bPZ3yJeFWNghINOIF7cgGnvKEjw0uO2o8Lk6fMzi
yF2aNoZcNQnyoHHCjsqK8t5dB51abwxlzeElrFB1u73B56CcXB91xEP0GsX1vFt1
BIGCnfsOvtYPZxp/OX+B3AweAJ2DRuIvdze9cS2IeHBW9c5XdVdDmMK7CFeL9dc3
YEsWmaWNWtDt41erxCvQ0ezkr3lPHXOcjwS440LIiVpr38/rzjp81+Tf+i2D1GzB
XTsy9IHzaC2zLb0Vz5vOR3S10k4OiFtXT6ikoZLwgpTfI2gt146e0YHx3g90AiFh
QYBRNLJrjSYVedjv2GGFLG1WWsHbhbP6dpmT3BwNmhZTzjPoqZ76f5QFIQUya4jt
4Y2ksyhx7pOeOJy++6SN
=S5qH
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.10-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Stable patches:
- NFSv4.1: Fix a deadlock in layoutget
- NFSv4 must not bump sequence ids on NFS4ERR_MOVED errors
- NFSv4 Fix a regression with OPEN EXCLUSIVE4 mode
- Fix a memory leak when removing the SUNRPC module
Bugfixes:
- Fix a reference leak in _pnfs_return_layout"
* tag 'nfs-for-4.10-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
pNFS: Fix a reference leak in _pnfs_return_layout
nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"
SUNRPC: cleanup ida information when removing sunrpc module
NFSv4.0: always send mode in SETATTR after EXCLUSIVE4
nfs: Don't increment lock sequence ID after NFS4ERR_MOVED
NFSv4.1: Fix a deadlock in layoutget
Pull MD fixes from Shaohua Li:
"This fixes several corner cases for raid5 cache, which is merged into
this cycle"
* tag 'md/4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md/r5cache: disable write back for degraded array
md/r5cache: shift complex rmw from read path to write path
md/r5cache: flush data only stripes in r5l_recovery_log()
md/raid5: move comment of fetch_block to right location
md/r5cache: read data into orig_page for prexor of cached data
md/raid5-cache: delete meaningless code
is not currently handled
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYjHDtAAoJEGvWsS0AyF7x8dAP/1WL/oDMrymUMymL94Y/79oh
dzMcfsrDCsN7alxs5remTYUtcmkc0LsoBqhzkgA42tj80mfXDSUlUuFtG7iBrnTH
JcMb4VUlzdxHQjmyAGV35JTzVS9dlpyh1inP7OXqlJ+dcD4IqaoLj6eYdE4YQjCE
ERsQw+TR0sIyaA9GnfZ1iTuWp1CKHv4128uMqJrMT3yPb0mU6XuAnkgwfqOdLdz9
YlSABW/vRICGlYdymSURnIOdeuEevspZhqXFJOe0Oq3ML6Worox5dpXY924llHrp
O3synTDjunWFDFDB2pugefdbG5Wbq4D4egkhikrZ4r99K1/GRaACzfxPeZB5cGGI
U4n4fdxblkrSiWBi57CuyBSIarDvMpyzbypo/z/RlFjw5abv7bmTLbj85KL1g/C3
Y7gcE8B3IPDS3KlgOp+N/A3DFQc4xHqEKE1kgAoe2RBYhybQQLzvtKF5aI9zI2xr
8at9i5oWjTAIDXaNR2q9xIuvk1aR3QNavS2y/Gzd+UlhP0M3iLXugZca96Lva07r
de4m2ubqrZlvAwQtyJRuV6khLnrT3Cqc6RNWqLXfJKg0awb0or9/wilQPB0AnPkX
OrvF0m6WJvcqecdzRhwahQhJhtnprKOOrFzlgO434/OUgI0yD7apVqGeZP0p+If8
YDEE3Avx4M6u9i1UcOyz
=syme
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Fix kernel panic on ACPI-based systems where CPU capacity description
is not currently handled"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: skip register_cpufreq_notifier on ACPI-based systems
- Fix for unaligned access emulation corner case
- fix for udelay loop inline asm regression
- Fix irq affinity finally for AXS103 board [Yuriy]
- Final fixes for setting IO-coherency sanely in SMP
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYi8faAAoJEGnX8d3iisJeU5wP/A1OM9S6TOCe4Ikguhp8UDgS
UaGXboVpAAJNr6B+NfzvUbxi3VybgqooRM6tEU33A4eDzmfCTKBxMbvvE77rOuPC
CC/8wxQ87wDqU+6GGL5DoTTc4rrxnNoEK6MOh2/Rv76idEi8Z2eo4INfm6FkBjGY
+2WkuSpegBIxKrvFmBK9JC7cYaR3KBGSxW3rjLaK6xn0yt+LT1LQKJ/4OFLke7zy
HWNVVnXnyJhn6z7v+Bum3hjnA8bhnDEVKAM0XgD/8TtjkhS6aod4HS2mjvcKnGc/
vyAk3B4fgq0EKFXX0IyHNnWI92gCnACWs8sHAQ/UBB4GHf0z1ScoihJeE4VAqRS4
kJ6SKAIHEtwY1nQV5GgVE2ddMgTDEXwAKCna99ejoUMMmQZFaRy80YuuSMLIjEuz
H16oPkSzJDsR6Z9ocoC5mATWnHxsFZsCAX72u88X+ylaJmBziF2VmjaKRIXB8Psn
YVz02U1YlONJTesEI7lnLD3fwx6pzu/XJjNTe4saiJFAoxaGuPbjRg6sJq3URDj6
3CJ89OFRbgx78jbk7QpYlc6m9SdTM2F5T7ICi6yxElok1dn+i+UQhnBXCinIvTcL
5w9IKA/9qeqjyT76vy1cPY2KlSDGj0kqjI+63IzpGtTScRuyflA5S1CvqyNeUbQq
Vrtw/IRz5pvS7iaCDwuc
=u+Jp
-----END PGP SIGNATURE-----
Merge tag 'arc-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"Hopefully last set of changes for ARC for 4.10:
- fix for unaligned access emulation corner case
- fix for udelay loop inline asm regression
- fix irq affinity finally for AXS103 board [Yuriy]
- final fixes for setting IO-coherency sanely in SMP"
* tag 'arc-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: [arcompact] handle unaligned access delay slot corner case
ARCv2: smp-boot: wake_flag polling by non-Masters needs to be uncached
ARC: smp-boot: Decouple Non masters waiting API from jump to entry point
ARCv2: MCIP: update the BCR per current changes
ARC: udelay: fix inline assembler by adding LP_COUNT to clobber list
ARCv2: MCIP: Deprecate setting of affinity in Device Tree
Pull networking fixes from David Miller:
1) GTP fixes from Andreas Schultz (missing genl module alias, clear IP
DF on transmit).
2) Netfilter needs to reflect the fwmark when sending resets, from Pau
Espin Pedrol.
3) nftable dump OOPS fix from Liping Zhang.
4) Fix erroneous setting of VIRTIO_NET_HDR_F_DATA_VALID on transmit,
from Rolf Neugebauer.
5) Fix build error of ipt_CLUSTERIP when procfs is disabled, from Arnd
Bergmann.
6) Fix regression in handling of NETIF_F_SG in harmonize_features(),
from Eric Dumazet.
7) Fix RTNL deadlock wrt. lwtunnel module loading, from David Ahern.
8) tcp_fastopen_create_child() needs to setup tp->max_window, from
Alexey Kodanev.
9) Missing kmemdup() failure check in ipv6 segment routing code, from
Eric Dumazet.
10) Don't execute unix_bind() under the bindlock, otherwise we deadlock
with splice. From WANG Cong.
11) ip6_tnl_parse_tlv_enc_lim() potentially reallocates the skb buffer,
therefore callers must reload cached header pointers into that skb.
Fix from Eric Dumazet.
12) Fix various bugs in legacy IRQ fallback handling in alx driver, from
Tobias Regnery.
13) Do not allow lwtunnel drivers to be unloaded while they are
referenced by active instances, from Robert Shearman.
14) Fix truncated PHY LED trigger names, from Geert Uytterhoeven.
15) Fix a few regressions from virtio_net XDP support, from John
Fastabend and Jakub Kicinski.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (102 commits)
ISDN: eicon: silence misleading array-bounds warning
net: phy: micrel: add support for KSZ8795
gtp: fix cross netns recv on gtp socket
gtp: clear DF bit on GTP packet tx
gtp: add genl family modules alias
tcp: don't annotate mark on control socket from tcp_v6_send_response()
ravb: unmap descriptors when freeing rings
virtio_net: reject XDP programs using header adjustment
virtio_net: use dev_kfree_skb for small buffer XDP receive
r8152: check rx after napi is enabled
r8152: re-schedule napi for tx
r8152: avoid start_xmit to schedule napi when napi is disabled
r8152: avoid start_xmit to call napi_schedule during autosuspend
net: dsa: Bring back device detaching in dsa_slave_suspend()
net: phy: leds: Fix truncated LED trigger names
net: phy: leds: Break dependency of phy.h on phy_led_triggers.h
net: phy: leds: Clear phy_num_led_triggers on failure to avoid crash
net-next: ethernet: mediatek: change the compatible string
Documentation: devicetree: change the mediatek ethernet compatible string
bnxt_en: Fix RTNL lock usage on bnxt_get_port_module_status().
...
- Fix race conditions in the CoW code
- Fix some incorrect input validation checks
- Avoid crashing fs by running out of space when freeing inodes
- Fix toctou race wrt whether or not an inode has an attr
- Fix build error on arm
- Fix page refcount corruption when readahead fails
- Don't corrupt userspace in the bmap ioctl
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCgAGBQJYi4RJAAoJEPh/dxk0SrTrsU4QAIZBUUSpvpwggyfbTcOb6QWb
F/vBoj50f+cxZB2jxLrch4aRsvlhW6IkRnsKiciG4cQaDbjPbjhSMH4JEUUqrPvG
TRTcBvT6rEbxnB3adIH2DVrDAaEEVpRSkPaV/vLjYEfJp8Nyv8yXb6U8zH//NeZQ
Pwrwe0RX0bJRyAEBIRwBnTayMP6xIccCE9Ml7+sMG/UVnb0Fa8t5zg/9igfd0MrB
xf3MdkW0fpFCcp1Bbby8cnDmTjjxEtB6OApL82UnSZ7l2/U5AHhiA0NgHreYuzt8
47ezqEQfk+IWK5LY1c6V/vARKhVvM738jS2dG1tsFhnbTbq9yXA2yiCMdA+sKB7+
wlRIuTq7tuhN4Lk/9eheXHR4xHKDbOKY+zWEWi/AlFRaWmld0otMykVC6wbp6soo
1gYgbaCjJcJcResKYAdby92jqvIRONqknpUF2L0jOiGIPgz8rmjA6BIvymjaXEuO
4MLfSjeVP4Ip2tDcaa0R3dSQ40lP778UQNiuqcKb1WODx1AyljJB+gemK0jE8kwN
OEY7IBSs+wP/UBYN+XbYhoIGlJ4ckwyhIctl4bMvVOduQ40uASlyQS6hmqng5Df/
NIFd+fCwuBCa45mwUJ2LPTzx3WndMyLv30z/ladtshV+WlUbu60yTzT4bIiQDcpZ
CYALhDjBiCHzrs6rIhxW
=Co7t
-----END PGP SIGNATURE-----
Merge tag 'xfs-for-linus-4.10-rc6-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs uodates from Darrick Wong:
"I have some more fixes this week: better input validation, corruption
avoidance, build fixes, memory leak fixes, and a couple from Christoph
to avoid an ENOSPC failure.
Summary:
- Fix race conditions in the CoW code
- Fix some incorrect input validation checks
- Avoid crashing fs by running out of space when freeing inodes
- Fix toctou race wrt whether or not an inode has an attr
- Fix build error on arm
- Fix page refcount corruption when readahead fails
- Don't corrupt userspace in the bmap ioctl"
* tag 'xfs-for-linus-4.10-rc6-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: prevent quotacheck from overloading inode lru
xfs: fix bmv_count confusion w/ shared extents
xfs: clear _XBF_PAGES from buffers when readahead page
xfs: extsize hints are not unlikely in xfs_bmap_btalloc
xfs: remove racy hasattr check from attr ops
xfs: use per-AG reservations for the finobt
xfs: only update mount/resv fields on success in __xfs_ag_resv_init
xfs: verify dirblocklog correctly
xfs: fix COW writeback race
Pull btrfs updates from Chris Mason:
"Some fixes that we've collected from the list.
We still have one more pending to nail down a regression in lzo
compression, but I wanted to get this batch out the door"
* 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: remove ->{get, set}_acl() from btrfs_dir_ro_inode_operations
Btrfs: disable xattr operations on subvolume directories
Btrfs: remove old tree_root case in btrfs_read_locked_inode()
Btrfs: fix truncate down when no_holes feature is enabled
Btrfs: Fix deadlock between direct IO and fast fsync
btrfs: fix false enospc error when truncating heavily reflinked file
Pull block fixes from Jens Axboe:
"A set of fixes for this series. This contains:
- Set of fixes for the nvme target code
- A revert of patch from this merge window, causing a regression with
WRITE_SAME on iSCSI targets at least.
- A fix for a use-after-free in the new O_DIRECT bdev code.
- Two fixes for the xen-blkfront driver"
* 'for-linus' of git://git.kernel.dk/linux-block:
Revert "sd: remove __data_len hack for WRITE SAME"
nvme-fc: use blk_rq_nr_phys_segments
nvmet-rdma: Fix missing dma sync to nvme data structures
nvmet: Call fatal_error from keep-alive timout expiration
nvmet: cancel fatal error and flush async work before free controller
nvmet: delete controllers deletion upon subsystem release
nvmet_fc: correct logic in disconnect queue LS handling
block: fix use after free in __blkdev_direct_IO
xen-blkfront: correct maximum segment accounting
xen-blkfront: feature flags handling adjustments
- Series of iw_cxgb4 fixes to make it work with the drain cq API
- One or two patches each to: srp, iser, cxgb3, vmw_pvrdma, umem, rxe,
and ipoib
- One big series (13 patches) for the new qedr driver
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYi6KyAAoJELgmozMOVy/d9wgP+gN1i3pEvXN+yy8JmQb/kYDT
9k74tbrPY/E3FnxqfWhqIWWSU427/+wXq7phqFxc/w9BNB3by7Ivud1FuHpBWfx5
PoEhQj+TA3gQEHk259+Jdz8OHLMrKEME3zg1tuDUM9vq+e9hfZWyRVXcTFuOZTKg
YoKBpAE4AmSyKCcogn0FNHkm7UxlSZGlIfiVA61I1vPYjV0Z/ePkiXg51Y3RNOUy
x8StD+RVfWHWJTZ7ZKASC26Mm9tLzQxENQ0kJXg24oG0CaAO90iO3oFUM86z5NP8
gFyvdCpsgDSByRtoLhPnUvSmoTTzcLPa65YneINu17MeKiKjZaVmZipwr54pNTw0
++mqnfuEYhOUzjN1naDaK3DihEAj7CoW9Gdx84s9hOggOyURFUrV7DqwJSEBAz02
9VxBWpKsR7NKF7xVdy/oTgoWpO8Nr7DuYyHklaxS6eORci7CD3lqs71uf1XJg48t
b9SfncPxHReC4L6Y9sAUZsseaLsSCy/3bestB5wTqL6iRxjp25fJ7EvK9VLW/gXl
FcRU5pzph1pKDAasb/M8CjUnS4f4LveenALKAcNViuZDh8WEimM/Jlfg5vHbUAr8
rc0z9s/0cgU9AL/ZPaWQb57ZivaS0pXXa1z+ykKvSvj7nYRAX2tsYLzaX8Im2ang
8Rqqy93AdiO4Zq8ypnCq
=d5rm
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"Second round of -rc fixes for 4.10.
This -rc cycle has been slow for the rdma subsystem. I had already
sent you the first batch before the Holiday break. After that, we kept
only getting a few here or there. Up until this week, when I got a
drop of 13 to one driver (qedr). So, here's the -rc patches I have. I
currently have none held in reserve, so unless something new comes in,
this is it until the next merge window opens.
Summary:
- series of iw_cxgb4 fixes to make it work with the drain cq API
- one or two patches each to: srp, iser, cxgb3, vmw_pvrdma, umem,
rxe, and ipoib
- one big series (13 patches) for the new qedr driver"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (27 commits)
RDMA/cma: Fix unknown symbol when CONFIG_IPV6 is not enabled
IB/rxe: Prevent from completer to operate on non valid QP
IB/rxe: Fix rxe dev insertion to rxe_dev_list
IB/umem: Release pid in error and ODP flow
RDMA/qedr: Dispatch port active event from qedr_add
RDMA/qedr: Fix and simplify memory leak in PD alloc
RDMA/qedr: Fix RDMA CM loopback
RDMA/qedr: Fix formatting
RDMA/qedr: Mark three functions as static
RDMA/qedr: Don't reset QP when queues aren't flushed
RDMA/qedr: Don't spam dmesg if QP is in error state
RDMA/qedr: Remove CQ spinlock from CM completion handlers
RDMA/qedr: Return max inline data in QP query result
RDMA/qedr: Return success when not changing QP state
RDMA/qedr: Add uapi header qedr-abi.h
RDMA/qedr: Fix MTU returned from QP query
RDMA/core: Add the function ib_mtu_int_to_enum
IB/vmw_pvrdma: Fix incorrect cleanup on pvrdma_pci_probe error path
IB/vmw_pvrdma: Don't leak info from alloc_ucontext
IB/cxgb3: fix misspelling in header guard
...
Pull s390 fixes from Martin Schwidefsky:
"Another two bug fixes:
- ptrace partial write information leak
- a guest page hinting regression introduced with v4.6"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: Fix cmma unused transfer from pgste into pte
s390/ptrace: Preserve previous registers for short regset write
Pull swiotlb fix from Konrad Rzeszutek Wilk:
"An ARM fix in the Xen SWIOTLB - mainly the translation of physical to
bus addresses was done just a tad too late"
* 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
swiotlb-xen: update dev_addr after swapping pages
- mdev IOMMU groups are not yet compatible with the powerpc SPAPR
IOMMU backend, detect and fail group attach (Greg Kurz)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJYi4JdAAoJECObm247sIsiO1MQAIP4b8wfzw69AXhRVwmucwOr
YAH5qaFdOcNbcun7zSVEgNGSen7GUWxCcbia0TRavNHkT7qeJh7NDJ+WqqOHrYmQ
ct8r6iOBQ/g150n9uLOs+WboKuFPOHB4+Bm4P4RaA+Amfs2mSmGDpreU/lXR00iD
t2PD9hsGWgTDei2dFEtG+eqOLyS9jOuQbTuvpr1UnRT4ogNvSJv7X5kGCrLaBmtN
/X89oogp55hSBErlaNiEOTTE57zeXKSqE3PwslDRDkP8WXvhgG+OQzCUZkRpXLu6
Uy3XD5muIIF98ay173N9WjuswNla7AQgCny5kaPuZ1qFVzWN1ZYCOQeb0bgux0BG
Y6EPJDqvaHVVAs0xflr2khnSVAdsVZbohOpCCv3IDBLvt9C8cEUZcg68SHzURli+
haHYPf4EaF9HjjcKd6z89M3IMQaIlTjTBxorBmu0EIJXPCv6u9KKYDALDYDStoU4
OLg/05P7dmjIvmHvUsqkLb0fzQiZiyczTFeRm2O8pdZp/EQZqzyaGiLujajNoL62
tYR2GZcRwxt6Jburz043X1NoJgcSp31GR+NGsn9fL4ivBEeWQmQZbfjRh1pQgGVg
eua+APa+pws7VrYfocEZrQKY0zvCA67+DONeYJasFPGBqv7Hz9yD8sGyz6Y6B/5t
AY7FtirzOObp4TvFLPUS
=h4To
-----END PGP SIGNATURE-----
Merge tag 'vfio-v4.10-rc6' of git://github.com/awilliam/linux-vfio
Pull VFIO fix from Alex Williamson:
"mdev IOMMU groups are not yet compatible with the powerpc SPAPR IOMMU
backend, detect and fail group attach (Greg Kurz)"
* tag 'vfio-v4.10-rc6' of git://github.com/awilliam/linux-vfio:
vfio/spapr: fail tce_iommu_attach_group() when iommu_data is null
If IPV6 has not been enabled in the underlying kernel, we must avoid
calling IPV6 procedures in rdma_cm.ko.
This requires using "IS_ENABLED(CONFIG_IPV6)" in "if" statements
surrounding any code which calls external IPV6 procedures.
In the instance fixed here, procedure cma_bind_addr() called
ipv6_addr_type() -- which resulted in calling external procedure
__ipv6_addr_type().
Fixes: 6c26a77124 ("RDMA/cma: fix IPv6 address resolution")
Cc: <stable@vger.kernel.org> # v4.2+
Cc: Spencer Baugh <sbaugh@catern.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Konrad writes:
Please pull in your 'for-linus' branch two little fixes for Xen
block front:
One fix is for handling the XEN_PAGE_SIZE != PAGE_SIZE (4KB vs 64KB
on ARM for example) mishandling while the other is fixing
the accounting for the configuration changes.
After emulating an unaligned access in delay slot of a branch, we
pretend as the delay slot never happened - so return back to actual
branch target (or next PC if branch was not taken).
Curently we did this by handling STATUS32.DE, we also need to clear the
BTA.T bit, which is disregarded when returning from original misaligned
exception, but could cause weirdness if it took the interrupt return
path (in case interrupt was acive too)
One ARC700 customer ran into this when enabling unaligned access fixup
for kernel mode accesses as well
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYixa5AAoJEAhfPr2O5OEVxEIP/RdOWmeLxBW0AT8sRlvlw43K
ecFIq0WTIGr6w7VJPm2uHIcw9sdDnwQgozbp5di4KkDLH13aNoJYjMknpl13oPLM
KIBUvIAz7WG+/k3Iq45RLZ7eovrWzGSOTRGHOAitidDBmuFofkLslfzlJKhnNacB
PhSgzQsjK0GrBj0z8Ud/dCSWSR1EU67Uq+pdr+eFGJpMtEicQipnj36BkDMECW02
12wGht0NW7eDMILrNXCwfBYqsFLKfgJqhypoJLci1rnnrGtuMVmlphZJSfPC6Riz
6D6mbgNaYcSAQq8VsHW8L/T5EMio/TNCkapY3LjeUOp/qxTk486/sYWg22fN0qvg
Y1lCU1s5jY0Wu9TBdb6ty/dQ52f7gG9SGjRupRNYCYIVJI0V7UQdWxIrr+uqAkH/
Ha/n7q7oTT6RxjmhHB+qAOspeWI31sGzRBhpDve9R+6MQa366aek14SpuG+4OdUN
7As7Lih9sagi90tB+J2B9PTeuZnr4XaZB3Z+t2vaBb19NG4/vpHYSuC0ymp0mzLh
9hnxtfZCsEOYLSRTWRz5PJmAK2t5j26C5Pb3qeRA40rNbKdSyTj4ntJ/VLUaiFg5
m7kJ222GvdUTeuBLhCvzkjk1sCt1YRvaqQkmLrzpynJnzojuS4SXeWWmdygQIZwq
cQC5lcwMFM/x1MmzBAWI
=r51U
-----END PGP SIGNATURE-----
Merge tag 'media/v4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- fix a regression on tvp5150 causing failures at input selection and
image glitches
- CEC was moved out of staging for v4.10. Fix some bugs on it while not
too late
- fix a regression on pctv452e caused by VM stack changes
- fix suspend issued with smiapp
- fix a regression on cobalt driver
- fix some warnings and Kconfig issues with some random configs.
* tag 'media/v4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] s5k4ecgx: select CRC32 helper
[media] dvb: avoid warning in dvb_net
[media] v4l: tvp5150: Don't override output pinmuxing at stream on/off time
[media] v4l: tvp5150: Fix comment regarding output pin muxing
[media] v4l: tvp5150: Reset device at probe time, not in get/set format handlers
[media] pctv452e: move buffer to heap, no mutex
[media] media/cobalt: use pci_irq_allocate_vectors
[media] cec: fix race between configuring and unconfiguring
[media] cec: move cec_report_phys_addr into cec_config_thread_func
[media] cec: replace cec_report_features by cec_fill_msg_report_features
[media] cec: update log_addr[] before finishing configuration
[media] cec: CEC_MSG_GIVE_FEATURES should abort for CEC version < 2
[media] cec: when canceling a message, don't overwrite old status info
[media] cec: fix report_current_latency
[media] smiapp: Make suspend and resume functions __maybe_unused
[media] smiapp: Implement power-on and power-off sequences without runtime PM
Pull thermal management fix from Zhang Rui:
"A single revert from a recently introduced problem.
Specifics:
Commit 7611fb6806 ("thermal: thermal_hwmon: Convert to
hwmon_device_register_with_info()"), which was introduced in 4.10-rc5,
uses new hwmon API. But this breaks some soc thermal driver because
the new hwmon API has a strict rule for the hwmon device name. Revert
the offending commit as a quick solution for 4.10"
* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
Revert "thermal: thermal_hwmon: Convert to hwmon_device_register_with_info()"
Quotacheck runs at mount time in situations where quota accounting must
be recalculated. In doing so, it uses bulkstat to visit every inode in
the filesystem. Historically, every inode processed during quotacheck
was released and immediately tagged for reclaim because quotacheck runs
before the superblock is marked active by the VFS. In other words,
the final iput() lead to an immediate ->destroy_inode() call, which
allowed the XFS background reclaim worker to start reclaiming inodes.
Commit 17c12bcd3 ("xfs: when replaying bmap operations, don't let
unlinked inodes get reaped") marks the XFS superblock active sooner as
part of the mount process to support caching inodes processed during log
recovery. This occurs before quotacheck and thus means all inodes
processed by quotacheck are inserted to the LRU on release. The
s_umount lock is held until the mount has completed and thus prevents
the shrinkers from operating on the sb. This means that quotacheck can
excessively populate the inode LRU and lead to OOM conditions on systems
without sufficient RAM.
Update the quotacheck bulkstat handler to set XFS_IGET_DONTCACHE on
inodes processed by quotacheck. This causes ->drop_inode() to return 1
and in turn causes iput_final() to evict the inode. This preserves the
original quotacheck behavior and prevents it from overloading the LRU
and running out of memory.
CC: stable@vger.kernel.org # v4.9
Reported-by: Martin Svec <martin.svec@zoner.cz>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
With some gcc versions, we get a warning about the eicon driver,
and that currently shows up as the only remaining warning in one
of the build bots:
In file included from ../drivers/isdn/hardware/eicon/message.c:30:0:
eicon/message.c: In function 'mixer_notify_update':
eicon/platform.h:333:18: warning: array subscript is above array bounds [-Warray-bounds]
The code is easily changed to open-code the unusual PUT_WORD() line
causing this to avoid the warning.
Cc: stable@vger.kernel.org
Link: http://arm-soc.lixom.net/buildlogs/stable-rc/v4.4.45/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>