Currently the error from arch_uprobe_post_xol() is silently ignored.
This doesn't look good and this can lead to the hard-to-debug problems.
1. Change handle_singlestep() to loudly complain and send SIGILL.
Note: this only affects x86, ppc/arm can't fail.
2. Change arch_uprobe_post_xol() to call arch_uprobe_abort_xol() and
avoid TF games if it is going to return an error.
This can help to to analyze the problem, if nothing else we should
not report ->ip = xol_slot in the core-file.
Note: this means that handle_riprel_post_xol() can be called twice,
but this is fine because it is idempotent.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
arch_uprobe_analyze_insn() calls handle_riprel_insn() at the start,
but only "0xff" and "default" cases need the UPROBE_FIX_RIP_ logic.
Move the callsite into "default" case and change the "0xff" case to
fall-through.
We are going to add the various hooks to handle the rip-relative
jmp/call instructions (and more), we need this change to enforce the
fact that the new code can not conflict with is_riprel_insn() logic
which, after this change, can only be used by default_xol_ops.
Note: arch_uprobe_abort_xol() still calls handle_riprel_post_xol()
directly. This is fine unless another _xol_ops we may add later will
need to reuse "UPROBE_FIX_RIP_AX|UPROBE_FIX_RIP_CX" bits in ->fixup.
In this case we can add uprobe_xol_ops->abort() hook, which (perhaps)
we will need anyway in the long term.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Introduce arch_uprobe->ops pointing to the "struct uprobe_xol_ops",
move the current UPROBE_FIX_{RIP*,IP,CALL} code into the default
set of methods and change arch_uprobe_pre/post_xol() accordingly.
This way we can add the new uprobe_xol_ops's to handle the insns
which need the special processing (rip-relative jmp/call at least).
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
No functional changes. Preparation to simplify the review of the next
change. Just reorder the code in arch_uprobe_pre/post_xol() functions
so that UPROBE_FIX_{RIP_*,IP,CALL} logic goes to the end.
Also change arch_uprobe_pre_xol() to use utask instead of autask, to
make the code more symmetrical with arch_uprobe_post_xol().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cosmetic. Move pre_xol_rip_insn() and handle_riprel_post_xol() up to
the closely related handle_riprel_insn(). This way it is simpler to
read and understand this code, and this lessens the number of ifdef's.
While at it, update the comment in handle_riprel_post_xol() as Jim
suggested.
TODO: rename them somehow to make the naming consistent.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Kill the "mm->context.ia32_compat" check in handle_riprel_insn(), if
it is true insn_rip_relative() must return false. validate_insn_bits()
passed "ia32_compat" as !x86_64 to insn_init(), and insn_rip_relative()
checks insn->x86_64.
Also, remove the no longer needed "struct mm_struct *mm" argument and
the unnecessary "return" at the end.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
No functional changes, preparation.
Shift the code from prepare_fixups() to arch_uprobe_analyze_insn()
with the following modifications:
- Do not call insn_get_opcode() again, it was already called
by validate_insn_bits().
- Move "case 0xea" up. This way "case 0xff" can fall through
to default case.
- change "case 0xff" to use the nested "switch (MODRM_REG)",
this way the code looks a bit simpler.
- Make the comments look consistent.
While at it, kill the initialization of rip_rela_target_address and
->fixups, we can rely on kzalloc(). We will add the new members into
arch_uprobe, it would be better to assume that everything is zero by
default.
TODO: cleanup/fix the mess in validate_insn_bits() paths:
- validate_insn_64bits() and validate_insn_32bits() should be
unified.
- "ifdef" is not used consistently; if good_insns_64 depends
on CONFIG_X86_64, then probably good_insns_32 should depend
on CONFIG_X86_32/EMULATION
- the usage of mm->context.ia32_compat looks wrong if the task
is TIF_X32.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
UPROBE_COPY_INSN, UPROBE_SKIP_SSTEP, and uprobe->flags must die. This
patch kills UPROBE_SKIP_SSTEP. I never understood why it was added;
not only it doesn't help, it harms.
It can only help to avoid arch_uprobe_skip_sstep() if it was already
called before and failed. But this is ugly, if we want to know whether
we can emulate this instruction or not we should do this analysis in
arch_uprobe_analyze_insn(), not when we hit this probe for the first
time.
And in fact this logic is simply wrong. arch_uprobe_skip_sstep() can
fail or not depending on the task/register state, if this insn can be
emulated but, say, put_user() fails we need to xol it this time, but
this doesn't mean we shouldn't try to emulate it when this or another
thread hits this bp next time.
And this is the actual reason for this change. We need to emulate the
"call" insn, but push(return-address) can obviously fail.
Per-arch notes:
x86: __skip_sstep() can only emulate "rep;nop". With this
change it will be called every time and most probably
for no reason.
This will be fixed by the next changes. We need to
change this suboptimal code anyway.
arm: Should not be affected. It has its own "bool simulate"
flag checked in arch_uprobe_skip_sstep().
ppc: Looks like, it can emulate almost everything. Does it
actually need to record the fact that emulate_step()
failed? Hopefully not. But if yes, it can add the ppc-
specific flag into arch_uprobe.
TODO: rename arch_uprobe_skip_sstep() to arch_uprobe_emulate_insn(),
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: David A. Long <dave.long@linaro.org>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
User visible:
. Add --percentage option to control absolute/relative percentage output (Namhyung Kim)
Developer stuff:
. Add --list-cmds to 'kmem', 'mem', 'lock' and 'sched', for use by completion scripts (Ramkumar Ramachandra)
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTTsj7AAoJEPZqUSBWB3s98OEP/AzBJviyJSRqf/DNxMZpnuyT
E1Pd9R8IUT50sWGklEYZcjeCG8JPF46sHcuf5v/tXCwni4u0muJ2imcOJzBKf4it
XVpXA+E7bNt/oAr+8RVcZ0mhusX2DqafCZOVRTnuUv5y5JG7cWd9fC8qW2Hu6oYe
SOfjNRpwi9tV6aCkJUMbJT6SaW+rMGSxHLxv77KK/uP50Ekd/H06TyVspsJRfFeB
rckEIm8p3svQWX7EAHs09p7BxAarTbrK0TWzgfL3YEnCee/y0y9xxN32Pdc8fGj5
bKx6uRe5HWDcZZZvDIIDQeZLECwBu4xJvQPeRE9AeDs8NgXyvnNvljED+RSXcQ/F
zxnxYmviZSz/15B8fQ3TWIv+fd1rsc7hnbnmY1U+vPWdRgsc0spTFKZiKtDkLIBq
jZI+vwhgznDCjajBdKTAVQj6lErupP9ML6HIRcQPWu1YHoDbPiM+ZZ0xskIcsOuE
AfMAu5reYXW4bPGIA80AEEKLRjTjxcYlNadyQOHYZO/fE+SPs7iW+c/sm/+boN+r
dWX8NeK6Nfz45wunBIxvXaCM/MnBvTuH657fZQlKuf43radnrduZRjf/Q+yDUEfQ
VovatgZlK7PRJiIfBUfYfG+ov1cr0F9qJKBYyC8fIrNtWNsvfIwFd3ubkOQBW/qG
Q0PYdJPNL0furiI0VWO0
=cjaT
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core
Pull perf/core improvements and fixes from Jiri Olsa:
User visible changes:
* Add --percentage option to control absolute/relative percentage output (Namhyung Kim)
Plumbing changes:
* Add --list-cmds to 'kmem', 'mem', 'lock' and 'sched', for use by completion scripts (Ramkumar Ramachandra)
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add hist.percentage option for setting default value of the
symbol_conf.filter_relative. It affects the output of various perf
commands (like perf report, top and diff) only if filter(s) applied.
An user can write .perfconfig file like below to show absolute
percentage of filtered entries by default:
$ cat ~/.perfconfig
[hist]
percentage = absolute
And it can be changed through command line:
$ perf report --percentage relative
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-6-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
The --percentage option is for controlling overhead percentage
displayed. It can only receive either of "relative" or "absolute" and
affects -c delta output only.
For more information, please see previous commit same thing done to
"perf report".
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-5-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
The --percentage option is for controlling overhead percentage
displayed. It can only receive either of "relative" or "absolute".
Move the parser callback function into a common location since it's
used by multiple commands now.
For more information, please see previous commit same thing done to
"perf report".
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-4-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
The --percentage option is for controlling overhead percentage
displayed. It can only receive either of "relative" or "absolute".
"relative" means it's relative to filtered entries only so that the
sum of shown entries will be always 100%. "absolute" means it retains
the original value before and after the filter is applied.
$ perf report -s comm
# Overhead Command
# ........ ............
#
74.19% cc1
7.61% gcc
6.11% as
4.35% sh
4.14% make
1.13% fixdep
...
$ perf report -s comm -c cc1,gcc --percentage absolute
# Overhead Command
# ........ ............
#
74.19% cc1
7.61% gcc
$ perf report -s comm -c cc1,gcc --percentage relative
# Overhead Command
# ........ ............
#
90.69% cc1
9.31% gcc
Note that it has zero effect if no filter was applied.
Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-3-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
When filtering by thread, dso or symbol on TUI it also update total
period so that the output shows different result than no filter - the
percentage changed to relative to filtered entries only. Sometimes
this is not desired since users might expect same results with filter.
So new filtered_* fields to hists->stats to count them separately.
They'll be controlled/used by user later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
perf stat did initialize the stats structure used to compute
stddev etc. incorrectly. It merely zeroes it. But one member
(min) needs to be set to a non zero value. This causes min
to be not computed at all. Call init_stats() correctly.
It doesn't matter for stat currently because it doesn't use
min, but it's still better to do it correctly.
The other users of statistics are already correct.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1395768699-16060-1-git-send-email-andi@firstfloor.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Currently,
$ perf bench numa mem
errors out with usage information. To make this more user-friendly, let
us provide a minimum set of default values required for a test
run. As an added bonus,
$ perf bench all
now goes all the way to completion.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1395964219-22173-2-git-send-email-artagnon@gmail.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
At the end of
$ perf bench all
the program segfaults because it attempts to dereference a NULL
pointer. Fix this fault.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1395964219-22173-4-git-send-email-artagnon@gmail.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
The dwarf_getcfi() only checks .debug_frame section for CFI, but as
most binaries only have .eh_frame it'd return NULL and it makes
some variables inaccessible.
Using dwarf_getcfi_elf (along with dwarf_getelf()) allows to show and
add probe to more variables.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/1396854348-9296-1-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
As Namhyung reported(https://lkml.org/lkml/2014/4/1/89),
current perf-probe -L option doesn't handle errors in line-range
searching correctly. It causes a SEGV if an error occured in the
line-range searching.
----
$ perf probe -x ./perf -v -L map__load
Open Debuginfo file: /home/namhyung/project/linux/tools/perf/perf
fname: util/map.c, lineno:153
New line range: 153 to 2147483647
path: (null)
Segmentation fault (core dumped)
----
This is because line_range_inline_cb() ignores errors
from find_line_range_by_line() which means that lr->path is
already freed on the error path in find_line_range_by_line().
As a result, get_real_path() accesses the lr->path and it
causes a NULL pointer exception.
This fixes line_range_inline_cb() to handle the error correctly,
and report it to the caller.
Anyway, this just fixes a possible SEGV bug, Namhyung's patch
is also required.
Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20140402054831.19080.27006.stgit@ltc230.yrl.intra.hitachi.co.jp
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
The commit 5a62257a3d ("perf probe: Replace line_list with
intlist") replaced line_list to intlist but it has a problem that if a
same line was added again, it'd return -EEXIST rather than 1.
Since line_range_walk_cb() only checks the result being negative, it
resulted in failure or segfault sometimes.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/1396327677-3657-1-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
The Makefile logic sets FEATURE_CHECKS_CFLAGS-libdw-dwarf-unwind and
FEATURE_CHECKS_LDFLAGS-libdw-dwarf-unwind only if LIBDW_DIR is
defined. This means that under a normal setup,
$ make NO_LIBUNWIND=1
won't automatically pick up libdw. Fix this.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: Jean Pihet <jean.pihet@linaro.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1395873845-466-1-git-send-email-artagnon@gmail.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Leaving ghostprotocols.net for old networking stuff.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-jott6d40nkjjc3vvh3vw53lp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
I.e. do the same as when NO_LIBELF is explicitely passed in the 'make'
command line, fixing this:
Auto-detecting system features:
... dwarf: [ OFF ]
... glibc: [ on ]
... gtk2: [ OFF ]
... libaudit: [ OFF ]
... libbfd: [ OFF ]
... libelf: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... DWARF post unwind library: libdw
<SNIP>
CC /tmp/build/perf/util/symbol-minimal.o
CC /tmp/build/perf/util/unwind-libdw.o
arch/x86/util/unwind-libdw.c:1:30: fatal error: elfutils/libdwfl.h: No such file or directory
compilation terminated.
CC /tmp/build/perf/tests/keep-tracking.o
util/unwind-libdw.c:2:28: fatal error: elfutils/libdw.h: No such file or directory
compilation terminated.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-e39j1yxanltjx4t0msse63ax@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
The patch 3a3ffa2e82 ("tools lib traceevent: Report better error
message on bad function args") added the error message but it seems
there's no reason to call warning() directly.
So change it to do_warning_event() to provide event information too.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1395192174-26273-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
It's sometimes useful to know where the parse failure was occurred. Add
do_warning_event() macro to see the failing event.
It now shows the messages like below:
$ perf test 5
5: parse events tests : Warning: [kvmmmu:kvm_mmu_get_page] bad op token {
Warning: [kvmmmu:kvm_mmu_sync_page] bad op token {
Warning: [kvmmmu:kvm_mmu_unsync_page] bad op token {
Warning: [kvmmmu:kvm_mmu_prepare_zap_page] bad op token {
Warning: [kvmmmu:fast_page_fault] function is_writable_pte not defined
Warning: [xen:xen_mmu_ptep_modify_prot_commit] function sizeof not defined
Warning: [xen:xen_mmu_ptep_modify_prot_start] function sizeof not defined
Warning: [xen:xen_mmu_set_pgd] function sizeof not defined
Warning: [xen:xen_mmu_set_pud] function sizeof not defined
Warning: [xen:xen_mmu_set_pmd] function sizeof not defined
...
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1395192174-26273-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
On perf top, the -s option is used for --sort, but the man page
contains invalid documentation of -s option for --sym-annotate.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1395193578-27098-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Some versions of gcc even warn about it:
mm/shmem.c: In function ‘shmem_file_aio_read’:
mm/shmem.c:1414: warning: ‘error’ may be used uninitialized in this function
If the loop is aborted during the first iteration by one of the two
first break statements, error will be uninitialized.
Introduced by commit 6e58e79db8 ("introduce copy_page_to_iter, kill
loop over iovec in generic_file_aio_read()").
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On 32 bit, size_t is "unsigned int", not "unsigned long", causing the
following warning when comparing with PAGE_SIZE, which is always "unsigned
long":
fs/cifs/file.c: In function ‘cifs_readdata_to_iov’:
fs/cifs/file.c:2757: warning: comparison of distinct pointer types lacks a cast
Introduced by commit 7f25bba819 ("cifs_iovec_read: keep iov_iter
between the calls of cifs_readdata_to_iov()"), which changed the
signedness of "remaining" and the code from min_t() to min().
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull slab changes from Pekka Enberg:
"The biggest change is byte-sized freelist indices which reduces slab
freelist memory usage:
https://lkml.org/lkml/2013/12/2/64"
* 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
mm: slab/slub: use page->list consistently instead of page->lru
mm/slab.c: cleanup outdated comments and unify variables naming
slab: fix wrongly used macro
slub: fix high order page allocation problem with __GFP_NOFAIL
slab: Make allocations with GFP_ZERO slightly more efficient
slab: make more slab management structure off the slab
slab: introduce byte sized index for the freelist of a slab
slab: restrict the number of objects in a slab
slab: introduce helper functions to get/set free object
slab: factor out calculate nr objects in cache_estimate
Pull misc kbuild changes from Michal Marek:
"Here is the non-critical part of kbuild:
- One bogus coccinelle check removed, one check fixed not to suggest
the obsolete PTR_RET macro
- scripts/tags.sh does not index the generated *.mod.c files
- new objdiff tool to list differences between two versions of an
object file
- A fix for scripts/bootgraph.pl"
* 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
scripts/coccinelle: Use PTR_ERR_OR_ZERO
scripts/bootgraph.pl: Add graphic header
scripts: objdiff: detect object code changes between two commits
Coccicheck: Remove memcpy to struct assignment test
scripts/tags.sh: Ignore *.mod.c
This patch fixes I/O errors with the sym53c8xx_2 driver when the disk
returns QUEUE FULL status.
When the controller encounters an error (including QUEUE FULL or BUSY
status), it aborts all not yet submitted requests in the function
sym_dequeue_from_squeue.
This function aborts them with DID_SOFT_ERROR.
If the disk has full tag queue, the request that caused the overflow is
aborted with QUEUE FULL status (and the scsi midlayer properly retries
it until it is accepted by the disk), but the sym53c8xx_2 driver aborts
the following requests with DID_SOFT_ERROR --- for them, the midlayer
does just a few retries and then signals the error up to sd.
The result is that disk returning QUEUE FULL causes request failures.
The error was reproduced on 53c895 with COMPAQ BD03685A24 disk
(rebranded ST336607LC) with command queue 48 or 64 tags. The disk has
64 tags, but under some access patterns it return QUEUE FULL when there
are less than 64 pending tags. The SCSI specification allows returning
QUEUE FULL anytime and it is up to the host to retry.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 8f619b5429 ("powerpc/ppc64: Do not turn AIL (reloc-on
interrupts) too early") added code to set the AIL bit in the LPCR
without checking whether the kernel is running in hypervisor mode. The
result is that when the kernel is running as a guest (i.e., under
PowerKVM or PowerVM), the processor takes a privileged instruction
interrupt at that point, causing a panic. The visible result is that
the kernel hangs after printing "returning from prom_init".
This fixes it by checking for hypervisor mode being available before
setting LPCR. If we are not in hypervisor mode, we enable relocation-on
interrupts later in pSeries_setup_arch using the H_SET_MODE hcall.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commits 11d4616bd0 ("futex: revert back to the explicit waiter
counting code") and 69cd9eba38 ("futex: avoid race between requeue and
wake") changed some of the finer details of how we think about futexes.
One was a late fix and the other a consequence of overlooking the whole
requeuing logic.
The first change caused our documentation to be incorrect, and the
second made us aware that we need to explicitly add more details to it.
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull yet more networking updates from David Miller:
1) Various fixes to the new Redpine Signals wireless driver, from
Fariya Fatima.
2) L2TP PPP connect code takes PMTU from the wrong socket, fix from
Dmitry Petukhov.
3) UFO and TSO packets differ in whether they include the protocol
header in gso_size, account for that in skb_gso_transport_seglen().
From Florian Westphal.
4) If VLAN untagging fails, we double free the SKB in the bridging
output path. From Toshiaki Makita.
5) Several call sites of sk->sk_data_ready() were referencing an SKB
just added to the socket receive queue in order to calculate the
second argument via skb->len. This is dangerous because the moment
the skb is added to the receive queue it can be consumed in another
context and freed up.
It turns out also that none of the sk->sk_data_ready()
implementations even care about this second argument.
So just kill it off and thus fix all these use-after-free bugs as a
side effect.
6) Fix inverted test in tcp_v6_send_response(), from Lorenzo Colitti.
7) pktgen needs to do locking properly for LLTX devices, from Daniel
Borkmann.
8) xen-netfront driver initializes TX array entries in RX loop :-) From
Vincenzo Maffione.
9) After refactoring, some tunnel drivers allow a tunnel to be
configured on top itself. Fix from Nicolas Dichtel.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits)
vti: don't allow to add the same tunnel twice
gre: don't allow to add the same tunnel twice
drivers: net: xen-netfront: fix array initialization bug
pktgen: be friendly to LLTX devices
r8152: check RTL8152_UNPLUG
net: sun4i-emac: add promiscuous support
net/apne: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
net: ipv6: Fix oif in TCP SYN+ACK route lookup.
drivers: net: cpsw: enable interrupts after napi enable and clearing previous interrupts
drivers: net: cpsw: discard all packets received when interface is down
net: Fix use after free by removing length arg from sk_data_ready callbacks.
Drivers: net: hyperv: Address UDP checksum issues
Drivers: net: hyperv: Negotiate suitable ndis version for offload support
Drivers: net: hyperv: Allocate memory for all possible per-pecket information
bridge: Fix double free and memory leak around br_allowed_ingress
bonding: Remove debug_fs files when module init fails
i40evf: program RSS LUT correctly
i40evf: remove open-coded skb_cow_head
ixgb: remove open-coded skb_cow_head
igbvf: remove open-coded skb_cow_head
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iEYEABECAAYFAlNFtEkACgkQuseO5dulBZVcPwCdFWY81hKqQaKHaSPFh9m+n1lt
yY0An2VZZGrFSkj162POFy1P2sPpGw5p
=LRFZ
-----END PGP SIGNATURE-----
Merge tag 'llvmlinux-for-v3.15' of git://git.linuxfoundation.org/llvmlinux/kernel
Pull llvm patches from Behan Webster:
"These are some initial updates to support compiling the kernel with
clang.
These patches have been through the proper reviews to the best of my
ability, and have been soaking in linux-next for a few weeks. These
patches by themselves still do not completely allow clang to be used
with the kernel code, but lay the foundation for other patches which
are still under review.
Several other of the LLVMLinux patches have been already added via
maintainer trees"
* tag 'llvmlinux-for-v3.15' of git://git.linuxfoundation.org/llvmlinux/kernel:
x86: LLVMLinux: Fix "incomplete type const struct x86cpu_device_id"
x86 kbuild: LLVMLinux: More cc-options added for clang
x86, acpi: LLVMLinux: Remove nested functions from Thinkpad ACPI
LLVMLinux: Add support for clang to compiler.h and new compiler-clang.h
LLVMLinux: Remove warning about returning an uninitialized variable
kbuild: LLVMLinux: Fix LINUX_COMPILER definition script for compilation with clang
Documentation: LLVMLinux: Update Documentation/dontdiff
kbuild: LLVMLinux: Adapt warnings for compilation with clang
kbuild: LLVMLinux: Add Kbuild support for building kernel with Clang
Pull SCSI target updates from Nicholas Bellinger:
"Here are the target pending updates for v3.15-rc1. Apologies in
advance for waiting until the second to last day of the merge window
to send these out.
The highlights this round include:
- iser-target support for T10 PI (DIF) offloads (Sagi + Or)
- Fix Task Aborted Status (TAS) handling in target-core (Alex Leung)
- Pass in transport supported PI at session initialization (Sagi + MKP + nab)
- Add WRITE_INSERT + READ_STRIP T10 PI support in target-core (nab + Sagi)
- Fix iscsi-target ERL=2 ASYNC_EVENT connection pointer bug (nab)
- Fix tcm_fc use-after-free of ft_tpg (Andy Grover)
- Use correct ib_sg_dma primitives in ib_isert (Mike Marciniszyn)
Also, note the virtio-scsi + vhost-scsi changes to expose T10 PI
metadata into KVM guest have been left-out for now, as there where a
few comments from MST + Paolo that where not able to be addressed in
time for v3.15. Please expect this feature for v3.16-rc1"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (43 commits)
ib_srpt: Use correct ib_sg_dma primitives
target/tcm_fc: Rename ft_tport_create to ft_tport_get
target/tcm_fc: Rename ft_{add,del}_lport to {add,del}_wwn
target/tcm_fc: Rename structs and list members for clarity
target/tcm_fc: Limit to 1 TPG per wwn
target/tcm_fc: Don't export ft_lport_list
target/tcm_fc: Fix use-after-free of ft_tpg
target: Add check to prevent Abort Task from aborting itself
target: Enable READ_STRIP emulation in target_complete_ok_work
target/sbc: Add sbc_dif_read_strip software emulation
target: Enable WRITE_INSERT emulation in target_execute_cmd
target/sbc: Add sbc_dif_generate software emulation
target/sbc: Only expose PI read_cap16 bits when supported by fabric
target/spc: Only expose PI mode page bits when supported by fabric
target/spc: Only expose PI inquiry bits when supported by fabric
target: Pass in transport supported PI at session initialization
target/iblock: Fix double bioset_integrity_free bug
Target/sbc: Initialize COMPARE_AND_WRITE write_sg scatterlist
target/rd: T10-Dif: RAM disk is allocating more space than required.
iscsi-target: Fix ERL=2 ASYNC_EVENT connection pointer bug
...
Pull media fixes from Mauro Carvalho Chehab:
"A series of bug fix patches for v3.15-rc1. Most are just driver
fixes. There are some changes at remote controller core level, fixing
some definitions on a new API added for Kernel v3.15.
It also adds the missing include at include/uapi/linux/v4l2-common.h,
to allow its compilation on userspace, as pointed by you"
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (24 commits)
[media] gpsca: remove the risk of a division by zero
[media] stk1160: warrant a NUL terminated string
[media] v4l: ti-vpe: retain v4l2_buffer flags for captured buffers
[media] v4l: ti-vpe: Set correct field parameter for output and capture buffers
[media] v4l: ti-vpe: zero out reserved fields in try_fmt
[media] v4l: ti-vpe: Fix initial configuration queue data
[media] v4l: ti-vpe: Use correct bus_info name for the device in querycap
[media] v4l: ti-vpe: report correct capabilities in querycap
[media] v4l: ti-vpe: Allow usage of smaller images
[media] v4l: ti-vpe: Use video_device_release_empty
[media] v4l: ti-vpe: Make sure in job_ready that we have the needed number of dst_bufs
[media] lgdt3305: include sleep functionality in lgdt3304_ops
[media] drx-j: use customise option correctly
[media] m88rs2000: fix sparse static warnings
[media] r820t: fix size and init values
[media] rc-core: remove generic scancode filter
[media] rc-core: split dev->s_filter
[media] rc-core: do not change 32bit NEC scancode format for now
[media] rtl28xxu: remove duplicate ID 0458:707f Genius TVGo DVB-T03
[media] xc2028: add missing break to switch
...
ntb_netdev, a typo, and a leak of msix entries in the error path.
Clean ups of the event handling logic, as well as a overall style
cleanup. Finally, the driver was converted to use the new
pci_enable_msix_range logic (and the refactoring to go along with it).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJTQxbAAAoJEG5mS6x6i9Ijq7oP/0pYUCGKYv3ElQuyatPnDI8q
rSi93Q7WVYpv92y+uhal/TldVR5LvKKnPBf700TwUpj752X1XhlcFCw2TE9kHH1c
VRicAZypLl4+5ob5QhFD7mRl+gEbq3JHLwSPffM26V5dm3mMqm24PRQPm9RN2uzO
MRL/WEgShGM1pJQhIx+Ti+35/PxV67dw/rysvoUtPIuB8DfoIU7UBYEJ9BrzfyCP
AXSjJmrR92ES/HX4w1kcXl+FA6sF/UR/YmnAW9uO6YDhxiIs1RNpiqJEJUettOHI
7SxF4tFv8Pi5Heuf4ASflF1bPOFOLUt698W3+JMWaD0bLgF5IDI9hTvtbfOhy7Ag
zqsRCipZ3dVJ8EE012IA978rjqbr8qrHfQK2jnD0JALWf/t8RHBzo3IjfRu+c6W+
aRlkqu2BxZO0V3tFLoJ7uQrjp23hrERjRJi8s5hjOqwMwgB3SlVsaMVJDkeuR/46
diQTkxjylhlQgdJb6xlSiM948wfcMt5pIFxWQppHFk8ouCOX1+q9u4cEipU+nvT+
dcnc5mqERqMMjsHke3uzsHtG8OlPbbtHjzCKXT1/7NPhUn2x84AsOEkiX+qas5nK
sabgOWZtawz1Nl3x+gKR6OBGtXDdLlj7JTsqphPa3mr3RBqx8v8VVYYtqHVCiy8N
mtc/701fHOKdrBtk54Wb
=vDei
-----END PGP SIGNATURE-----
Merge tag 'ntb-3.15' of git://github.com/jonmason/ntb
Pull PCIe non-transparent bridge fixes and features from Jon Mason:
"NTB driver bug fixes to address issues in list traversal, skb leak in
ntb_netdev, a typo, and a leak of msix entries in the error path.
Clean ups of the event handling logic, as well as a overall style
cleanup. Finally, the driver was converted to use the new
pci_enable_msix_range logic (and the refactoring to go along with it)"
* tag 'ntb-3.15' of git://github.com/jonmason/ntb:
ntb: Use pci_enable_msix_range() instead of pci_enable_msix()
ntb: Split ntb_setup_msix() into separate BWD/SNB routines
ntb: Use pci_msix_vec_count() to obtain number of MSI-Xs
NTB: Code Style Clean-up
NTB: client event cleanup
ntb: Fix leakage of ntb_device::msix_entries[] array
NTB: Fix typo in setting one translation register
ntb_netdev: Fix skb free issue in open
ntb_netdev: Fix list_for_each_entry exit issue