Currently the ceph client doesn't respect the rlimit in fallocate. This
means that a user can allocate a file with size > RLIMIT_FSIZE. This
patch adds the call to inode_newsize_ok() to verify filesystem limits and
ulimits. This should make ceph successfully run xfstest generic/228.
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
when reopen a connection, use 'reconnect seq' to clean up
messages that have already been received by peer.
Link: http://tracker.ceph.com/issues/18690
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Pull ptrace fix from Eric Biederman:
"This fixes a brown paper bag bug. When I fixed the ptrace interaction
with user namespaces I added a new field ptracer_cred in struct_task
and I failed to properly initialize it on fork.
This dangling pointer wound up breaking runing setuid applications run
from the enlightenment window manager.
As this is the worst sort of bug. A regression breaking user space for
no good reason let's get this fixed"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ptrace: Properly initialize ptracer_cred on fork
- sdhci-xenon: Don't free data for phy allocated by devm*
- sdhci-iproc: Suppress spurious interrupts
- cavium: Fix probing race with regulator
- cavium: Prevent crash with incomplete DT
- cavium-octeon: Use proper GPIO name for power control
- cavium-octeon: Fix interrupt enable code
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZJUVSAAoJEP4mhCVzWIwpg9kP/3BlyhoIrg8b+NQbAmDIgAYG
tDhGCo1aVxWFpTB+xXbnSQAF1Bvqz9HSN4tnW5N/xdd8ZpvxszP91XwYn+II8Yvv
R2uGS2BIPXNi4pZISFmJk/xRv4uq8tBKMv9NWMAdJq4Ocv5Lu7jJjRnRke8E/AbQ
V+W3MEnfD6RvsD7Tz94DjLALZEl5/VnOGwLphQW0EyCa1HqBS2ysjHd5RRAmwpTf
5rMWnsGSSh75YlOpFNoEbOWxE5hI2w6g9Wkd4ktE7tqHx+HngEv0dZBrjhsdeaFN
VhqzSPUV0HzG/LW8WraFo8oyR7+FOdgMyP6n1kO3ozpzBO85BmXOqMj1aqr3ygo/
4wFF5YI0RX7a0D/ch7wg7iQ9R5HEb7pFof/ANQQZsjaDI331l8PcAlBGAmqmoeJs
lIiYfn6FPfflNB0IWGQ5UMCZTDWE5n/9nr8DpPY8gc8QBf27RGG7/JKbbc2DLBS3
YCWtIPgLQKSXV1s8w3ljvBQ6vgugmrULAZ18YVPOk/fO4B06XvB/cB4TvXQqZk41
w4feSi7Y4XPCZaUf30x5BZHvsmoylOrn8nns2/J1cZV2MKyWsxtIYrybJHb16jqA
vwv75ObRTSqTIyaTLtIMBHG+zgJRFW/Nqiy4rDKYZ8WfaPnazhGET7uKriBYqy5Z
+q/ZICv3VbcX5yzErIN5
=xVIX
-----END PGP SIGNATURE-----
Merge tag 'mmc-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"A couple of MMC host fixes intended for v4.12 rc3:
- sdhci-xenon: Don't free data for phy allocated by devm*
- sdhci-iproc: Suppress spurious interrupts
- cavium: Fix probing race with regulator
- cavium: Prevent crash with incomplete DT
- cavium-octeon: Use proper GPIO name for power control
- cavium-octeon: Fix interrupt enable code"
* tag 'mmc-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-iproc: suppress spurious interrupt with Multiblock read
mmc: cavium: Fix probing race with regulator
of/platform: Make of_platform_device_destroy globally visible
mmc: cavium: Prevent crash with incomplete DT
mmc: cavium-octeon: Use proper GPIO name for power control
mmc: cavium-octeon: Fix interrupt enable code
mmc: sdhci-xenon: kill xenon_clean_phy()
In the current form of the code, if a->replacementlen is 0, the reference
to *insnbuf for comparison touches potentially garbage memory. While it
doesn't affect the execution flow due to the subsequent a->replacementlen
comparison, it is (rightly) detected as use of uninitialized memory by a
runtime instrumentation currently under my development, and could be
detected as such by other tools in the future, too (e.g. KMSAN).
Fix the "false-positive" by reordering the conditions to first check the
replacement instruction length before referencing specific opcode bytes.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/20170524135500.27223-1-mjurczyk@google.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The compiler doesn't always spot the guard that object is allocated on
the first pass, leading to:
drivers/gpu/drm/i915/selftests/i915_gem_context.c: warning: 'obj' may be used uninitialized in this function [-Wuninitialized]: => 370:8
v2: Make it more obvious by setting obj to NULL on the first pass and
any later pass where we need to reallocate.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: 791ff39ae3 ("drm/i915: Live testing for context execution")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
c: <drm-intel-fixes@lists.freedesktop.org> # v4.12-rc1+
Link: http://patchwork.freedesktop.org/patch/msgid/20170523194412.1195-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
(cherry picked from commit ca83d5840c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
This reverts commit bc5ca47c0a.
Gabriel put this back into generic code with
commit 75f6dfe3e6
Author: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Date: Wed Dec 28 12:32:11 2016 -0200
drm: Deduplicate driver initialization message
but somehow he missed Chris' patch to add the message meanwhile.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101025
Fixes: 75f6dfe3e6 ("drm: Deduplicate driver initialization message")
Cc: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v4.11+
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517131557.7836-1-daniel.vetter@ffwll.ch
(cherry picked from commit 6bdba81979)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Commit b685d3d65a "block: treat REQ_FUA and REQ_PREFLUSH as
synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions
Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
Fixes: b685d3d65a
CC: reiserfs-devel@vger.kernel.org
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Commit b685d3d65a "block: treat REQ_FUA and REQ_PREFLUSH as
synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions
Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.
Fixes: b685d3d65a
CC: Steven Whitehouse <swhiteho@redhat.com>
CC: cluster-devel@redhat.com
CC: stable@vger.kernel.org
Acked-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
In the file arch/x86/mm/pat.c, there's a '__pat_enabled' variable. The
variable is set to 1 by default and the function pat_init() sets
__pat_enabled to 0 if the CPU doesn't support PAT.
However, on AMD K6-3 CPUs, the processor initialization code never calls
pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
returns true, even though the K6-3 CPU doesn't support PAT.
The result of this bug is that a kernel warning is produced when attempting to
start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
Another symptom of this bug is that the framebuffer driver doesn't set the
K6-3 MTRR registers:
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
...
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
To fix the bug change pat_enabled() so that it returns true only if PAT
initialization was actually done.
Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
this_cpu_has(X86_FEATURE_PAT) in pat_ap_init(), so that we check the PAT
feature on the processor that is being initialized.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: stable@vger.kernel.org # v4.2+
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1704181501450.26399@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
At least Make 3.82 dislikes the tab in front of the $(warning) function:
arch/x86/Makefile:162: *** recipe commences before first target. Stop.
Let's be gentle.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1944fcd8-e3df-d1f7-c0e4-60aeb1917a24@siemens.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dave Jones and Steven Rostedt reported unwinder warnings like the
following:
WARNING: kernel stack frame pointer at ffff8800bda0ff30 in sshd:1090 has bad value 000055b32abf1fa8
In both cases, the unwinder was attempting to unwind from an ftrace
handler into entry code. The callchain was something like:
syscall entry code
C function
ftrace handler
save_stack_trace()
The problem is that the unwinder's end-of-stack logic gets confused by
the way ftrace lays out the stack frame (with fentry enabled).
I was able to recreate this warning with:
echo call_usermodehelper_exec_async:stacktrace > /sys/kernel/debug/tracing/set_ftrace_filter
(exit login session)
I considered fixing this by changing the ftrace code to rewrite the
stack to make the unwinder happy. But that seemed too intrusive after I
implemented it. Instead, just add another check to the unwinder's
end-of-stack logic to detect this special case.
Side note: We could probably get rid of these end-of-stack checks by
encoding the frame pointer for syscall entry just like we do for
interrupt entry. That would be simpler, but it would also be a lot more
intrusive since it would slightly affect the performance of every
syscall.
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: live-patching@vger.kernel.org
Fixes: c32c47c68a ("x86/unwind: Warn on bad frame pointer")
Link: http://lkml.kernel.org/r/671ba22fbc0156b8f7e0cfa5ab2a795e08bc37e1.1495553739.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Petr Mladek reported the following warning when loading the livepatch
sample module:
WARNING: CPU: 1 PID: 3699 at arch/x86/kernel/stacktrace.c:132 save_stack_trace_tsk_reliable+0x133/0x1a0
...
Call Trace:
__schedule+0x273/0x820
schedule+0x36/0x80
kthreadd+0x305/0x310
? kthread_create_on_cpu+0x80/0x80
? icmp_echo.part.32+0x50/0x50
ret_from_fork+0x2c/0x40
That warning means the end of the stack is no longer recognized as such
for newly forked tasks. The problem was introduced with the following
commit:
ff3f7e2475 ("x86/entry: Fix the end of the stack for newly forked tasks")
... which was completely misguided. It only partially fixed the
reported issue, and it introduced another bug in the process. None of
the other entry code saves the frame pointer before calling into C code,
so it doesn't make sense for ret_from_fork to do so either.
Contrary to what I originally thought, the original issue wasn't related
to newly forked tasks. It was actually related to ftrace. When entry
code calls into a function which then calls into an ftrace handler, the
stack frame looks different than normal.
The original issue will be fixed in the unwinder, in a subsequent patch.
Reported-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: live-patching@vger.kernel.org
Fixes: ff3f7e2475 ("x86/entry: Fix the end of the stack for newly forked tasks")
Link: http://lkml.kernel.org/r/f350760f7e82f0750c8d1dd093456eb212751caa.1495553739.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Sync (copy) the following v4.12 kernel headers to the tooling headers:
arch/x86/include/asm/disabled-features.h:
arch/x86/include/uapi/asm/kvm.h:
arch/powerpc/include/uapi/asm/kvm.h:
arch/s390/include/uapi/asm/kvm.h:
arch/arm/include/uapi/asm/kvm.h:
arch/arm64/include/uapi/asm/kvm.h:
- 'struct kvm_sync_regs' got changed in an ABI-incompatible way,
fortunately none of the (in-kernel) tooling relied on it
- new KVM_DEV calls added
arch/x86/include/asm/required-features.h:
- 5-level paging hardware ABI detail added
arch/x86/include/asm/cpufeatures.h:
- new CPU feature added
arch/x86/include/uapi/asm/vmx.h:
- new VMX exit conditions
None of the changes requires fixes in the tooling source code.
This addresses the following warnings:
Warning: include/uapi/linux/stat.h differs from kernel
Warning: arch/x86/include/asm/disabled-features.h differs from kernel
Warning: arch/x86/include/asm/required-features.h differs from kernel
Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel
Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel
Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel
Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel
Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524065721.j2mlch6bgk5klgbc@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The __hpp__sort_acc() sorts entries using callchain depth in order to
put callers above in children mode. But it assumed the callchain order
was callee-first. Now default (for children) is caller-first so the
order of entries is reverted.
For example, consider following case:
$ perf report --no-children
..l
# Overhead Command Shared Object Symbol
# ........ ....... ................... ..........................
#
99.44% a.out a.out [.] main
|
---main
__libc_start_main
_start
Then children mode should show 'start' above '__libc_start_main' since
it's the caller (parent) of the __libc_start_main. But it's reversed:
# Children Self Command Shared Object Symbol
# ........ ........ ....... ............... .....................
#
99.61% 0.00% a.out libc-2.25.so [.] __libc_start_main
99.61% 0.00% a.out a.out [.] _start
99.54% 99.44% a.out a.out [.] main
This patch fixes it.
# Children Self Command Shared Object Symbol
# ........ ........ ....... ............... .....................
#
99.61% 0.00% a.out a.out [.] _start
99.61% 0.00% a.out libc-2.25.so [.] __libc_start_main
99.54% 99.44% a.out a.out [.] main
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-8-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The very last inlined frame, i.e. the one furthest away from the
non-inlined frame, was silently dropped. This is apparent when
comparing the output of `perf script` and `addr2line`:
~~~~~~
$ perf script --inline
...
a.out 26722 80836.309329: 72425 cycles:
21561 __hypot_finite (/usr/lib/libm-2.25.so)
ace3 hypot (/usr/lib/libm-2.25.so)
a4a main (a.out)
std::abs<double>
std::_Norm_helper<true>::_S_do_it<double>
std::norm<double>
main
20510 __libc_start_main (/usr/lib/libc-2.25.so)
bd9 _start (a.out)
$ addr2line -a -f -i -e /tmp/a.out a4a | c++filt
0x0000000000000a4a
std::__complex_abs(doublecomplex )
/usr/include/c++/6.3.1/complex:589
double std::abs<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:597
double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:654
double std::norm<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:664
main
/tmp/inlining.cpp:14
~~~~~
Note how `std::__complex_abs` is missing from the `perf script`
output. This is similarly showing up in `perf report`. The patch
here fixes this issue, and the output becomes:
~~~~~
a.out 26722 80836.309329: 72425 cycles:
21561 __hypot_finite (/usr/lib/libm-2.25.so)
ace3 hypot (/usr/lib/libm-2.25.so)
a4a main (a.out)
std::__complex_abs
std::abs<double>
std::_Norm_helper<true>::_S_do_it<double>
std::norm<double>
main
20510 __libc_start_main (/usr/lib/libc-2.25.so)
bd9 _start (a.out)
~~~~~
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-7-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
So far, the inlined nodes where only reversed when we built perf
against libbfd. If that was not available, the addr2line fallback
code path was missing the inline_list__reverse call.
Now we always add the nodes in the correct order within
inline_list__append. This removes the need to reverse the list
and also ensures that all callers construct the list in the right
order.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-6-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
As the documentation for dwfl_frame_pc says, frames that
are no activation frames need to have their program counter
decremented by one to properly find the function of the caller.
This fixes many cases where perf report currently attributes
the cost to the next line. I.e. I have code like this:
~~~~~~~~~~~~~~~
#include <thread>
#include <chrono>
using namespace std;
int main()
{
this_thread::sleep_for(chrono::milliseconds(1000));
this_thread::sleep_for(chrono::milliseconds(100));
this_thread::sleep_for(chrono::milliseconds(10));
return 0;
}
~~~~~~~~~~~~~~~
Now compile and record it:
~~~~~~~~~~~~~~~
g++ -std=c++11 -g -O2 test.cpp
echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
perf record \
--event sched:sched_stat_sleep \
--event sched:sched_process_exit \
--event sched:sched_switch --call-graph=dwarf \
--output perf.data.raw \
./a.out
echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
perf inject --sched-stat --input perf.data.raw --output perf.data
~~~~~~~~~~~~~~~
Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:9
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:10
| __libc_start_main
| _start
|
--0.91%--main test.cpp:13
__libc_start_main
_start
~~~~~~~~~~~~~~~
With this patch here applied, the issue is fixed. The report becomes
much more usable:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:8
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:9
| __libc_start_main
| _start
|
--0.91%--main test.cpp:10
__libc_start_main
_start
~~~~~~~~~~~~~~~
Similarly it works for signal frames:
~~~~~~~~~~~~~~~
__noinline void bar(void)
{
volatile long cnt = 0;
for (cnt = 0; cnt < 100000000; cnt++);
}
__noinline void foo(void)
{
bar();
}
void sig_handler(int sig)
{
foo();
}
int main(void)
{
signal(SIGUSR1, sig_handler);
raise(SIGUSR1);
foo();
return 0;
}
~~~~~~~~~~~~~~~~
Before, the report wrongly points to `signal.c:29` after raise():
~~~~~~~~~~~~~~~~
$ perf report --stdio --no-children -g srcline -s srcline
...
100.00% signal.c:11
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:29
__libc_start_main
_start
~~~~~~~~~~~~~~~~
With this patch in, the issue is fixed and we instead get:
~~~~~~~~~~~~~~~~
100.00% signal signal [.] bar
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:27
__libc_start_main
_start
~~~~~~~~~~~~~~~~
Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straight-forward thanks
to dwfl_frame_pc(). For libunwind, we replace the functionality via
unw_is_signal_frame() for any but the very first frame.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-4-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When a filename was found in addr2line it was duplicated via strdup()
but never freed. Now we pass NULL and handle this gracefully in
addr2line.
Detected by Valgrind:
==16331== 1,680 bytes in 21 blocks are definitely lost in loss record 148 of 220
==16331== at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16331== by 0x672FA69: strdup (in /usr/lib/libc-2.25.so)
==16331== by 0x52769F: addr2line (srcline.c:256)
==16331== by 0x52769F: addr2inlines (srcline.c:294)
==16331== by 0x52769F: dso__parse_addr_inlines (srcline.c:502)
==16331== by 0x574D7A: inline__fprintf (hist.c:41)
==16331== by 0x574D7A: ipchain__fprintf_graph (hist.c:147)
==16331== by 0x57518A: __callchain__fprintf_graph (hist.c:212)
==16331== by 0x5753CF: callchain__fprintf_graph.constprop.6 (hist.c:337)
==16331== by 0x57738E: hist_entry__fprintf (hist.c:628)
==16331== by 0x57738E: hists__fprintf (hist.c:882)
==16331== by 0x44A20F: perf_evlist__tty_browse_hists (builtin-report.c:399)
==16331== by 0x44A20F: report__browse_hists (builtin-report.c:491)
==16331== by 0x44A20F: __cmd_report (builtin-report.c:624)
==16331== by 0x44A20F: cmd_report (builtin-report.c:1054)
==16331== by 0x4A49CE: run_builtin (perf.c:296)
==16331== by 0x4A4CC0: handle_internal_command (perf.c:348)
==16331== by 0x434371: run_argv (perf.c:392)
==16331== by 0x434371: main (perf.c:530)
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-3-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I just hit a segfault when doing `perf report -g srcline`.
Valgrind pointed me at this code as the culprit:
==8359== Invalid read of size 8
==8359== at 0x3096D9: map__rip_2objdump (map.c:430)
==8359== by 0x2FC1A3: match_chain_srcline (callchain.c:645)
==8359== by 0x2FC1A3: match_chain (callchain.c:700)
==8359== by 0x2FC1A3: append_chain (callchain.c:895)
==8359== by 0x2FC1A3: append_chain_children (callchain.c:846)
==8359== by 0x2FF719: callchain_append (callchain.c:944)
==8359== by 0x2FF719: hist_entry__append_callchain (callchain.c:1058)
==8359== by 0x32FA06: iter_add_single_cumulative_entry (hist.c:908)
==8359== by 0x33195C: hist_entry_iter__add (hist.c:1050)
==8359== by 0x258F65: process_sample_event (builtin-report.c:204)
==8359== by 0x30D60C: perf_session__deliver_event (session.c:1310)
==8359== by 0x30D60C: ordered_events__deliver_event (session.c:119)
==8359== by 0x310D12: __ordered_events__flush (ordered-events.c:210)
==8359== by 0x310D12: ordered_events__flush.part.3 (ordered-events.c:277)
==8359== by 0x30DD3C: perf_session__process_user_event (session.c:1349)
==8359== by 0x30DD3C: perf_session__process_event (session.c:1475)
==8359== by 0x30FC3C: __perf_session__process_events (session.c:1867)
==8359== by 0x30FC3C: perf_session__process_events (session.c:1921)
==8359== by 0x25A985: __cmd_report (builtin-report.c:575)
==8359== by 0x25A985: cmd_report (builtin-report.c:1054)
==8359== by 0x2B9A80: run_builtin (perf.c:296)
==8359== Address 0x70 is not stack'd, malloc'd or (recently) free'd
This patch fixes the issue.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
[ Remove dependency from another change ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-2-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tweak the Kconfig description to mention support for NSP and make the
default on for iProc based platforms.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Add a missing character in this description for a function.
Acked-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The script "checkpatch.pl" pointed information out like the following.
WARNING: Possible unnecessary 'out of memory' message
Thus remove such statements here.
Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf
Acked-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "devm_kcalloc".
This issue was detected by using the Coccinelle software.
Acked-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Making thermal_emergency_poweroff static fixes sparse warning:
drivers/thermal/thermal_core.c:6: warning: symbol
'thermal_emergency_poweroff' was not declared. Should it be static?
Fixes: ef1d87e06a ("thermal: core: Add a back up thermal shutdown mechanism")
Acked-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Building this driver with W=1 reports:
warning: variable 'trip' set but not used [-Wunused-but-set-variable]
The call for of_thermal_get_trip_points() is useless.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
We currently do
tcmu_free_device ->tcmu_netlink_event(TCMU_CMD_REMOVED_DEVICE) ->
uio_unregister_device -> kfree(tcmu_dev).
The problem is that the kernel does not wait for userspace to
do the close() on the uio device before freeing the tcmu_dev.
We can then hit a race where the kernel frees the tcmu_dev before
userspace does close() and so when close() -> release -> tcmu_release
is done, we try to access a freed tcmu_dev.
This patch made over the target-pending master branch moves the freeing
of the tcmu_dev to when the last reference has been dropped.
This also fixes a leak where if tcmu_configure_device was not called on a
device we did not free udev->name which was allocated at tcmu_alloc_device time.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
A recent commit added extra printks for CPU/RT limits. This can result in
excessive spam in dmesg.
Make the printks conditional on print_fatal_signals.
Fixes: e7ea7c9806 ("rlimits: Print more information when CPU/RT limits are exceeded")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arun Raghavan <arun@arunraghavan.net>
Some drivers - like i915 - may not support the system suspend direct
complete optimization due to differences in their runtime and system
suspend sequence. Add a flag that when set resumes the device before
calling the driver's system suspend handlers which effectively disables
the optimization.
Needed by a future patch fixing suspend/resume on i915.
Suggested by Rafael.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org
If there is not enough space then ceph_decode_32_safe() does a goto bad.
We need to return an error code in that situation. The current code
returns ERR_PTR(0) which is NULL. The callers are not expecting that
and it results in a NULL dereference.
Fixes: f24e9980eb ("ceph: OSD client")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
None of these are validated in userspace, but since we do validate
reply_struct_v in ceph_x_proc_ticket_reply(), tkt_struct_v (first) and
CephXServiceTicket struct_v (second) in process_one_ticket(), validate
CephXTicketBlob struct_v as well.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
It's set but not used: CEPH_FEATURE_MONNAMES feature bit isn't
advertised, which guarantees a v1 MonMap.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
if we receive a compound such that:
- the sessionid, slot, and sequence number in the SEQUENCE op
match a cached succesful reply with N ops, and
- the Nth operation of the compound is a PUTFH, PUTPUBFH,
PUTROOTFH, or RESTOREFH,
then nfsd4_sequence will return 0 and set cstate->status to
nfserr_replay_cache. The current filehandle will not be set. This will
cause us to call check_nfsd_access with first argument NULL.
To nfsd4_compound it looks like we just succesfully executed an
operation that set a filehandle, but the current filehandle is not set.
Fix this by moving the nfserr_replay_cache earlier. There was never any
reason to have it after the encode_op label, since the only case where
he hit that is when opdesc->op_func sets it.
Note that there are two ways we could hit this case:
- a client is resending a previously sent compound that ended
with one of the four PUTFH-like operations, or
- a client is sending a *new* compound that (incorrectly) shares
sessionid, slot, and sequence number with a previously sent
compound, and the length of the previously sent compound
happens to match the position of a PUTFH-like operation in the
new compound.
The second is obviously incorrect client behavior. The first is also
very strange--the only purpose of a PUTFH-like operation is to set the
current filehandle to be used by the following operation, so there's no
point in having it as the last in a compound.
So it's likely this requires a buggy or malicious client to reproduce.
Reported-by: Scott Mayhew <smayhew@redhat.com>
Cc: stable@kernel.vger.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull i2c fixes from Wolfram Sang:
"Fix the i2c-designware regression of rc2.
Also, a DMA buffer fix for the tiny-usb driver where the USB core now
loudly complains about the non DMA-capable buffer"
[ I had cherry-picked the designware fix separately because it hit my
laptop, but here is the proper sync with the i2c tree - Linus ]
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: designware: Fix bogus sda_hold_time due to uninitialized vars
i2c: i2c-tiny-usb: fix buffer not being DMA capable
The code in block/partitions/msdos.c recognizes FreeBSD, OpenBSD
and NetBSD partitions and does a reasonable job picking out OpenBSD
and NetBSD UFS subpartitions.
But for FreeBSD the subpartitions are always "bad".
Kernel: <bsd:bad subpartition - ignored
Though all 3 of these BSD systems use UFS as a file system, only
FreeBSD uses relative start addresses in the subpartition
declarations.
The following patch fixes this for FreeBSD partitions and leaves
the code for OpenBSD and NetBSD intact:
Signed-off-by: Richard Narron <comet.berkeley@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Masks for extracting part of the Completion Queue Entry (CQE)
field rss_hash_type was swapped, namely CQE_RSS_HTYPE_IP and
CQE_RSS_HTYPE_L4.
The bug resulted in setting skb->l4_hash, even-though the
rss_hash_type indicated that hash was NOT computed over the
L4 (UDP or TCP) part of the packet.
Added comments from the datasheet, to make it more clear what
these masks are selecting.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some devices need their multicast filter reset but others are crashed by that.
So the methods need to be separated.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Reported-by: "Ridgway, Keith" <kridgway@harris.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2017-05-23
1) Fix wrong header offset for esp4 udpencap packets.
2) Fix a stack access out of bounds when creating a bundle
with sub policies. From Sabrina Dubroca.
3) Fix slab-out-of-bounds in pfkey due to an incorrect
sadb_x_sec_len calculation.
4) We checked the wrong feature flags when taking down
an interface with IPsec offload enabled.
Fix from Ilan Tayari.
5) Copy the anti replay sequence numbers when doing a state
migration, otherwise we get out of sync with the sequence
numbers. Fix from Antony Antony.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't set an error code on this path. It means that we return NULL
instead of an error pointer and the caller does a NULL dereference.
Fixes: 6d1d8050b4 ("block, partition: add partition_meta_info to hd_struct")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Add tolerance to failures of irq_set_affinity_hint().
Its role is to give hints that optimizes performance,
and should not block the driver load.
In non-SMP systems, functionality is not available as
there is a single core, and all these calls definitely
fail. Hence, do not call the function and avoid the
warning prints.
Fixes: db058a186f ("net/mlx5_core: Set irq affinity hints")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently when firmware command gets stuck or it takes long time to
complete, the driver command will get timeout and the command slot is
freed and can be used for new commands, and if the firmware receive new
command on the old busy slot its behavior is unexpected and this could
be harmful.
To fix this when the driver command gets timeout we return failure,
but we don't free the command slot and we wait for the firmware to
explicitly respond to that command.
Once all the entries are busy we will stop processing new firmware
commands.
Fixes: 9cba4ebcf3 ('net/mlx5: Fix potential deadlock in command mode change')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
IPoIB packet contains the pseudo header area, we need to pull it prior
to reset_mac_header in order to let the GRO work well.
In more details:
GRO checks the mac address of the new coming packet, it does that by
comparing the hard_header_len size of the current packet to the previous
one in that session, the comparison is over hard_header_len size.
Now, the driver prepares that area in the skb by allocating area from the
reserved part and resetting the correct mac header to it.
Fixes: 9d6bd752c6 ("net/mlx5e: IPoIB, RX handler")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The sparse tool emits these correct complaints:
drivers/net/ethernet/mellanox/mlx5/core//en_tc.c:1005:25: warning: cast to restricted __be32
drivers/net/ethernet/mellanox/mlx5/core//en_tc.c:1007:25: warning: cast to restricted __be16
The value is provided from user-space in network order, but there's
no way for them to realize that, avoid the warnings by casting to the
appropriate type.
Fixes: d79b6df6b1 ('net/mlx5e: Add parsing of TC pedit actions to HW format')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>