As reported in BZ #30352:
https://bugzilla.kernel.org/show_bug.cgi?id=30352
there's a kernel bug related to reading the last allowed page on x86_64.
The _copy_to_user() and _copy_from_user() functions use the following
check for address limit:
if (buf + size >= limit)
fail();
while it should be more permissive:
if (buf + size > limit)
fail();
That's because the size represents the number of bytes being
read/write from/to buf address AND including the buf address.
So the copy function will actually never touch the limit
address even if "buf + size == limit".
Following program fails to use the last page as buffer
due to the wrong limit check:
#include <sys/mman.h>
#include <sys/socket.h>
#include <assert.h>
#define PAGE_SIZE (4096)
#define LAST_PAGE ((void*)(0x7fffffffe000))
int main()
{
int fds[2], err;
void * ptr = mmap(LAST_PAGE, PAGE_SIZE, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
assert(ptr == LAST_PAGE);
err = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
assert(err == 0);
err = send(fds[0], ptr, PAGE_SIZE, 0);
perror("send");
assert(err == PAGE_SIZE);
err = recv(fds[1], ptr, PAGE_SIZE, MSG_WAITALL);
perror("recv");
assert(err == PAGE_SIZE);
return 0;
}
The other place checking the addr limit is the access_ok() function,
which is working properly. There's just a misleading comment
for the __range_not_ok() macro - which this patch fixes as well.
The last page of the user-space address range is a guard page and
Brian Gerst observed that the guard page itself due to an erratum on K8 cpus
(#121 Sequential Execution Across Non-Canonical Boundary Causes Processor
Hang).
However, the test code is using the last valid page before the guard page.
The bug is that the last byte before the guard page can't be read
because of the off-by-one error. The guard page is left in place.
This bug would normally not show up because the last page is
part of the process stack and never accessed via syscalls.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1305210630-7136-1-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Support memset() with enhanced rep stosb. On processors supporting enhanced
REP MOVSB/STOSB, the alternative memset_c_e function using enhanced rep stosb
overrides the fast string alternative memset_c and the original function.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-10-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Support memmove() by enhanced rep movsb. On processors supporting enhanced
REP MOVSB/STOSB, the alternative memmove() function using enhanced rep movsb
overrides the original function.
The patch doesn't change the backward memmove case to use enhanced rep
movsb.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-9-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Support memcpy() with enhanced rep movsb. On processors supporting enhanced
rep movsb, the alternative memcpy() function using enhanced rep movsb overrides the original function and the fast string
function.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-8-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Support copy_to_user/copy_from_user() by enhanced REP MOVSB/STOSB.
On processors supporting enhanced REP MOVSB/STOSB, the alternative
copy_user_enhanced_fast_string function using enhanced rep movsb overrides the
original function and the fast string function.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-7-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Intel processors are adding enhancements to REP MOVSB/STOSB and the use of
REP MOVSB/STOSB for optimal memcpy/memset or similar functions is recommended.
Enhancement availability is indicated by CPUID.7.0.EBX[9] (Enhanced REP MOVSB/
STOSB).
Support clear_page() with rep stosb for processor supporting enhanced REP MOVSB
/STOSB. On processors supporting enhanced REP MOVSB/STOSB, the alternative
clear_page_c_e function using enhanced REP STOSB overrides the original function
and the fast string function.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-6-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Add altinstruction_entry macro to generate .altinstructions section
entries from assembly code. This should be less failure-prone than
open-coding.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-5-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Some string operation functions may be patched twice, e.g. on enhanced REP MOVSB
/STOSB processors, memcpy is patched first by fast string alternative function,
then it is patched by enhanced REP MOVSB/STOSB alternative function.
Add comment for applying alternatives order to warn people who may change the
applying alternatives order for any reason.
[ Documentation-only patch ]
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-4-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
If kernel intends to use enhanced REP MOVSB/STOSB, it must ensure
IA32_MISC_ENABLE.Fast_String_Enable (bit 0) is set and CPUID.(EAX=07H, ECX=0H):
EBX[bit 9] also reports 1.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-3-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Intel processors are adding enhancements to REP MOVSB/STOSB and the use of
REP MOVSB/STOSB for optimal memcpy/memset or similar functions is recommended.
Enhancement availability is indicated by CPUID.7.0.EBX[9] (Enhanced REP MOVSB/
STOSB).
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-2-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
CPUID leaf 7, subleaf 0 returns the maximum subleaf in EAX, not the
number of subleaves. Since so far only subleaf 0 is defined (and only
the EBX bitfield) we do not need to qualify the test.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305660806-17519-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@kernel.org> 2.6.36..39
Do the mcount offset adjustment in the recordmcount.pl/recordmcount.[ch]
at compile time and not in ftrace_call_adjust at run time.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Do the mcount offset adjustment in the recordmcount.pl/recordmcount.[ch]
at compile time and not in ftrace_call_adjust at run time.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The section called .discard.text has tracing attached to it and is
currently ignored by ftrace. But it does include a call to the mcount
stub. Adding a notrace to the code keeps gcc from adding the useless
mcount caller to it.
Link: http://lkml.kernel.org/r/20110421023739.243651696@goodmis.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Both warning and warning_symbol are nowhere used.
Let's get rid of them.
Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Soeren Sandmann Pedersen <ssp@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: x86 <x86@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Link: http://lkml.kernel.org/r/1305205872-10321-2-git-send-email-richard@nod.at
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
The Intel Nehalem offcore bits implemented in:
e994d7d23a: perf: Fix LLC-* events on Intel Nehalem/Westmere
... are wrong: they implemented _ACCESS as _HIT and counted OTHER_CORE_HIT* as
MISS even though its clearly documented as an L3 hit ...
Fix them and the Westmere definitions as well.
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1299119690-13991-3-git-send-email-ming.m.lin@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We make use of ptrace_get_breakpoints() / ptrace_put_breakpoints() to
protect ptrace_set_debugreg() even if CONFIG_HAVE_HW_BREAKPOINT if off.
However in this case, these APIs are not implemented.
To fix this, push the protection down inside the relevant ifdef.
Best would be to export the code inside
CONFIG_HAVE_HW_BREAKPOINT into a standalone function to cleanup
the ifdefury there and call the breakpoint ref API inside. But
as it is more invasive, this should be rather made in an -rc1.
Fixes this build error:
arch/powerpc/kernel/ptrace.c:1594: error: implicit declaration of function 'ptrace_get_breakpoints' make[2]: ***
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: LPPC <linuxppc-dev@lists.ozlabs.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: v2.6.33.. <stable@kernel.org>
Link: http://lkml.kernel.org/r/1304639598-4707-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Extend the Intel SandyBridge PMU driver with definitions
for generic front-end and back-end stall events.
( As commit 3011203 "perf events, x86: Add Westmere stalled-cycles-frontend/backend
events" says, these are only approximations. )
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1304666042-17577-1-git-send-email-ming.m.lin@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for-linus' of git://github.com/at91linux/linux-2.6-at91:
at91: Add ARCH_ID and basic cpu macros definition for 5series chips family.
arm: at91: fix compiler warning for eb01 board build
arm: at91: minimal defconfig for at91x40 SoC
ARM: at91: AT91CAP9 has a macb device
* 'stable/bug-fixes-for-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: mask_rw_pte mark RO all pagetable pages up to pgt_buf_top
xen/mmu: Add workaround "x86-64, mm: Put early page table high"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
sysctl: net: call unregister_net_sysctl_table where needed
Revert: veth: remove unneeded ifname code from veth_newlink()
smsc95xx: fix reset check
tg3: Fix failure to enable WoL by default when possible
networking: inappropriate ioctl operation should return ENOTTY
amd8111e: trivial typo spelling: Negotitate -> Negotiate
ipv4: don't spam dmesg with "Using LC-trie" messages
af_unix: Only allow recv on connected seqpacket sockets.
mii: add support of pause frames in mii_get_an
net: ftmac100: fix scheduling while atomic during PHY link status change
usbnet: Transfer of maintainership
usbnet: add support for some Huawei modems with cdc-ether ports
bnx2: cancel timer on device removal
iwl4965: fix "Received BA when not expected"
iwlagn: fix "Received BA when not expected"
dsa/mv88e6131: fix unknown multicast/broadcast forwarding on mv88e6085
usbnet: Resubmit interrupt URB if device is open
iwl4965: fix "TX Power requested while scanning"
iwlegacy: led stay solid on when no traffic
b43: trivial: update module info about ucode16_mimo firmware
...
The use of base for %ebx in this file is arbitrary, *except* that we
also use it to compute the real-mode segment. Therefore, make it so
that r_base really is the true address to which %ebx points.
This resolves kernel bugzilla 33302.
Reported-and-tested-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/n/tip-08os5wi3yq1no0y4i5m4z7he@git.kernel.org
mask_rw_pte is currently checking if a pfn is a pagetable page if it
falls in the range pgt_buf_start - pgt_buf_end but that is incorrect
because pgt_buf_end is a moving target: pgt_buf_top is the real
boundary.
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
As a consequence of the commit:
commit 4b239f458c
Author: Yinghai Lu <yinghai@kernel.org>
Date: Fri Dec 17 16:58:28 2010 -0800
x86-64, mm: Put early page table high
it causes the Linux kernel to crash under Xen:
mapping kernel into physical memory
Xen: setup ISA identity maps
about to get started...
(XEN) mm.c:2466:d0 Bad type (saw 7400000000000001 != exp 1000000000000000) for mfn b1d89 (pfn bacf7)
(XEN) mm.c:3027:d0 Error while pinning mfn b1d89
(XEN) traps.c:481:d0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=0000]
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
...
The reason is that at some point init_memory_mapping is going to reach
the pagetable pages area and map those pages too (mapping them as normal
memory that falls in the range of addresses passed to init_memory_mapping
as argument). Some of those pages are already pagetable pages (they are
in the range pgt_buf_start-pgt_buf_end) therefore they are going to be
mapped RO and everything is fine.
Some of these pages are not pagetable pages yet (they fall in the range
pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they
are going to be mapped RW. When these pages become pagetable pages and
are hooked into the pagetable, xen will find that the guest has already
a RW mapping of them somewhere and fail the operation.
The reason Xen requires pagetables to be RO is that the hypervisor needs
to verify that the pagetables are valid before using them. The validation
operations are called "pinning" (more details in arch/x86/xen/mmu.c).
In order to fix the issue we mark all the pages in the entire range
pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation
is completed only the range pgt_buf_start-pgt_buf_end is reserved by
init_memory_mapping. Hence the kernel is going to crash as soon as one
of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those
ranges are RO).
For this reason, this function is introduced which is called _after_
the init_memory_mapping has completed (in a perfect world we would
call this function from init_memory_mapping, but lets ignore that).
Because we are called _after_ init_memory_mapping the pgt_buf_[start,
end,top] have all changed to new values (b/c another init_memory_mapping
is called). Hence, the first time we enter this function, we save
away the pgt_buf_start value and update the pgt_buf_[end,top].
When we detect that the "old" pgt_buf_start through pgt_buf_end
PFNs have been reserved (so memblock_x86_reserve_range has been called),
we immediately set out to RW the "old" pgt_buf_end through pgt_buf_top.
And then we update those "old" pgt_buf_[end|top] with the new ones
so that we can redo this on the next pagetable.
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Reviewed-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
[v1: Updated with Jeremy's comments]
[v2: Added the crash output]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm: (47 commits)
CLKDEV: Fix clkdev return value for NULL clk case
ARM: 6891/1: prevent heap corruption in OABI semtimedop
ARM: kprobes: Tidy-up kprobes-decode.c
ARM: kprobes: Add emulation of hint instructions like NOP and WFI
ARM: kprobes: Add emulation of SBFX, UBFX, BFI and BFC instructions
ARM: kprobes: Add emulation of MOVW and MOVT instructions
ARM: kprobes: Reject probing of undefined data processing instructions
ARM: kprobes: Remove redundant code in space_1111
ARM: kprobes: Fix emulation of PLD instructions
ARM: kprobes: Reject probing of SETEND instructions
ARM: kprobes: Consolidate stub decoding functions
ARM: kprobes: Reject probing of all coprocessor instructions
ARM: kprobes: Fix emulation of USAD8 instructions
ARM: kprobes: Fix emulation of SMUAD, SMUSD and SMMUL instructions
ARM: kprobes: Fix emulation of SXTB16, SXTB, SXTH, UXTB16, UXTB and UXTH instructions
ARM: kprobes: Reject probing of undefined media instructions
ARM: kprobes: Add emulation of RBIT instruction
ARM: kprobes: Reject probing of LDRB instructions which load PC
ARM: kprobes: Fix emulation of LDRD and STRD instructions
ARM: kprobes: Reject probing of LDR/STR instructions which update PC unpredictably
...
numa_cleanup_meminfo() trims each memblk between low (0) and
high (max_pfn) limits and discards empty ones. However, the
emptiness detection incorrectly used equality test. If the
start of a memblk is higher than max_pfn, it is empty but fails
the equality test and doesn't get discarded.
The condition triggers when max_pfn is lower than start of a
NUMA node and results in memory misconfiguration - leading to
WARN_ON()s and other funnies. The bug was discovered in devel
branch where 32bit too uses this code path for NUMA init. If a
node is above the addressing limit, max_pfn ends up lower than
the node triggering this problem.
The failure hasn't been observed on x86-64 but is still possible
with broken hardware e820/NUMA info. As the fix is very low
risk, it would be better to apply it even for 64bit.
Fix it by using >= instead of ==.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
[ Extracted the actual fix from the original patch and rewrote patch description. ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/20110501171204.GO29280@htj.dyndns.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Older AMD K8 processors (Revisions A-E) are affected by erratum
400 (APIC timer interrupts don't occur in C states greater than
C1). This, for example, means that X86_FEATURE_ARAT flag should
not be set for these parts.
This addresses regression introduced by commit
b87cf80af3 ("x86, AMD: Set ARAT
feature on AMD processors") where the system may become
unresponsive until external interrupt (such as keyboard input)
occurs. This results, for example, in time not being reported
correctly, lack of progress on the system and other lockups.
Reported-by: Joerg-Volker Peetz <jvpeetz@web.de>
Tested-by: Joerg-Volker Peetz <jvpeetz@web.de>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Boris Ostrovsky <Boris.Ostrovsky@amd.com>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1304113663-6586-1-git-send-email-ostr@amd64.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf, x86, nmi: Move LVT un-masking into irq handlers
perf events, x86: Work around the Nehalem AAJ80 erratum
perf, x86: Fix BTS condition
ftrace: Build without frame pointers on Microblaze
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: ce4100: Configure IOAPIC pins for USB and SATA to level type
x86: devicetree: Configure IOAPIC pin only once
x86, setup: When probing memory with e801, use ax/bx as a pair
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
OMAP3+: voltage: remove initial voltage
OMAP4: Intialize IVA Device in addition to DSP device.
omap: rx51: mark reserved memory earlier
OMAP3: l3: fix for "irq 10: nobody cared" message
arm: omap2: enable smc instruction for sleep34xx
OMAP2/3: hwmod: fix gpio-reset timeouts seen during bootup.
OMAP3: PM: Do not rely on ROM code to restore CM_AUTOIDLE_PLL.AUTO_PERIPH_DPLL
OMAP2+: PM: Fix the saving of CM_AUTOIDLE_PLL register on scratchpad area
OMAP4: clock data: Change DSS clock aliases
OMAP2+: hwmod data: Fix wrong dma_system end address
When CONFIG_OABI_COMPAT is set, the wrapper for semtimedop does not
bound the nsops argument. A sufficiently large value will cause an
integer overflow in allocation size, followed by copying too much data
into the allocated buffer. Fix this by restricting nsops to SEMOPM.
Untested.
Cc: stable@kernel.org
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Extend the Intel Westmere PMU driver with definitions for generic front-end and
back-end stall events.
( These are only approximations. )
Reported-by: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n008io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Extend the Intel and AMD event definitions with generic front-end and
back-end stall events.
( These are only approximations - suggestions are welcome for better events. )
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n001io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add two generic hardware events: front-end and back-end stalled cycles.
These events measure conditions when the CPU is executing code but its
capabilities are not fully utilized. Understanding such situations and
analyzing them is an important sub-task of code optimization workflows.
Both events limit performance: most front end stalls tend to be caused
by branch misprediction or instruction fetch cachemisses, backend
stalls can be caused by various resource shortages or inefficient
instruction scheduling.
Front-end stalls are the more important ones: code cannot run fast
if the instruction stream is not being kept up.
An over-utilized back-end can cause front-end stalls and thus
has to be kept an eye on as well.
The exact composition is very program logic and instruction mix
dependent.
We use the terms 'stall', 'front-end' and 'back-end' loosely and
try to use the best available events from specific CPUs that
approximate these concepts.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n000io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
pfault, dasd diag and virtio all use the same external interrupt number.
The respective interrupt handlers decide by the subcode if they are
meant to handle the interrupt.
Counting is currently done before looking at the subcode which means
each handler counts an interrupt even if it is not handling it.
Fix this by moving the kstat code after the code which looks at the
subcode.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
- Remove coding standard violations reported by checkpatch.pl
- Delete comment about handling of conditional branches which is no
longer true.
- Delete comment at end of file which lists all ARM instructions. This
duplicates data available in the ARM ARM and seems like an
unnecessary maintenance burden to keep this up to date and accurate.
Signed-off-by: Jon Medhurst <tixy@yxit.co.uk>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Being able to probe NOP instructions is useful for hard-coding probeable
locations and is used by the kprobes test code.
Signed-off-by: Jon Medhurst <tixy@yxit.co.uk>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
These bit field manipulation instructions occur several thousand
times in an ARMv7 kernel.
Signed-off-by: Jon Medhurst <tixy@yxit.co.uk>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
The MOVW and MOVT instructions account for approximately 7% of all
instructions in a ARMv7 kernel as GCC uses them instead of a literal
pool.
Signed-off-by: Jon Medhurst <tixy@yxit.co.uk>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
The instruction decoding in space_cccc_000x needs to reject probing of
instructions with undefined patterns as they may in future become
defined and then emulated faultily - as has already happened with the
SMC instruction.
This fix is achieved by testing for the instruction patterns we want to
probe and making the the default fall-through paths reject probes. This
also allows us to remove some explicit tests for instructions that we
wish to reject, as that is now the default action.
Signed-off-by: Jon Medhurst <tixy@yxit.co.uk>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
The tests to explicitly reject probing CPS, RFE and SRS instructions
are redundant as the default case is now to reject undecoded patterns.
Signed-off-by: Jon Medhurst <tixy@yxit.co.uk>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>