Discover number of counters for all family 6 models even when not
in arch perfmon mode.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Newer Intel CPUs (Core1+) have support for architectural
events described in CPUID 0xA. See the IA32 SDM Vol3b.18 for details.
The advantage of this is that it can be done without knowing about
the specific CPU, because the CPU describes by itself what
performance events are supported. This is only a fallback
because only a limited set of 6 events are supported.
This allows to do profiling on Nehalem and on Atom systems
(later not tested)
This patch implements support for that in oprofile's Intel
Family 6 profiling module. It also has the advantage of supporting
an arbitary number of events now as reported by the CPU.
Also allow arbitary counter widths >32bit while we're at it.
Requires a patched oprofile userland to support the new
architecture.
v2: update for latest oprofile tree
remove force_arch_perfmon
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This essentially reverts Linus' earlier 4b9f12a377
commit. Nehalem is not core_2, so it shouldn't be reported as such.
However with the earlier arch perfmon patch it will fall back to
arch perfmon mode now, so there is no need to fake it as core_2.
The only drawback is that Linus will need to patch the arch perfmon
support into his oprofile binary now, but I think he can do that.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Vegard Nossum reported oprofile + hibernation problems:
> Now some warnings:
>
> ------------[ cut here ]------------
> WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/smp.c:328 s
> mp_call_function_mask+0x194/0x1a0()
The usual problem: the suspend function when interrupts are
already disabled calls smp_call_function which is not allowed with
interrupt off. But at this point all the other CPUs should be already
down anyways, so it should be enough to just drop that.
This patch should fix that problem at least by fixing cpu hotplug&
suspend support.
[ mingo@elte.hu: fixed 5 coding style errors. ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch introduces multiplexing support for the Oprofile kernel
module. It basically adds a new function pointer in oprofile_operator
allowing each architecture to supply its callback to switch between
different sets of event when the timer expires. Userspace tools can
modify the time slice through /dev/oprofile/time_slice.
It also modifies the number of counters exposed to the userspace through
/dev/oprofile. For example, the number of counters for AMD CPUs are
changed to 32 and multiplexed in the sets of 4.
Signed-off-by: Jason Yeh <jason.yeh@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: oprofile-list <oprofile-list@lists.sourceforge.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patchset supports the new profiling hardware available in the
latest AMD CPUs in the oProfile driver.
Signed-off-by: Barry Kasindorf <barry.kasindorf@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: oprofile-list <oprofile-list@lists.sourceforge.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
These functions contain code for all AMD CPUs. The new names fit
better.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: oprofile-list <oprofile-list@lists.sourceforge.net>
Cc: Barry Kasindorf <barry.kasindorf@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch implements model specific OProfile init/exit functions for
x86 CPUs. Though there is more rework needed at the initialization
code, this new introduced functions allow it to keep model specific
code in the corresponding op_model_*.c files.
The function interface is the same as for oprofile_arch_init/exit().
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: oprofile-list <oprofile-list@lists.sourceforge.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch add support for AMD Family 11h CPUs.
Signed-off-by: Barry Kasindorf <barry.kasindorf@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: oprofile-list <oprofile-list@lists.sourceforge.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
..otherwise oprofile will fall back on that poor timer interrupt.
Also replace the unreadable chain of if-statements with a "switch()"
statement instead. It generates better code, and is a lot clearer.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's not even passed on to smp_call_function() anymore, since that
was removed. So kill it.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
fix:
BUG: using smp_processor_id() in preemptible [00000000] code: oprofiled/27301
caller is nmi_shutdown+0x11/0x60
Pid: 27301, comm: oprofiled Not tainted 2.6.26-rc7 #25
[<c028a90d>] debug_smp_processor_id+0xbd/0xc0
[<c045fba1>] nmi_shutdown+0x11/0x60
[<c045dd4a>] oprofile_shutdown+0x2a/0x60
Note that we don't need this for the other functions, since they are all
called with on_each_cpu() (which disables preemption for us anyway).
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Philippe Elie <phil.el@wanadoo.fr>
Cc: oprofile-list@lists.sf.net
Cc: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Change the following arrays sized by NR_CPUS to be PERCPU variables:
static struct op_msrs cpu_msrs[NR_CPUS];
static unsigned long saved_lvtpc[NR_CPUS];
Also some minor complaints from checkpatch.pl fixed.
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git
All changes were transparent except for:
static void nmi_shutdown(void)
{
+ struct op_msrs *msrs = &__get_cpu_var(cpu_msrs);
nmi_enabled = 0;
on_each_cpu(nmi_cpu_shutdown, NULL, 0, 1);
unregister_die_notifier(&profile_exceptions_nb);
- model->shutdown(cpu_msrs);
+ model->shutdown(msrs);
free_msrs();
}
The existing code passed a reference to cpu 0's instance of struct op_msrs
to model->shutdown, whilst the other functions are passed a reference to
<this cpu's> instance of a struct op_msrs. This seemed to be a bug to me
even though as long as cpu 0 and <this cpu> are of the same type it would
have the same effect...?
Cc: Philippe Elie <phil.el@wanadoo.fr>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The patch fixes 33 errors and a few warnings reported by checkpatch.pl
arch/x86/oprofile/op_model_athlon.o:
text data bss dec hex filename
1691 0 32 1723 6bb op_model_athlon.o.before
1691 0 32 1723 6bb op_model_athlon.o.after
md5:
c354bc2d7140e1e626c03390eddaa0a6 op_model_athlon.o.before.asm
c354bc2d7140e1e626c03390eddaa0a6 op_model_athlon.o.after.asm
Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Choose a less generic name for such a special case. Add
a comment explaining the odd use in X86_32.
Change the one user of stack_pointer.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Right now, we take the stack pointer early during the backtrace path, but
only calculate bp several functions deep later, making it hard to reconcile
the stack and bp backtraces (as well as showing several internal backtrace
functions on the stack with bp based backtracing).
This patch moves the bp taking to the same place we take the stack pointer;
sadly this ripples through several layers of the back tracing stack,
but it's not all that bad in the end I hope.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
For enhancing the 32 bit EBP based backtracer, I need the capability
for the backtracer to tell it's customer that an entry is either
reliable or unreliable, and the backtrace printing code then needs to
print the unreliable ones slightly different.
This patch adds the basic capability, the next patch will add a user
of this capability.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We have a lot of code which differs only by the naming of specific
members of structures that contain registers. In order to enable
additional unifications, this patch drops the e- or r- size prefix
from the register names in struct pt_regs, and drops the x- prefixes
for segment registers on the 32-bit side.
This patch also performs the equivalent renames in some additional
places that might be candidates for unification in the future.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
All kobjects require a dynamically allocated name now. We no longer
need to keep track if the name is statically assigned, we can just
unconditionally free() all kobject names on cleanup.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The latest Intel processors (the 45nm ones) have a model number of 23
(old ones had 15); they're otherwise compatible on the oprofile side.
This patch adds the new model number to the oprofile code.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch is for controlling the upper 32bits of the event ctrl msrs.
This includes the upper 4 bits of the event select and the Guest Only and
Host Only bits
This patch is necessary to make Event Based Profiling work reliably on a
Family 10h processor
[akpm@linux-foundation.org: checkpatch.pl fixes]
Signed-off-by: Barry Kasindorf <barry.kasindorf@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It seems commit 09cadedbdc was incomplete
due to a clash with the x86 architecture merge.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (74 commits)
fix do_sys_open() prototype
sysfs: trivial: fix sysfs_create_file kerneldoc spelling mistake
Documentation: Fix typo in SubmitChecklist.
Typo: depricated -> deprecated
Add missing profile=kvm option to Documentation/kernel-parameters.txt
fix typo about TBI in e1000 comment
proc.txt: Add /proc/stat field
small documentation fixes
Fix compiler warning in smount example program from sharedsubtree.txt
docs/sysfs: add missing word to sysfs attribute explanation
documentation/ext3: grammar fixes
Documentation/java.txt: typo and grammar fixes
Documentation/filesystems/vfs.txt: typo fix
include/asm-*/system.h: remove unused set_rmb(), set_wmb() macros
trivial copy_data_pages() tidy up
Fix typo in arch/x86/kernel/tsc_32.c
file link fix for Pegasus USB net driver help
remove unused return within void return function
Typo fixes retrun -> return
x86 hpet.h: remove broken links
...
This patch defines frame_pointer() and stack_pointer() similar to the
already defined instruction_pointer(). Thus the oprofile code can be
written in a more readable fashion.
[ tglx: arch/x86 adaptation ]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch improves oprofile callgraphs for i386/x86_64. The old
backtracing code was unable to produce kernel backtraces if the
kernel wasn't compiled with framepointers. The code now uses
dump_trace().
[ tglx: arch/x86 adaptation ]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Convert cpu_sibling_map from a static array sized by NR_CPUS to a per_cpu
variable. This saves sizeof(cpumask_t) * NR unused cpus. Access is mostly
from startup and CPU HOTPLUG functions.
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>