Here's a 63-bit implementation of shed_clock() for PXA2xx. The actual
period depends on the value of CLOCK_TICK_RATE and whether or not
reduced scaling factors were provided for it.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
period
This provides a 63 bit clock counter guaranteed to be monotonic over a
period of 35583 days instead of a clock wrap every 179 seconds, as long
as sched_clock() is called at least once every 89 seconds. This should
not be a problem in practice, although a kernel timer could be scheduled
every 80 seconds for example simply to call sched_clock() making sure
top bits are always synchronized if need be.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This provides a 63 bit clock counter guaranteed to be monotonic over a
period of 370 days instead of a clock wrap every 19.4 minutes, as long
as sched_clock() is called at least once every 9.7 minutes which
shouldn't be a problem in practice.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add support for the Cirrus Logic EDB9302A Evaluation Board. Confirmed
to work by Chase Douglas.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Do not assume that the ARM GCC toolchain defaults to building for the
32-bit ARM ISA (-marm) case. Instead, explicitly select -marm in CFLAGS
since the toolchain default can be for the 16-bit Thumb ISA (-mthumb) in
some odd/rare cases.
Signed-off-by: George G. Davis <gdavis@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
- remove the write-only local variable "bandwidth"
- don't set "max_cache_size" in the (cachesize < 0) case:
that's already handled in kernel/sched.c:measure_migration_cost()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
The .eh_frame section contents is never written to, so it can as well
benefit from CONFIG_DEBUG_RODATA.
Diff-ed against firstfloor tree.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
We don't need to setup _irq_regs in smp_xxx_interrupt (except apic timer).
These handlers run with irqs disabled and do not call functions which need
"struct pt_regs".
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Acked-By: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
setup_IO_APIC_irqs could fail to get vector for some device when you have too
many devices, because at that time only boot cpu is online. So check vector
for irq in setup_ioapic_dest and call setup_IO_APIC_irq to make sure IO-APIC
irq-routing table is initialized.
Also seperate setup_IO_APIC_irq from setup_IO_APIC_irqs.
Signed-off-by: Yinghai Lu <yinghai.lu@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Remove unused variable in msr_write().
Reported by D Binderman <dcb314@hotmail.com>.
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: David Rientjes <rientjes@cs.washington.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
- Remove "Disabling IOMMU" message because it confuses people
- Clarify that the GART IOMMU is refered to in other message
Signed-off-by: Andi Kleen <ak@suse.de>
Idle callbacks has some races when enter_idle() sets isidle and subsequent
interrupts that can happen on that CPU, before CPU goes to idle. Due to this,
an IDLE_END can get called before IDLE_START. To avoid these races, disable
interrupts before enter_idle and make sure that all idle routines do not
enable interrupts before entering idle.
Note that poll_idle() still has a this race as it has to enable interrupts
before going to idle. But, all other idle routines have the race fixed.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
This was added as a workaround for the fallback unwinder not supporting
unaligned stack pointers properly. But now it was fixed to do that,
so it's not needed anymore
Cc: mingo@elte.hu
Signed-off-by: Andi Kleen <ak@suse.de>
Tighten the requirements on both input to and output from the Dwarf2
unwinder.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
We're already well protected against module unloads because module
unload uses stop_machine(). The only exception is NMIs, but other
users already risk lockless accesses here.
This avoids some hackery in lockdep and also a potential deadlock
This matches what i386 does.
Signed-off-by: Andi Kleen <ak@suse.de>
On the Core2 cpus, the rdtsc instruction is not serializing (as defined
in the architecture reference since rdtsc exists) and due to the deep
speculation of these cores, it's possible that you can observe time go
backwards between cores due to this speculation. Since the kernel
already deals with this with the SYNC_RDTSC flag, the solution is
simple, only assume that the instruction is serializing on family 15...
The price one pays for this is a slightly slower gettimeofday (by a
dozen or two cycles), but that increase is quite small to pay for a
really-going-forward tsc counter.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
There is no guarantee that two RDTSCs in a row are monotonic,
so don't assume it on single core AMD systems.
This will make gettimeofday slower again
Signed-off-by: Andi Kleen <ak@suse.de>
-mregparm=3 has been enabled by default for some time on i386, and AFAIK
there aren't any problems with it left.
This patch removes the REGPARM config option and sets -mregparm=3
unconditionally.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Sometimes the soft watchdog fires after we're done oopsing.
See http://projects.info-pull.com/mokb/MOKB-25-11-2006.html for an example.
AK: changed to touch_nmi_watchdog()
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Make mce_remove_device() clean up the kobject in per_cpu(device_mce, cpu)
after it has been unregistered.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@suse.de>
interrupt array is referred for idt vectors instead of NR_IRQS, so change size
to NR_VECTORS - FIRST_EXTERNAL_VECTOR. Also change to static.
Signed-off-by: Yinghai Lu <yinghai@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
idt_table is in the .bss section, so clear_bss need to called at first
Signed-off-by: Yinghai Lu <yinghai.lu@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Add sysctl for kstack_depth_to_print. This lets users change
the amount of raw stack data printed in dump_stack() without
having to reboot.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
We do the exact same printk about a dozen lines above
with no intermediate printk's.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
When using memmap kernel parameter in EFI boot we should also add to memory map
memory regions of runtime services to enable their mapping later.
AK: merged and cleaned up the patch
Signed-off-by: Artiom Myaskouvskey <artiom.myaskouvskey@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Read/Write APIC_LVTPC and APIC_LVTTHMR only,
if get_maxlvt() returns certain values.
This is done like everywhere else in i386/kernel/apic.c,
so I guess its correct.
Suspends/Resumes to disk fine and eleminates an smp_error_interrupt()
here on a K8.
AK: ported to x86-64 too
Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
Signed-off-by: Andi Kleen <ak@suse.de>
There are two consumers of apic=: the apic debug level and the low
level generic architecture code. early_param would warn when the
low level code rejected "debug". Avoid this.
Signed-off-by: Andi Kleen <ak@suse.de>
Here is a small patch for i386 which adds a cpufeature flag and
detection code for Intel's Branch Trace Store (BTS) feature. This
feature can be found on Intel P4 and Core 2 processors among others.
It can also be used by perfmon.
changelog:
- add CPU_FEATURE_BTS
- add Branch Trace Store detection
signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Here is a small patch for x86-64 which adds a cpufeature flag and
detection code for Intel's Branch Trace Store (BTS) feature. This
feature can be found on Intel P4 and Core 2 processors among others.
It can also be used by perfmon.
changelog:
- add CPU_FEATURE_BTS
- add Branch Trace Store detection
signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
irq_vector[] can now become static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
The Coverity checker noted that bad things might happen if
find_isa_irq_apic() returned -1.
[akpm@osdl.org: add debugging checks]
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Function efi_get_time called not only during init kernel phase but also
during suspend (from get_cmos_time).
When it is called from get_cmos_time the corresponding runtime service
should be called in virtual and not in physical mode.
Signed-off-by: Artiom Myaskouvskey <artiom.myaskouvskey@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: "Narayanan, Chandramouli" <chandramouli.narayanan@intel.com>
Cc: "Jiossy, Rami" <rami.jiossy@intel.com>
Cc: "Satt, Shai" <shai.satt@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Move the irqbalance quirks for E7320/E7520/E7525(Errata 23 in
http://download.intel.com/design/chipsets/specupdt/30304203.pdf) to early
quirks.
And add a PCI quirk for these platforms to check(which happens very late
during the boot) if the APIC routing is indeed set to default flat mode.
This fixes the breakage(in x86_64) of this quirk due to cpu hotplug which
selects physical mode instead of the logical flat(as needed for this errata
workaround).
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Change the 'no_control' field in the cpu struct to a more positive
and better term 'hotpluggable'. And change(/cleanup) the logic accordingly.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Add 'enable_cpu_hotplug' flag and when cleared, the hotplug control file
("online") will not be added under /sys/devices/system/cpu/cpuX/
Next patch doing PCI quirks will use this.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Mechanism of selecting physical mode in genapic when cpu hotplug is enabled on
x86_64, broke the quirk(quirk_intel_irqbalance()) introduced for working
around the transposing interrupt message errata in E7520/E7320/E7525 (revision
ID 0x9 and below. errata #23 in
http://download.intel.com/design/chipsets/specupdt/30304203.pdf).
This errata requires the mode to be in logical flat, so that interrupts can be
directed to more than one cpu(and thus use hardware IRQ balancing enabled by
BIOS on these platforms).
Following four patches fixes this by moving the quirk to early quirk and
forcing the x86_64 genapic selection to logical flat on these platforms.
Thanks to Shaohua for pointing out the breakage.
This patch:
Add write_pci_config_byte() to direct PCI access routines
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
o Convert more absolute symbols to section relative to keep the theme in
vmlinux.lds.S file and to avoid problem if kernel is relocated.
o Also put a message so that in future people can be aware of it and
avoid introducing absolute symbols.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Make the needlessly global alloc_gdt() static.
(against) pda-percpu-init
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Until not so long ago, there were system log messages pointing to
inconsistent MTRR setup of the video frame buffer caused by the way vesafb
and X worked. While vesafb was fixed meanwhile, I believe fixing it there
only hides a shortcoming in the MTRR code itself, in that that code is not
symmetric with respect to the ordering of attempts to set up two (or more)
regions where one contains the other. In the current shape, it permits
only setting up sub-regions of pre-exisiting ones. The patch below makes
this symmetric.
While working on that I noticed a few more inconsistencies in that code,
namely
- use of 'unsigned int' for sizes in many, but not all places (the patch
is converting this to use 'unsigned long' everywhere, which specifically
might be necessary for x86-64 once a processor supporting more than 44
physical address bits would become available)
- the code to correct inconsistent settings during secondary processor
startup tried (if necessary) to correct, among other things, the value
in IA32_MTRR_DEF_TYPE, however the newly computed value would never get
used (i.e. stored in the respective MSR)
- the generic range validation code checked that the end of the
to-be-added range would be above 1MB; the value checked should have been
the start of the range
- when contained regions are detected, previously this was allowed only
when the old region was uncacheable; this can be symmetric (i.e. the new
region can also be uncacheable) and even further as per Intel's
documentation write-trough and write-back for either region is also
compatible with the respective opposite in the other
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Just like on x86-64, don't touch foreign CPUs' memory if the watchdog
isn't enabled at all.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
While not strictly required with the current code (as the upper half of
page table entries generated by __set_fixmap() cannot be non-zero due
to the second parameter of this function being 'unsigned long'), the
use of set_pte() in __set_fixmap() in the context of clear_fixmap() is
still improper with CONFIG_X86_PAE (see the respective comment in
include/asm-i386/pgtable-3level.h) and would turn into a bug if that
second parameter ever gets changed to a 64-bit type.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
The final line of /proc/<pid>/maps on x86_64 for native 64-bit
tasks shows an incorrect ending address and incorrect permissions. There
is only a single page mapped in this vsyscall region, and it is accessible
for both read and execute.
The patch below fixes this. (Since 32-bit-compat tasks have a real vma
with correct perms/range, no change is necessary for that scenario.)
Before the patch, a "cat /proc/self/maps | tail -1" shows this:
ffffffffff600000-ffffffffffe00000 ---p 00000000 [...]
After the patch, this is the output:
ffffffffff600000-ffffffffff601000 r-xp 00000000 [...]
Signed-off-by: Ernie Petrides <petrides@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
gcc doesn't support -mtune=core2 yet, but will be soon. Use -mtune=generic or -mtune=i686
as fallback
TBD need benchmarking for INTEL_USERCOPY etc. So far I used the same defaults as MPENTIUMM
Signed-off-by: Andi Kleen <ak@suse.de>
Add an option to compile for Intel's Core 2
The Kconfig help is a mouthful due to the inventiveness of Intel's
product naming department.
Mainly for the 64bit cache line sizes because gcc doesn't support
optimizing for core2 yet. However it will and then the kernel
should be ready by passing the right option
Also fix the old MPSC help text to confirm better to reality.
Signed-off-by: Andi Kleen <ak@suse.de>
Add a way to disable the timer IRQ routing check via a boot option. The
VMI timer code uses this to avoid triggering the pester Mingo code, which
probes for some very unusual and broken motherboard routings. It fires
100% of the time when using a paravirtual delay mechanism instead of using
a realtime delay, since there is no elapsed real time, and the 4 timer IRQs
have not yet been delivered.
In addition, it is entirely possible, though improbable, that this bug
could surface on real hardware which picks a particularly bad time to enter
SMM mode, causing a long latency during one of the timer IRQs.
While here, make check_timer be __init.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
[chrisw: use no_timer_check to bring inline with x86_64 as per Andi's request]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
BIOS ROM areas may not be mapped into the guest address space, so be careful
when touching those addresses to make sure they appear to be mapped.
[akpm@osdl.org: fix unused var warning]
AK: Changed __get_user to probe_kernel_address
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Add the three bare TLB accessor functions to paravirt-ops. Most amusingly,
flush_tlb is redefined on SMP, so I can't call the paravirt op flush_tlb.
Instead, I chose to indicate the actual flush type, kernel (global) vs. user
(non-global). Global in this sense means using the global bit in the page
table entry, which makes TLB entries persistent across CR3 reloads, not
global as in the SMP sense of invoking remote shootdowns, so the term is
confusingly overloaded.
AK: folded in fix from Zach for PAE compilation
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Add APIC accessors to paravirt-ops. Unfortunately, we need two write
functions, as some older broken hardware requires workarounds for
Pentium APIC errata - this is the purpose of apic_write_atomic.
AK: replaced __inline with inline
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Two legacy power management modes are much easier to just explicitly disable
when running in paravirtualized mode - neither APM nor PnP is still relevant.
The status of ACPI is still debatable, and noacpi is still a common enough
boot parameter that it is not necessary to explicitly disable ACPI.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Allow selected bug checks to be skipped by paravirt kernels. The two most
important are the F00F workaround (which is either done by the hypervisor,
or not required), and the 'hlt' instruction check, which can break under
some hypervisors.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
1) Each hypervisor writes a probe function to detect whether we are
running under that hypervisor. paravirt_probe() registers this
function.
2) If vmlinux is booted with ring != 0, we call all the probe
functions (with registers except %esp intact) in link order: the
winner will not return.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Both lhype and Xen want to call the core of the x86 cpu detect code before
calling start_kernel.
(extracted from larger patch)
AK: folded in start_kernel header patch
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
It turns out that the most called ops, by several orders of magnitude,
are the interrupt manipulation ops. These are obvious candidates for
patching, so mark them up and create infrastructure for it.
The method used is that the ops structure has a patch function, which
is called for each place which needs to be patched: this returns a
number of instructions (the rest are NOP-padded).
Usually we can spare a register (%eax) for the binary patched code to
use, but in a couple of critical places in entry.S we can't: we make
the clobbers explicit at the call site, and manually clobber the
allowed registers in debug mode as an extra check.
And:
Don't abuse CONFIG_DEBUG_KERNEL, add CONFIG_DEBUG_PARAVIRT.
And:
AK: Fix warnings in x86-64 alternative.c build
And:
AK: Fix compilation with defconfig
And:
^From: Andrew Morton <akpm@osdl.org>
Some binutlises still like to emit references to __stop_parainstructions and
__start_parainstructions.
And:
AK: Fix warnings about unused variables when PARAVIRT is disabled.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Create a paravirt.h header for all the critical operations which need to be
replaced with hypervisor calls, and include that instead of defining native
operations, when CONFIG_PARAVIRT.
This patch does the dumbest possible replacement of paravirtualized
instructions: calls through a "paravirt_ops" structure. Currently these are
function implementations of native hardware: hypervisors will override the ops
structure with their own variants.
All the pv-ops functions are declared "fastcall" so that a specific
register-based ABI is used, to make inlining assember easier.
And:
+From: Andy Whitcroft <apw@shadowen.org>
The paravirt ops introduce a 'weak' attribute onto memory_setup().
Code ordering leads to the following warnings on x86:
arch/i386/kernel/setup.c:651: warning: weak declaration of
`memory_setup' after first use results in unspecified behavior
Move memory_setup() to avoid this.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
IOPL is implicitly saved and restored on task switch,
so explicit check is no longer needed.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Port two patches from i386 to x86_64 delay.c to make sure all rounding is done
upward instead of downward.
There is no sign in commit messages that the mismatch was done on purpose, and
"delay() guarantees sleeping at least for the specified time" is still a valid
rule IMHO.
The original x86 patches are both from pre-GIT era, i.e.:
"[PATCH] round up in __udelay()" in commit
54c7e1f5cc6771ff644d7bc21a2b829308bd126f
"[PATCH] add 1 in __const_udelay()" in commit
42c77a9801b8877d8b90f65f75db758822a0bccc
(both commits are from converted BK repository to x86_64).
AK: fixed gcc warning
linux/arch/x86_64/lib/delay.c:43: warning: suggest parentheses around + or - inside shift
(did this actually work?)
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andi Kleen <ak@suse.de>
This patch makes it possible to compile Calgary in but not use it by
default. In this mode, use 'iommu=calgary' to activate it.
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Andi Kleen <ak@suse.de>
This patch cleans up the previous "Use BIOS supplied BBAR information"
patch. Mostly stylistic clenaups, but also check for ioremap failure
when we ioremap the BBAR rather than when trying to use it.
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Laurent Vivier <Laurent.Vivier@bull.net>
Find the BBAR register address of each Calgary using the "Extended
BIOS Data Area" rather than calculating it ourselves. Also get the bus
topology (what PHB each bus is on) from Calgary rather than
calculating it ourselves.
This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=7407.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Andi Kleen <ak@suse.de>
The recent change to make x86_64 support i386 binaries compiled
with -mregparm=3 only covered signal handlers without SA_SIGINFO.
(the 3-arg "real-time" ones)
To be compatible with i386, both types should be supported.
Signed-off-by: Albert Cahalan <acahalan@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Instead of adding all kinds of more quirks try various timer
routing variants in check_timer.
In particular this tries to handle quirks from:
- Nvidia NF2-4 reference BIOS: wrong timer override
- Asus: Wrong timer override but no HPET table
- ATI: require timer disabled in 8259
- Some boards: require timer enabled in 8259
We just try many of the the known variants in the hopefully right order
in check_timer.
Trying pin 0/2 on Nvidia suggested by Tim Hockin.
TBD Experimental. Needs a lot of testing
Signed-off-by: Andi Kleen <ak@suse.de>
Makes the intention of the code cleaner to read and avoids
a potential deadlock on mmap_sem. Also change the types of
the arguments to not include __user because they're really
not user addresses.
Signed-off-by: Andi Kleen <ak@suse.de>
Clear the irq releated entries in irq_vector, irq_domain and vector_irq
instead of clearing irq_vector only. So when new irq is created, it
could reuse that vector. (actually is the second loop scanning from
FIRST_DEVICE_VECTOR+8). This could avoid the vectors are used up
with enough module inserting and removing
Cc: Eric W. Biedierman <ebiederm@xmission.com>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-By: Yinghai Lu <yinghai.lu@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
CLFLUSH is a lot faster than WBINVD so avoid the later if at all
possible.
Always pass the complete list of pages to other CPUs to cut down
the number of IPIs.
Minor other cleanup and sync with i386 version.
Signed-off-by: Andi Kleen <ak@suse.de>
The entry.S code at work_notifysig is surely wrong. It drops into unrelated
code if the branch to work_notifysig_v86 is taken, and CONFIG_VM86=n.
[PATCH] Make vm86 support optional
tree 9b5daef528
pushed to git Jan 8, 2006, and first appears in 2.6.16
The 'fix' here is to also compile out the vm86 test & branch when
CONFIG_VM86=n.
Signed-off-by: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Extend bzImage protocol to enable bootloaders to load a completely relocatable
bzImage. Now protected mode component of kernel is also relocatable and a
boot-loader can load the protected mode component at a differnt physical
address than 1MB. (If kernel was built with CONFIG_RELOCATABLE)
Kexec can make use of it to load this kernel at a different physical address
to capture kernel crash dumps.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
o Now CONFIG_PHYSICAL_START is being replaced with CONFIG_PHYSICAL_ALIGN.
Hardcoding the kernel physical start value creates a problem in relocatable
kernel context due to boot loader limitations. For ex, if somebody
compiles a relocatable kernel to be run from address 4MB, but this kernel
will run from location 1MB as grub loads the kernel at physical address
1MB. Kernel thinks that I am a relocatable kernel and I should run from
the address I have been loaded at. So somebody wanting to run kernel
from 4MB alignment location (for improved performance regions) can't do
that.
o Hence, Eric proposed that probably CONFIG_PHYSICAL_ALIGN will make
more sense in relocatable kernel context. At run time kernel will move
itself to a physical addr location which meets user specified alignment
restrictions.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
o Relocations generated w.r.t absolute symbols are not processed as by
definition, absolute symbols are not to be relocated. Explicitly warn
user about absolutions relocations present at compile time.
o These relocations get introduced either due to linker optimizations or
some programming oversights.
o Also create a list of symbols which have been audited to be safe and
don't emit warnings for these.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
This patch modifies the i386 kernel so that if CONFIG_RELOCATABLE is
selected it will be able to be loaded at any 4K aligned address below
1G. The technique used is to compile the decompressor with -fPIC and
modify it so the decompressor is fully relocatable. For the main
kernel relocations are generated. Resulting in a kernel that is relocatable
with no runtime overhead and no need to modify the source code.
A reserved 32bit word in the parameters has been assigned
to serve as a stack so we figure out where are running.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Print the addresses of non-absolute symbols relative to _text
so that ld will generate relocations. Allowing a relocatable
kernel to relocate them. We can't actually use the symbol names
because kallsyms includes static symbols that are not exported
from their object files.
Add the _text symbol definitions to the architectures which don't
define it otherwise linker will fail.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Defining __PHYSICAL_START and __KERNEL_START in asm-i386/page.h works but
it triggers a full kernel rebuild for the silliest of reasons. This
modifies the users to directly use CONFIG_PHYSICAL_START and linux/config.h
which prevents the full rebuild problem, which makes the code much
more maintainer and hopefully user friendly.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Currently when we are reserving the memory the kernel text
resides in we start at __PHYSICAL_START which happens to be
correct but not very obvious. In addition when we start relocating
the kernel __PHYSICAL_START is the wrong value, as it is an
absolute symbol that does not get relocated.
By starting the reservation at __pa_symbol(_text)
the code is clearer and will be correct when relocated.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Ld knows about 2 kinds of symbols, absolute and section
relative. Section relative symbols symbols change value
when a section is moved and absolute symbols do not.
Currently in the linker script we have several labels
marking the beginning and ending of sections that
are outside of sections, making them absolute symbols.
Having a mixture of absolute and section relative
symbols refereing to the same data is currently harmless
but it is confusing.
This must be done carefully as newer revs of ld do not place
symbols that appear in sections without data and instead
ld makes those symbols global :(
My ultimate goal is to build a relocatable kernel. The
safest and least intrusive technique is to generate
relocation entries so the kernel can be relocated at load
time. The only penalty would be an increase in the size
of the kernel binary. The problem is that if absolute and
relocatable symbols are not properly specified absolute symbols
will be relocated or section relative symbols won't be, which
is fatal.
The practical motivation is that when generating kernels that
will run from a reserved area for analyzing what caused
a kernel panic, it is simpler if you don't need to hard code
the physical memory location they will run at, especially
for the distributions.
[AK: and merged:]
o Also put a message so that in future people can be aware of it and
avoid introducing absolute symbols.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
On modern systems RAM errors don't cause NMIs, but it's usually
caused by PCI SERR. Mention PCI instead of RAM in the printk.
Reported by r_hayashi@ctc-g.co.jp (Ryutaro Hayashi)
Cc: r_hayashi@ctc-g.co.jp
Signed-off-by: Andi Kleen <ak@suse.de>
Resending as I believe the discussion about them established they were
correct.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Currently the idle loop has two nested loops -- one high level
in cpu_idle and in some low level idle functions another one.
Looping in the low level idle functions breaks the idle notifiers
because interrupts waking up sleep states need to execute
exit_idle() which is only in cpu_idle().
So don't do that, only loop in cpu_idle(). This only removes
code.
In some cases e.g. poll_idle the idle loop is a little longer
now because cpu_idle checks more things. I hope that isn't a problem
ACPI idle doesn't change behaviour because it never looped anyways.
Cc: len.brown@intel.com
Cc: eranian@hpl.hp.com
Signed-off-by: Andi Kleen <ak@suse.de>
Use the pcurrent field in the PDA to implement the "current" macro. This ends
up compiling down to a single instruction to get the current task.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Use the cpu_number in the PDA to implement raw_smp_processor_id. This is a
little simpler than using thread_info, though the cpu field in thread_info
cannot be removed since it is used for things other than getting the current
CPU in common code.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
sys_vm86 uses a struct kernel_vm86_regs, which is identical to pt_regs, but
adds an extra space for all the segment registers. Previously this structure
was completely independent, so changes in pt_regs had to be reflected in
kernel_vm86_regs. This changes just embeds pt_regs in kernel_vm86_regs, and
makes the appropriate changes to vm86.c to deal with the new naming.
Also, since %gs is dealt with differently in the kernel, this change adjusts
vm86.c to reflect this.
While making these changes, I also cleaned up some frankly bizarre code which
was added when auditing was added to sys_vm86.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
There are a few places where the change in struct pt_regs and the use of %gs
affect the userspace ABI. These are primarily debugging interfaces where
thread state can be inspected or extracted.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
This patch is the meat of the PDA change. This patch makes several related
changes:
1: Most significantly, %gs is now used in the kernel. This means that on
entry, the old value of %gs is saved away, and it is reloaded with
__KERNEL_PDA.
2: entry.S constructs the stack in the shape of struct pt_regs, and this
is passed around the kernel so that the process's saved register
state can be accessed.
Unfortunately struct pt_regs doesn't currently have space for %gs
(or %fs). This patch extends pt_regs to add space for gs (no space
is allocated for %fs, since it won't be used, and it would just
complicate the code in entry.S to work around the space).
3: Because %gs is now saved on the stack like %ds, %es and the integer
registers, there are a number of places where it no longer needs to
be handled specially; namely context switch, and saving/restoring the
register state in a signal context.
4: And since kernel threads run in kernel space and call normal kernel
code, they need to be created with their %gs == __KERNEL_PDA.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
When a CPU is brought up, a PDA and GDT are allocated for it. The GDT's
__KERNEL_PDA entry is pointed to the allocated PDA memory, so that all
references using this segment descriptor will refer to the PDA.
This patch rearranges CPU initialization a bit, so that the GDT/PDA are set up
as early as possible in cpu_init(). Also for secondary CPUs, GDT+PDA are
preallocated and initialized so all the secondary CPU needs to do is set up
the ldt and load %gs. This will be important once smp_processor_id() and
current use the PDA.
In all cases, the PDA is set up in head.S, before a CPU starts running C code,
so the PDA is always available.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
This patch has the basic definitions of struct i386_pda, and the segment
selector in the GDT.
asm-i386/pda.h is more or less a direct copy of asm-x86_64/pda.h. The most
interesting difference is the use of _proxy_pda, which is used to give gcc a
model for the actual memory operations on the real pda structure. No actual
reference is ever made to _proxy_pda, so it is never defined.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Use asm-offsets for the offsets of registers into the pt_regs struct, rather
than having hard-coded constants
I left the constants in the comments of entry.S because they're useful for
reference; the code in entry.S is very dependent on the layout of pt_regs,
even when using asm-offsets.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
This patch:
- makes ret_from_sys_call no longer global (all external users were
previously switched to use int_ret_from_sys_call)
- adjusts placement of a CFI_{REMEMBER,RESTORE}_STATE pair to better
fit logic flow
- eliminates an unnecessary pair of CFI_{REMEMBER,RESTORE}_STATE
- glues together function- and unwinder-wise the previously separate
system_call and int_ret_from_sys_call function fragments
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
ioremap must be balanced by an iounmap and failing to do so can result
in a memory leak.
Tested (compilation only):
- using allmodconfig
- making sure the files are compiling without any warning/error due to
new changes
Signed-off-by: Amol Lad <amol@verismonetworks.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Insert the Local APIC and IO APIC(s) into the resource tree. It allows the
APIC resources to be visible within /proc/iomem. The patch also takes into
account IO APIC(s) mapped in the PCI space by deferring the insertion until
after PCI has allocated its necessary resources.
Signed-off-by: Aaron Durbin <adurbin@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
i386 port of the sLeAZY-fpu feature. Chuck reports that this gives him a +/-
0.4% improvement on his simple benchmark
x86_64 description follows:
Right now the kernel on x86-64 has a 100% lazy fpu behavior: after *every*
context switch a trap is taken for the first FPU use to restore the FPU
context lazily. This is of course great for applications that have very
sporadic or no FPU use (since then you avoid doing the expensive save/restore
all the time). However for very frequent FPU users... you take an extra trap
every context switch.
The patch below adds a simple heuristic to this code: After 5 consecutive
context switches of FPU use, the lazy behavior is disabled and the context
gets restored every context switch. If the app indeed uses the FPU, the trap
is avoided. (the chance of the 6th time slice using FPU after the previous 5
having done so are quite high obviously).
After 256 switches, this is reset and lazy behavior is returned (until there
are 5 consecutive ones again). The reason for this is to give apps that do
longer bursts of FPU use still the lazy behavior back after some time.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Clean up the espfix code:
- Introduced PER_CPU() macro to be used from asm
- Introduced GET_DESC_BASE() macro to be used from asm
- Rewrote the fixup code in asm, as calling a C code with the altered %ss
appeared to be unsafe
- No longer altering the stack from a .fixup section
- 16bit per-cpu stack is no longer used, instead the stack segment base
is patched the way so that the high word of the kernel and user %esp
are the same.
- Added the limit-patching for the espfix segment. (Chuck Ebbert)
[jeremy@goop.org: use the x86 scaling addressing mode rather than shifting]
Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Zachary Amsden <zach@vmware.com>
Acked-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
When a spinlock lockup occurs, arrange for the NMI code to emit an all-cpu
backtrace, so we get to see which CPU is holding the lock, and where.
Cc: Andi Kleen <ak@muc.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
This patch removes the default_ldt[] array, as it has been unused since
iBCS stopped being supported. This means it is now possible to actually
set an empty LDT segment.
In order to deal with this, the set_ldt_desc/load_LDT pair has been
replaced with a single set_ldt() operation which is responsible for both
setting up the LDT descriptor in the GDT, and reloading the LDT register.
If there are no LDT entries, the LDT register is loaded with a NULL
descriptor.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Acked-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Here is a patch (used by perfmon2) to detect the presence of the Precise Event
Based Sampling (PEBS) feature for i386. The patch also adds the cpu_has_pebs
macro.
- adds X86_FEATURE_PEBS
- adds cpu_has_pebs to test for X86_FEATURE_PEBS
Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Here is a patch (used by perfmon2) to detect the presence of the
Precise Event Based Sampling (PEBS) feature for Intel 64-bit processors.
The patch also adds the cpu_has_pebs macro.
changelog:
- adds X86_FEATURE_PEBS
- adds cpu_has_pebs to test for X86_FEATURE_PEBS
Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
This just got removed on x86-64, do the same on 32bit.
It always annoyed me when this ate a line of oops output pushing
interesting stuff off the screen.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
The unwinder has some extra newlines, which eat up loads of screen
space when it spews. (See https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=137900
for a nasty example).
warning_symbol-> and warning-> already printk a newline, so don't add one
in the strings passed to them.
[AK: redone for new code]
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Fix checks that failed to realize that values are 4-kB-unit-sized (note the
format strings in this same diff context which *do* realize the unit size,
via appended "000"!). Also fix an incorrect below-1MB area check (as
gathered from Jan Beulich's unapplied patch at
http://www.ussg.iu.edu/hypermail/linux/kernel/0411.1/1378.html ) Update
mtrr_add_page() docu to make 4-kB-sized calculation more obvious.
Given several further items mentioned in Jan's patch mail, all in all MTRR
code seems surprisingly buggy, for a surprisingly long period of time (many
years). Further work/investigation would be useful.
TBD Note that my patch is pretty much UNTESTED, since I can only verify that it
TBD successfully boots my machine, but I cannot test against actual buggy
TBD hardware which would require these (formerly broken) checks. Long -mm
TBD simmering would make sense, especially since these now-working checks might
TBD turn out to have adverse effects on unaffected hardware.
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] Import updates from i386's i8259.c
[MIPS] *-berr: Header inclusions for DEC bus error handlers
[MIPS] Compile __do_IRQ() when really needed
[MIPS] genirq: use name instead of typename
[MIPS] Do not use handle_level_irq for ioasic_dma_irq_type.
[MIPS] pte_offset(dir,addr): parenthesis fix
The recent change to convert the is_enabled flag in the PCI device to an
atomic count broke the IA64 compilation.
As pcibios_disable_device is only ever called if the reference count
is zero, convert the if to a BUG_ON.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
A fixup to add missing header inclusions for bus error handlers for
DECstation system after the recent switch to get_irq_regs().
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
__do_IRQ() is needed only by irq handlers that can't use
default handlers defined in kernel/irq/chip.c.
For others platforms there's no need to compile this function
since it won't be used. For those platforms this patch defines
GENERIC_HARDIRQS_NO__DO_IRQ symbol which is used exactly for
this purpose.
Futhermore for platforms which do not use __do_IRQ(), end()
method which is part of the 'irq_chip' structure is not used.
This patch simply removes this method in this case.
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The "typename" field was obsoleted by the "name" field.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6: (43 commits)
sh: sh775x/titan fixes for irq header changes.
sh: update r7780rp defconfig.
sh: compile fixes for header cleanup.
sh: Fixup pte_mkhuge() build failure.
sh: set KBUILD_IMAGE to something sensible.
sh: show held locks in stack trace with lockdep.
sh: platform_pata support for R7780RP
sh: stacktrace/lockdep/irqflags tracing support.
sh: Fixup movli.l/movco.l atomic ops for gcc4.
sh: dyntick infrastructure.
sh: Clock framework tidying.
sh: Turn off IRQs around get_timer_offset() calls.
sh: Get the PGD right in oops case with 64-bit PTEs.
sh: Fix store queue bitmap end.
sh: More flexible + SH7780 earlyprintk SCIF support.
sh: Fixup various PAGE_SIZE == 4096 assumptions.
sh: Fixup 4K irq stacks.
sh: dma-api channel capability extensions.
sh: Drop name overload in dma-sh.
sh: Make dma-isa depend on ISA_DMA_API.
...
* git://git.infradead.org/users/dhowells/workq-2.6:
Actually update the fixed up compile failures.
WorkQueue: Fix up arch-specific work items where possible
WorkStruct: make allyesconfig
WorkStruct: Pass the work_struct pointer instead of context data
WorkStruct: Merge the pending bit into the wq_data pointer
WorkStruct: Typedef the work function prototype
WorkStruct: Separate delayable and non-delayable events.
Adds support for RTCs (through genrtc) for M68KNOMMU.
Board-specific code will have to link the appropriate RTC driver to the
mach_hwclk callback, at minimum.
This patch switches the 68360 code over to using rtc_time.
Signed-off-by: Gavin Lambert <gavinl@compacsort.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The 523x timer TRR register is a full 32bits, the older register (on
other ColdFire parts) was only 16 bits. Use the right type of
__raw_read when accessing it.
Problem found by Yaroslav Vinogradov <yaroslav.vinogradov@freescale.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The following moves the creation of IPR interupts into setup-7750.c
and updates a few other things to make it all work after the "Drop
CPU subtype IRQ headers" commit. It boots and runs fine on my titan
board.
- adds an ipr_idx to the ipr_data and uses a function in the subtype
code to calculate the address of the IPR registers
- adds a function to enable individual interrupt mode for externals
in the subtype code and calls that from the titan board code
instead of doing it directly.
- I changed the shift in the ipr_data to be the actual # of bits to
shift, instead of the numnber / 4 - made it easier to match with
the manual.
Signed-off-by: Jamie Lenehan <lenehan@twibble.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Since some header inclusion paths were cleaned up, compilation
broke. Add in the headers we need directly to build again.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds a platform device for the directly connected
CF interface on R7780RP boards, for use with the
pata_platform libata driver.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds basic NO_IDLE_HZ support to the SH timer API so timers
are able to wire it up. Taken from the ARM version, as it fit in
to our API with very few changes needed.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This syncs up the SH clock framework with the linux/clk.h API,
for which there were only some minor changes required, namely
the clk_get() dev_id and subsequent callsites.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Since all of the sys_timer sources currently do this on their own
within the ->get_offset() path, it's more sensible to just have
the caller take care of it when grabbing xtime_lock. Incidentally,
this is more in line with what others (ie, ARM) are doing already.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Previously this was using a static pgd shift in the reporting
code, simply flip this to PGDIR_SHIFT which does the right
thing depending on varying PTE magnitudes on the SH-X2 MMU.
While we're at it, and since it's been recently added, use
get_TTB() for fetching the TTB, rather than the open coded
instructions.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The end of the store queue bitmap is miscalculated when searching
for a free range in sq_remap(), missing the PAGE_SHIFT shift that's
done in sq_api_init(). This runs in to workloads where we can scan
beyond the end of the bitmap.
Spotted by Paul Jackson:
http://marc.theaimsgroup.com/?l=linux-kernel&m=116493191224097&w
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This makes the early printk support somewhat more flexible,
moving the port definition to a config option, and making the
port initialization configurable for sh-ipl+g users.
At the same time, this allows us to trivially wire up the
SH7780 SCIF0, so that's thrown in too more or less for free.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
There were a number of places that made evil PAGE_SIZE == 4k
assumptions that ended up breaking when trying to play with
8k and 64k page sizes, this fixes those up.
The most significant change is the way we load THREAD_SIZE,
previously this was done via:
mov #(THREAD_SIZE >> 8), reg
shll8 reg
to avoid a memory access and allow the immediate load. With
a 64k PAGE_SIZE, we're out of range for the immediate load
size without resorting to special instructions available in
later ISAs (movi20s and so on). The "workaround" for this is
to bump up the shift to 10 and insert a shll2, which gives a
bit more flexibility while still being much cheaper than a
memory access.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
There was a clobber issue with the register we were saving
the stack in, so we switch to a register that we handle in
the clobber list properly already.
This also follows the x86 changes for allowing the softirq
checks from hardirq context.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This extends the SH DMA API for allowing handling of DMA
channels based off of their respective capabilities.
A couple of functions are added to the existing API,
the core bits are register_chan_caps() for registering
channel capabilities, and request_dma_bycap() for fetching
a channel dynamically based off of a capability set.
Signed-off-by: Mark Glaisher <mark.glaisher@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Pass along the dev_id from request_dma() all the way down,
rather than inserting an artificial name relating to the TEI
line that we were doing before.
This makes the line a bit less obvious, but dev_id is the proper
behaviour for this regardless.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Previously we linked in the ISA DMA wrapper unconditionally.
As there are very few users of this, it's better to make it
conditional.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Handle the case where no registered DMACs exist somewhat more
gracefully. While we're at it, check for sysdev_create_file()
failing.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The implementation of system call tracing in the kernel has a
couple of ordering problems:
- the validity of the system call number is checked before
calling out to system call tracing code, and should be
done after
- the system call number used when tracing is the one the
system call was invoked with, while the system call tracing
code can legitimatly change the call number (for example
strace permutes fork into clone)
This patch fixes both of these problems, and also reoders the
code slightly to make the direct path through the code the
common case.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Handle simple TLB miss faults which can be resolved completely
from the page table in assembler.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds support for a generic push switch framework. Adaptable for
various switches, including GPIO switches and the push switches commonly
found on Renesas debug boards.
This allows switch states to be trivially reported through sysfs.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Remove extra bits from the pmd structure and store a kernel logical
address rather than a physical address. This allows it to be directly
dereferenced. Another piece of wierdness inherited from x86.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add TTB accessor functions and give it a sensible default
value. We will use this later for optimizing the fault
path.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Remove the previous saving of fault codes into the thread_struct
as they are never used, and appeared to be inherited from x86.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This fixes up the kernel for gcc4. The existing exception handlers
needed some wrapping for pt_regs access, acessing the registers
via a RELOC_HIDE() pointer.
The strcpy() issues popped up here too, so add -ffreestanding and
kill off the symbol export.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Previously big endian was simply assumed if little endian was
not set, which led to some cflags ordering issues. There's not
much point to not having a big endian option, so shove one in
a choice and wire it up in the Makefile.
This lets us clean up some of the cflags ordering while we're
at it.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds some preliminary support for the SH-X2 MMU, used by
newer SH-4A parts (particularly SH7785).
This MMU implements a 'compat' mode with SH-X MMUs and an
'extended' mode for SH-X2 extended features. Extended features
include additional page sizes (8kB, 4MB, 64MB), as well as the
addition of page execute permissions.
The extended mode attributes are placed in a second data array,
which requires us to switch to 64-bit PTEs when in X2 mode.
With the addition of the exec perms, we also overhaul the mmap
prots somewhat, now that it's possible to handle them more
intelligently.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This drops the various IRQ headers that were floating around
and primarily providing hardcoded IRQ definitions for the
various CPU subtypes. This quickly got to be an unmaintainable
mess, made even more evident by the subtle breakage introduced
by the SH-2 and SH-2A changes.
Now that subtypes are able to register IRQ maps directly, just
rip all of the headers out.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
All of the various CPU subtypes currently hardcode TIMER_IRQ,
switch this to a config option in the few places we need this.
This allows further removal of hardcoded IRQ headers..
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds support for the Solution Engine 7206 and 7619.
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This splits out common bits from the existing exception handler for
use between SH-2/SH-2A and SH-3/4, and adds support for the SH-2/2A
exceptions.
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
SH-2A has special division hardware as opposed to a full-fledged FPU,
wire up the exception handlers for this.
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This implements initial support for the SH7206 (SH-2A) and SH7619
(SH-2) MMU-less CPUs.
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Fix up arch-specific work items where possible to use the new work_struct and
delayed_work structs.
Three places that enqueue bits of their stack and then return have been marked
with #error as this is not permitted.
Signed-Off-By: David Howells <dhowells@redhat.com>
Conflicts:
drivers/ata/libata-scsi.c
include/linux/libata.h
Futher merge of Linus's head and compilation fixups.
Signed-Off-By: David Howells <dhowells@redhat.com>
Conflicts:
drivers/infiniband/core/iwcm.c
drivers/net/chelsio/cxgb2.c
drivers/net/wireless/bcm43xx/bcm43xx_main.c
drivers/net/wireless/prism54/islpci_eth.c
drivers/usb/core/hub.h
drivers/usb/input/hid-core.c
net/core/netpoll.c
Fix up merge failures with Linus's head and fix new compilation failures.
Signed-Off-By: David Howells <dhowells@redhat.com>
* master.kernel.org:/pub/scm/linux/kernel/git/paulus/powerpc: (194 commits)
[POWERPC] Add missing EXPORTS for mpc52xx support
[POWERPC] Remove obsolete PPC_52xx and update CLASSIC32 comment
[POWERPC] ps3: add a default zImage target
[POWERPC] Add of_platform_bus support to mpc52xx psc uart driver
[POWERPC] typo fix and whitespace cleanup on mpc52xx-uart driver
[POWERPC] Fix debug printks for 32-bit resources in the PCI code
[POWERPC] Replace kmalloc+memset with kzalloc
[POWERPC] Linkstation / kurobox support
[POWERPC] Add the e300c3 core to the CPU table.
[POWERPC] ppc: m48t35 add missing bracket
[POWERPC] iSeries: don't build head_64.o unnecessarily
[POWERPC] iSeries: stop dt_mod.o being rebuilt unnecessarily
[POWERPC] Fix cputable.h for combined build
[POWERPC] Allow CONFIG_BOOTX_TEXT on iSeries
[POWERPC] Allow xmon to build on legacy iSeries
[POWERPC] Change ppc64_defconfig to use AUTOFS_V4 not V3
[POWERPC] Tell firmware we can handle POWER6 compatible mode
[POWERPC] Clean images in arch/powerpc/boot
[POWERPC] Fix OF pci flags parsing
[POWERPC] defconfig for lite5200 board
...
The support for the 52xx-based systems is now included under
CONFIG_CLASSIC32, since the 52xx chips have a 603e-based core.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add a powerpc make target that can be loaded by the ps3 bootloader (kboot) and
set this as the default image to build for that platform.
Until the compressed zImage wrapper is made, this arranges for a stripped
vmlinux image to be built.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The 32-bit version and 64-bit version are almost equal. Unify them. This
makes further improvements (for example, copying with parallel, supporting
PREFETCH, etc.) easier.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This is a fix for a typo repeated several times in #error directives for
invalid SiByte configurations.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (34 commits)
[S390] Don't use small stacks when lockdep is used.
[S390] cio: Use device_reprobe() instead of bus_rescan_devices().
[S390] cio: Retry internal operations after vary off.
[S390] cio: Use path verification for last path gone after vary off.
[S390] non-unique constant/macro identifiers.
[S390] Memory detection fixes.
[S390] cio: Make ccw_dev_id_is_equal() more robust.
[S390] Convert extmem spin_lock into a mutex.
[S390] set KBUILD_IMAGE.
[S390] lockdep: show held locks when showing a stackdump
[S390] Add dynamic size check for usercopy functions.
[S390] Use diag260 for memory size detection.
[S390] pfault code cleanup.
[S390] Cleanup memory_chunk array usage.
[S390] Misaligned wait PSW at memory detection.
[S390] cpu shutdown rework
[S390] cpcmd <-> __cpcmd calling issues
[S390] Bad kexec control page allocation.
[S390] Reset infrastructure for re-IPL.
[S390] Some documentation typos.
...
Remove use of __rom_end symbol all together. This helps clean out the
miscellaneous symbols lying around in the m68knommu linker script.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Here is a small patch to automatically detect the DRAM size on m520x.
It was generated against 2.6.17-uc0, and tested on an Intec 5208 dev board.
Signed-off-by: Michael Broughton <mbobowik@telusplanet.net>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It turns out SHMAT, SHMDT, SHMGET and SHMCTL support in sys_ipc() for
m68knommu in 2.6 kernel(uClinux-dist-20060803 release) is missing.
(copied from m68k sources, report by David Wu <davidwu@arcturusnetworks.com>)
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes the following compile error with
CONFIG_BLK_DEV_INITRD=n and -Werror-implicit-function-declaration:
...
CC arch/m68knommu/kernel/setup.o
/home/bunk/linux/kernel-2.6/linux-2.6.18-rc5-mm1/arch/m68knommu/kernel/setup.c: In function 'setup_arch':
/home/bunk/linux/kernel-2.6/linux-2.6.18-rc5-mm1/arch/m68knommu/kernel/setup.c:268: error: implicit declaration of function 'paging_init'
make[2]: *** [arch/m68knommu/kernel/setup.o] Error 1
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The lock dependency validator adds a bunch of extra stack frames to
the stack, which can cause stack overflows. Especially seen on 31 bit
where the small stack is only 4k.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
VMALLOC_END on 31bit should be 0x8000000UL instead of 0x7fffffffL.
The page mask which is used to make sure memory_end is on 4MB/2MB
boundary is wrong and not needed. Therefore remove it.
Make sure a vmalloc area does also exist and work on (future)
machines with 4TB and more memory.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There's no need to have a spin_lock here, but need sleepable context
for vmem_map. Therefore convert the spin_lock into a mutex.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Set KBUILD_IMAGE to a sane value. This enables "make rpm"
Signed-off-by: Christian Borntraeger <cborntra@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Follow i386/x86_64:
lockdep can be used to print held locks when printing a
backtrace. This can be useful when debugging things like
'scheduling while atomic' asserts.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use a wrapper for copy_to/from_user to chose the best usercopy method.
The mvcos instruction is better for sizes greater than 256 bytes, if
mvcos is not available a page table walk is better for sizes greater
than 1024 bytes. Also removed the redundant copy_to/from_user_std_small
functions.
Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Avoid the tprot loop if diag260 works and reports that there are no
holes in memory. The tprot instruction can lead to a significant delay
in the ipl process if the virtual guest has a lot of memory and the
host is under memory pressure.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Need this at yet another file and don't want to add yet another
extern...
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If the memory detection code would ever reach the point where it would
load the wait psw, it would generate a specification exception and the
system would crash at ipl time. This is because of a misaligned wait
psw. It needs to be on a double word boundary instead of a word
boundary.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Let one master cpu kill all other cpus instead of sending an external
interrupt to all other cpus so they can kill themselves.
Simplifies reipl/shutdown functions a lot.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In case of reipl cpcmd gets called when all other cpus are not running
anymore. To prevent deadlocks change __cpcmd so that it doesn't take
any locks and call cpcmd or __cpcmd, whatever is correct in the current
context.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In case of re-IPL and diag308 doesn't work we have to reset all devices
manually and wait synchronously that each reset finished.
This patch adds the necessary infrastucture and the first exploiter of it.
Subsystems that need to add a function that needs to be called at re-IPL
may register/unregister this function via
struct reset_call {
struct reset_call *next;
void (*fn)(void);
};
void register_reset_call(struct reset_call *reset);
void unregister_reset_call(struct reset_call *reset);
When the registered function get called the context is:
- all cpus beside the current one are stopped
- all machine checks and interrupts are disabled
- prefixing is disabled
- a default machine check handler is available for use
The registered functions may not take any locks are sleep.
For the common I/O layer part of this patch:
Introduce a reset_call css_reset that does the following:
- clear all subchannels
- perform a rchp on all channel paths and wait for the resulting
machine checks
This replaces the calls to clear_all_subchannels() and
cio_reset_channel_paths() for kexec and ccw reipl. reipl_ccw_dev() now
uses reipl_find_schid() to determine the subchannel id for a given
device id.
Also remove cio_reset_channel_paths() and friends since they are not
needed anymore.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
segment save will exit with a lock held if the passed segment doesn't
exist. Any subsequent call to segment_save will lead to a deadlock.
Fix this and give up the lock before returning.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Since the diag 308 reipl method is superior to the ccw method, we should
use it whenever it is possible. We can do that, if the user has not
specified a new reipl ccw device and the system has been ipled from
a ccw device.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If reboot fails (e.g. because wrong devno has been specified by the user),
we should just stop all cpus, but should not trigger a kernel panic.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If multiple kernel images are installed on one DASD, the loadparm can be used
to select the boot configuration. This patch introduces the following two new
sysfs attributes:
/sys/firmware/ipl/loadparm: shows loadparm of current system (ro)
/sys/firmware/reipl/ccw/loadparm: loadparm used for next reboot (rw)
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The SALIPL entry point has an needless memory detection routine as we
later check the memory size again. The SALIPL code also uses diagnose
0x060 if we are running under VM, but this diagnose is not compatible
with the 64 bit addressing mode. The solution is to get rid of this
code and rely on the memory detection in the startup code.
Signed-off-by: Christian Borntraeger <cborntra@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Replace the 'is_b' variable with 'slot_b' in at91_mmc_data.
Also add the new 'chipselect' variable for CF/PCMCIA and 'bus_width_16'
variable for NAND.
This (and previous patches) will unfortunately break the current MMC,
USB Gadget and PCMCIA drivers. Updates and fixes for those drivers will
be submitted to the various subsystem maintainers.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cure the damage done by the former PCI debug printks fix for the case
of 64-bit resources (commit 685143ac1f)
which broke it for the plain vanilla 32-bit resources...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Support for the Kurobox(HG)/LinkStation-I NAS systems by Buffalo
Technology, should be also applicable to the PPC TeraStation family.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Remove CPU_FTR_16M_PAGE from the cupfeatures mask at runtime on iSeries.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
xmon still does not run on iSeries, but this allows us to build a combined
kernel that includes it.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Defconfig ppc64 kernels running under late-model distros
may hang in the automounter rc.d script, which seems to be
expecting autofs version 4. This patch uses the newer autofs.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds the "logical" PVR value used by POWER6 in "compatible" mode
to the list of PVR values that the kernel tells firmware it is able to
handle.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add a rule to clean up the various generated image files in
arch/powerpc/boot.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
For PCI devices with only io ports, of_bus_pci_get_flags() will fall
through and still mark the resource as IORESOURCE_MEM.
Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Adds utility routines used by 52xx device drivers and board support
code. Main functionality is to add device nodes to the of_platform_bus,
retrieve the IPB bus frequency, and find+ioremap device registers.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
There is no need to expose these settings outside the scope
of the interrupt controller code itself.
Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The Efika board isn't different enough from other 52xx based boards to
justify a separate platform. This patch merges it with the support
code for all other 52xx based boards.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
platforms/embedded6xx is probably going away, and 52xx boards need
some extra support the 52xx interrupt controller and DMA engine
anyway. It makes sense to group all the 52xx bits into a single path.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
No other chips use this device, it belongs in a 52xx-specific path.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
It doesn't hurt to have this enabled on legacy iSeries and will mean it
is available for a combined build.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Commit 3ccfc65c50 missed the same fixes for
legacy iSeries specific code, so make some more symbols no longer global.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Mostly taken from corresponding Makefile's make-clean rule.
Tested by (cross)compiling for $ARCH PPC and POWERPC and checking
output of git-status.
Signed-off-by: Rutger Nijlunsing <git-commit@tux.tmfweb.nl>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds support for flash device descriptions to the OF device tree.
It's inspired by and partially borrowed from Sergei's patch "[RFC]
Adding MTD to device tree.patch".
Signed-off-by: Vitaly Wool <vwool@ru.mvista.com>
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This allows any secondary CPU thread also to become boot cpu for
POWER5. The patch is required to solve kdump boot issue when the
kdump kernel is booted with parameter "maxcpus=1". XICS init code
tries to match the current boot cpu id with "reg" property in each CPU
node in the device tree. But CPU node is created only for primary
thread CPU ids and "reg" property only reflects primary CPU ids. So
when a kernel is booted on a secondary cpu thread above condition will
never meet and the default distribution server is left as zero. This
leads to route the interrupts to CPU 0, but which is not online at
this time.
We use ibm,ppc-interrupt-server#s to check for both primary and
secondary CPU ids. Accordingly default distribution server value is
initialized from "ibm,ppc-interrupt-gserver#s" property. We loop
through ibm,ppc-interrupt-gserver#s property to find the global
distribution server from the last entry that matches with boot cpuid.
Signed-off-by: Mohan Kumar M <mohan@in.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
It's currently not possible to build the default zImage
target if PS3 is the only selected platform. This is
a hack to fall back to building the pseries style
zImage, so the build is successful. This will probably
change in the future, if someone writes a PS3 specific
boot wrapper.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
A few code paths need to check whether or not they are running
on the PS3's LV1 hypervisor before making hcalls. This introduces
a new firmware feature bit for this, FW_FEATURE_PS3_LV1.
Now when both PS3 and IBM_CELL_BLADE are enabled, but not PSERIES,
FW_FEATURE_PS3_LV1 and FW_FEATURE_LPAR get enabled at compile time,
which is a bug. The same problem can also happen for (PPC_ISERIES &&
!PPC_PSERIES && PPC_SOMETHING_ELSE). In order to solve this, I
introduce a new CONFIG_PPC_NATIVE option that is set when at least
one platform is selected that can run without a hypervisor and then
turns the firmware feature check into a run-time option.
The new cell oprofile support that was recently merged does not
work on hypervisor based platforms like the PS3, therefore make
it depend on PPC_CELL_NATIVE instead of PPC_CELL. This may change
if we get oprofile support for PS3.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
When renaming CONFIG_PS3 to CONFIG_PPC_PS3, a few occurrences have been
missed.
I also fixed up the alignment in arch/powerpc/platforms/Makefile.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
It may be desireable to build a kernel for cell without
spufs, e.g. as the initial kboot kernel. This requires
that the SPU specific parts of the core dump and the xmon
code depend on CONFIG_SPU_BASE instead of CONFIG_PPC_CELL.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Currently, we only send a sigtrap if the current task is being ptraced.
This is somewhat inconsistant, and it breaks utrace support in fedora.
Removing the check should do the right thing in all cases.
Cc: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
This changes the spu_create system call to return an error (-ENODEV) if
and isolated spu context is requested on hardware that doesn't support
isolated mode.
Tested on systemsim with and without isolation support
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
This patch fixes a compile issue for the Efika platform recently
introduced by API changes.
Signed-off-by: Nicolas DET <nd@bplan-gmbh.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Change the oprofile_cpu_type in cputables.c to be ppc64/970MP. Oprofile
needs to distinquish the MP from other 970 processors so it can add some
new counters specific to the 970MP.
Signed-off-by: Mike Wolf <mjw@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
In the common cell kernel, I want to have ps3 enabled
to find potential bugs at compile-time.
Also enable SPU disassembly in xmon.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Adds a ps3_defconfig for the PS3 game console.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Adds spu support for the PS3 platform.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Adds support for early access to the parameter data from the PS3 'Other OS'
flash memory area. The parameter data mainly holds user preferences like
static ip address.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Adds some needed bits for a config option PS3_USE_LPAR_ADDR that disables
the PS3 lpar address translation mechanism. This is a currently needed
workaround for limitations in the design of the generic cell spu support.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Adds the core platform support for the PS3 game console and other devices
using the PS3 hypervisor.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
This fixes the xmon support for the cell spu to be compatable with the split
spu platform code.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
This adds a platform specific spu management abstraction and the coresponding
routines to support the IBM Cell Blade. It also removes the hypervisor only
resources that were included in struct spu.
Three new platform specific routines are introduced, spu_enumerate_spus(),
spu_create_spu() and spu_destroy_spu(). The underlying design uses a new
type, struct spu_management_ops, to hold function pointers that the platform
setup code is expected to initialize to instances appropriate to that platform.
For the IBM Cell Blade support, I put the hypervisor only resources that were
in struct spu into a platform specific data structure struct spu_pdata.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
This includes:
* version 1.24 of ppc-dis.c
* version 1.88 of ppc-opc.c
* version 1.23 of ppc.h
I can't vouch for the accuracy etc. of these changes, but it brings
us into line with binutils - and from a cursory test appears to work
fine.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
While adding spu disassembly support it struck me that we're actually
carrying quite a lot of code around, just to do disassembly in the case
of a crash.
While on large systems it's not an issue, on smaller ones it might be
nice to have xmon - but without the weight of the disassembly support.
For a Cell build this saves ~230KB (!), and for pSeries ~195KB.
We still support the 'di' and 'sdi' commands, however they just dump
the instruction in hex.
Move the definitions into a header to clean xmon.c just a tiny bit.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
This patch adds a "sdi" command to xmon, to disassemble the contents
of an spu's local store.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
This patch imports and munges the spu disassembly code from binutils.
All files originated from version 1.1 in binutils cvs.
* spu.h, spu-insns.h and spu-opc.c are unchanged except for pathnames.
* spu-dis.c has been edited heavily:
* use printf instead of info->fprintf_func and similar.
* pass the instruction in rather than reading it.
* we have no equivalent to symbol_at_address_func, so we just assume
there is never a symbol at the address given.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
In order to do disassembly of spu binaries in xmon, we need to abstract
the disassembly function from ppc_inst_dump.
We do this by making the actual disassembly function a function pointer
that we pass to ppc_inst_dump(). To save updating all the callers, we
turn ppc_inst_dump() into generic_inst_dump() and make ppc_inst_dump()
a wrapper which always uses print_insn_powerpc().
Currently we pass the dialect into print_insn_powerpc(), but we always
pass 0 - so just make it a local.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Add a command to xmon to dump the memory of a spu's local store.
This mimics the 'd' command which dumps regular memory, but does
a little hand holding by taking the user supplied address and
finding that offset in the local store for the specified spu.
This makes it easy for example to look at what was executing on a spu:
1:mon> ss
...
Stopped spu 04 (was running)
...
1:mon> sf 4
Dumping spu fields at address c0000000019e0a00:
...
problem->spu_npc_RW = 0x228
...
1:mon> sd 4 0x228
d000080080318228 01a00c021cffc408 4020007f217ff488 |........@ ..!...|
Aha, 01a00c02, which is of course rdch $2,$ch24 !
--
Updated to only do the setjmp goo around the spu access, and not
around prdump because it does its own (via mread).
Also the num variable is now common between sf and sd, so you don't
have to keep typing the spu number in if you're repeating commands
on the same spu.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
After stopping spus in xmon I often find myself trawling through the
field dumps to find out which spus were running. The spu stopping
code actually knows what's running, so let's print it out to save
the user some futzing.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
My patch to add spu helpers to xmon (a898497088)
introduced a few sparse warnings, because I was dereferencing an __iomem
pointer.
I think the best way to handle it is to actually use the appropriate in_beXX
functions. Need to rejigger the DUMP macro a little to accomodate that.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
With soft-disabled interrupts in power_save, we can
still get external exceptions on Cell, even if we are
in pause(0) a.k.a. sleep state.
When the CPU really wakes up through the 0x100 (system reset)
vector, while we have already started processing the 0x500
(external) exception, we get a panic in unrecoverable_exception()
because of the lost state.
This occurred in Systemsim for Cell, but as far as I can see,
it can theoretically occur on any machine that uses the
system reset exception to get out of sleep state.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
This patch adds SPU elf notes to the coredump. It creates a separate note
for each of /regs, /fpcr, /lslr, /decr, /decr_status, /mem, /signal1,
/signal1_type, /signal2, /signal2_type, /event_mask, /event_status,
/mbox_info, /ibox_info, /wbox_info, /dma_info, /proxydma_info, /object-id.
A new macro, ARCH_HAVE_EXTRA_NOTES, was created for architectures to
specify they have extra elf core notes.
A new macro, ELF_CORE_EXTRA_NOTES_SIZE, was created so the size of the
additional notes could be calculated and added to the notes phdr entry.
A new macro, ELF_CORE_WRITE_EXTRA_NOTES, was created so the new notes
would be written after the existing notes.
The SPU coredump code resides in spufs. Stub functions are provided in the
kernel which are hooked into the spufs code which does the actual work via
register_arch_coredump_calls().
A new set of __spufs_<file>_read/get() functions was provided to allow the
coredump code to read from the spufs files without having to lock the
SPU context for each file read from.
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Dwayne Grant McConnell <decimal@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Devices with no "reg" nor "dcr-reg" property are given a bus_id which
is the node name alone. This means that if more than one such device
with the same names are present in the system, sysfs will have
collisions when creating the symlinks and will fail registering the
devices.
This works around that problem by assigning successive numbers to such
devices.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds code to look at the properties firmware puts in the device
tree to determine what compatibility mode the partition is in on
POWER6 machines, and set the ELF aux vector AT_HWCAP and AT_PLATFORM
entries appropriately.
Specifically, we look at the cpu-version property in the cpu node(s).
If that contains a "logical" PVR value (of the form 0x0f00000x), we
call identify_cpu again with this PVR value. A value of 0x0f000001
indicates the partition is in POWER5+ compatibility mode, and a value
of 0x0f000002 indicates "POWER6 architected" mode, with various
extensions disabled. We also look for various other properties:
ibm,dfp, ibm,purr and ibm,spurr.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add PPU event-based and cycle-based profiling support to Oprofile for Cell.
Oprofile is expected to collect data on all CPUs simultaneously.
However, there is one set of performance counters per node. There are
two hardware threads or virtual CPUs on each node. Hence, OProfile must
multiplex in time the performance counter collection on the two virtual
CPUs.
The multiplexing of the performance counters is done by a virtual
counter routine. Initially, the counters are configured to collect data
on the even CPUs in the system, one CPU per node. In order to capture
the PC for the virtual CPU when the performance counter interrupt occurs
(the specified number of events between samples has occurred), the even
processors are configured to handle the performance counter interrupts
for their node. The virtual counter routine is called via a kernel
timer after the virtual sample time. The routine stops the counters,
saves the current counts, loads the last counts for the other virtual
CPU on the node, sets interrupts to be handled by the other virtual CPU
and restarts the counters, the virtual timer routine is scheduled to run
again. The virtual sample time is kept relatively small to make sure
sampling occurs on both CPUs on the node with a relatively small
granularity. Whenever the counters overflow, the performance counter
interrupt is called to collect the PC for the CPU where data is being
collected.
The oprofile driver relies on a firmware RTAS call to setup the debug bus
to route the desired signals to the performance counter hardware to be
counted. The RTAS call must set the routing registers appropriately in
each of the islands to pass the signals down the debug bus as well as
routing the signals from a particular island onto the bus. There is a
second firmware RTAS call to reset the debug bus to the non pass thru
state when the counters are not in use.
Signed-off-by: Carl Love <carll@us.ibm.com>
Signed-off-by: Maynard Johnson <mpjohn@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The following routines are added to arch/powerpc/platforms/cell/pmu.c:
cbe_clear_pm_interrupts()
cbe_enable_pm_interrupts()
cbe_disable_pm_interrupts()
cbe_query_pm_interrupts()
cbe_pm_irq()
cbe_init_pm_irq()
This also adds a routine in arch/powerpc/platforms/cell/interrupt.c and
some macros in cbe_regs.h to manipulate the IIC_IR register:
iic_set_interrupt_routing()
Signed-off-by: Kevin Corry <kevcorry@us.ibm.com>
Signed-off-by: Carl Love <carll@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Move some PMU-related macros and function prototypes from cbe_regs.h
and pmu.h in arch/powerpc/platforms/cell/ to a new header at
include/asm-powerpc/cell-pmu.h
This is cleaner to use from the oprofile code, since that sits in
arch/powerpc/oprofile, not in the cell platform directory.
Signed-off-by: Kevin Corry <kevcorry@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
More macros for manipulating bits in the Cell PMU control registers.
Signed-off-by: Kevin Corry <kevcorry@us.ibm.com>
Signed-off-by: Carl Love <carll@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add symbol-exports for the new routines in arch/powerpc/platforms/cell/pmu.c.
They are needed for Oprofile, which can be built as a module.
Signed-off-by: Kevin Corry <kevcorry@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
In order to fit with the "don't-run-spus-outside-of-spu_run" model, this
patch starts the isolated-mode loader in spu_run, rather than
spu_create. If spu_run is passed an isolated-mode context that isn't in
isolated mode state, it will run the loader.
This fixes potential races with the isolated SPE app doing a
stop-and-signal before the PPE has called spu_run: bugzilla #29111.
Also (in conjunction with a mambo patch), this addresses #28565, as we
always set the runcntrl register when entering spu_run.
It is up to libspe to ensure that isolated-mode apps are cleaned up
after running to completion - ie, put the app through the "ISOLATE EXIT"
state (see Ch11 of the CBEA).
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This change adds a read accessor for the SPE problem-state run control
register.
This is required for for applying (userspace) changes made to the run
control register while the SPE is stopped - simply asserting the master
run control bit is not sufficient. My next patch for isolated-mode
setup requires this.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When the user changes the runcontrol register, an SPU might be
running without a process being attached to it and waiting for
events. In order to prevent this, make sure we always disable
the priv1 master control when we're not inside of spu_run.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When fixing spufs to map the 'mem' file backing store cacheable,
I incorrectly set the physical mapping to use both cache-inhibited
and guarded mapping, which resulted in a serious performance
degradation.
Debugged-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When one of the spufs files is mapped into a process address
space, regular users can use ptrace to attempt accessing
them with access_process_vm(). With the way that the
mappings currently work, this likely causes an oops.
Setting the vm_flags to VM_IO makes sure that ptrace can
not access them but returns an error code. This is not
the perfect solution in case of the local store mapping,
but it fixes the oops in a well-defined way.
Also remove leftover VM_RESERVED flags in spufs. The
VM_RESERVED flag is on it's way out and not checked by
the memory managment code anymore.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Christoph Hellwig <chellwig@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When there is pending signals, current spufs_run_spu() always returns
-ERESTARTSYS and it is called again automatically.
But, if spe already stopped by stop-and-signal or halt instruction,
returning -ERESTARTSYS makes stop-and-signal/halt lost and
spu run over the end-point.
For your convenience, I attached a sample code to restage this bug.
If there is no bug, printed NPC will be 0x4000.
Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When we attempt an MFC DMA to an unmapped address, the event
returned from spu_run should be SPE_EVENT_SPE_DATA_STORAGE,
not SPE_EVENT_INVALID_DMA.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Replace the use of the platform specific variable spu.nid with the
platform independednt variable spu.node.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>