This adds the back-end for the PMU on POWER7 processors. POWER7
has 4 fully-programmable counters and two fixed-function counters
(which do respect the freeze conditions, can generate interrupts,
and are writable, unlike PMC5/6 on POWER5+/6).
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18992.36329.189378.17992@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We currently log hw.sample_period for PERF_SAMPLE_PERIOD, however this is
incorrect. When we adjust the period, it will only take effect the next
cycle but report it for the current cycle. So when we adjust the period
for every cycle, we're always wrong.
Solve this by keeping track of the last_period.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For easy extension of the sample data, put it in a structure.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (76 commits)
x86, apic: Fix dummy apic read operation together with broken MP handling
x86, apic: Restore irqs on fail paths
x86: Print real IOAPIC version for x86-64
x86: enable_update_mptable should be a macro
sparseirq: Allow early irq_desc allocation
x86, io-apic: Don't mark pin_programmed early
x86, irq: don't call mp_config_acpi_gsi() if update_mptable is not enabled
x86, irq: update_mptable needs pci_routeirq
x86: don't call read_apic_id if !cpu_has_apic
x86, apic: introduce io_apic_irq_attr
x86/pci: add 4 more return parameters to IO_APIC_get_PCI_irq_vector(), fix
x86: read apic ID in the !acpi_lapic case
x86: apic: Fixmap apic address even if apic disabled
x86: display extended apic registers with print_local_APIC and cpu_debug code
x86: read apic ID in the !acpi_lapic case
x86: clean up and fix setup_clear/force_cpu_cap handling
x86: apic: Check rev 3 fadt correctly for physical_apic bit
x86/pci: update pirq_enable_irq() to setup io apic routing
x86/acpi: move setup io apic routing out of CONFIG_ACPI scope
x86/pci: add 4 more return parameters to IO_APIC_get_PCI_irq_vector()
...
kvm_vcpu_block() unhalts vpu on an interrupt/timer without checking
if interrupt window is actually opened.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
* topic/asoc: (135 commits)
ASoC: Apostrophe patrol
ASoC: codec tlv320aic23 fix bogus divide by 0 message
ASoC: fix NULL pointer dereference in soc_suspend()
ASoC: Fix build error in twl4030.c
ASoC: SSM2602: assign last substream to the master when shutting down
ASoC: Blackfin: document how anomaly 05000250 is handled
ASoC: Blackfin: set the transfer size according the ac97_frame size
ASoC: SSM2602: remove unsupported sample rates
ASoC: TWL4030: Check the interface format for 4 channel mode
ASoC: TWL4030: Use reg_cache in twl4030_init_chip
ASoC: Initialise dev for the dummy S/PDIF DAI
ASoC: Add dummy S/PDIF codec support
ASoC: correct print specifiers for unsigneds
ASoC: Modify mpc5200 AC97 driver to use V9 of spin_event_timeout()
ASoC: Switch FSL SSI DAI over to symmetric_rates
ASoC: Mark MPC5200 AC97 as BROKEN until PowerPC merge issues are resolved
ASoC: Fabric bindings for STAC9766 on the Efika
ASoC: Support for AC97 on Phytec pmc030 base board.
ASoC: AC97 driver for mpc5200
ASoC: Main rewite of the mpc5200 audio DMA code
...
This patch includes the basic infrastructure to use swiotlb
bounce buffering on 32-bit powerpc. It is not yet enabled on
any platforms. Probably the most interesting bit is the
addition of addr_needs_map to dma_ops - we need this as
a dma_op because the decision of whether or not an addr
can be mapped by a device is device-specific.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit 45db924089 ("powerpc/spufs: Remove
double check for non-negative dentry") removed the only user of the
out_dput label, so remove it and the code following it.
Gets rid of this warning:
arch/powerpc/platforms/cell/spufs/inode.c: In function 'spufs_create':
arch/powerpc/platforms/cell/spufs/inode.c:647: warning: label 'out_dput' defined but not used
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
resource_size_t is 64 bits on PowerPC 64.
Gets rid of this warning:
arch/powerpc/kernel/pci_64.c: In function 'pcibios_map_io_space':
arch/powerpc/kernel/pci_64.c:504: warning: format '%016lx' expects type 'long unsigned int', but argument 2 has type 'resource_size_t'
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Gets rid of this warning:
arch/powerpc/xmon/xmon.c: In function 'dump_log_buf':
arch/powerpc/xmon/xmon.c:2133: warning: unused variable 'i'
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
resource_size_t is 64 bits on pseries
Gets rid of these warnings:
arch/powerpc/platforms/pseries/iommu.c: In function 'pci_dma_bus_setup_pSeries':
arch/powerpc/platforms/pseries/iommu.c:391: warning: format '%lx' expects type 'long unsigned int', but argument 2 has type 'resource_size_t'
arch/powerpc/platforms/pseries/iommu.c:417: warning: format '%lx' expects type 'long unsigned int', but argument 2 has type 'resource_size_t'
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is a random collection of added ifdef's around portions of
code that only mak sense on server processors. Using either
CONFIG_PPC_STD_MMU_64 or CONFIG_PPC_BOOK3S as seems appropriate.
This is meant to make the future merging of Book3E 64-bit support
easier.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch has no effect other than re-ordering PACA fields on
current server CPUs. It however is a pre-requisite for future
support of BookE 64-bit processors. Various parts of the PACA
struct are now moved under some ifdef's, either the new
CONFIG_PPC_BOOK3S or CONFIG_PPC_STD_MMU_64, whatever seems more
appropriate.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.craashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To prepare for future support of Book3E 64-bit PowerPC processors,
which use a completely different exception handling, we move that
code to a new exceptions-64s.S file.
This file is #included from head_64.S due to some of the absolute
address requirements which can currently only be fulfilled from
within that file.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch introduce a new Kconfig option, CONFIG_PPC_BOOK3S
that represents processors that are compliant with the "classic"
(aka "server") variant of the PowerPC architecture.
It replaces CONFIG_6xx on 32-bit (though the symbol is still
defined for compatibility) and encompass all currently supported
64-bit processors.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently, load_up_altivec and give_up_altivec are duplicated
in 32-bit and 64-bit. This creates a common implementation that
is moved away from head_32.S, head_64.S and misc_64.S and into
vector.S, using the same macros we already use for our common
implementation of load_up_fpu.
I also moved the VSX code over to vector.S though in that case
I didn't make it build on 32-bit (yet).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For some obscure reason, we only set init_bootmem_done after initializing
bootmem when NUMA isn't enabled. We even document this next to the declaration
of that global in system.h which of course I didn't read before I had to
debug why some WIP code wasn't working properly...
This patch changes it so that we always set it after bootmem is initialized
which should have always been the case... go figure !
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The MMU context_lock can be taken from switch_mm() while the
rq->lock is held. The rq->lock can also be taken from interrupts,
thus if we get interrupted in destroy_context() with the context
lock held and that interrupt tries to take the rq->lock, there's
a possible deadlock scenario with another CPU having the rq->lock
and calling switch_mm() which takes our context lock.
The fix is to always ensure interrupts are off when taking our
context lock. The switch_mm() path is already good so this fixes
the destroy_context() path.
While at it, turn the context lock into a new style spinlock.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch fixes a couple of issues that can happen as a result
of steal_context() dropping the context_lock when all possible
PIDs are ineligible for stealing (hopefully an extremely hard to
hit occurence).
This case exposes the possibility of a stale context_mm[] entry
to be seen since destroy_context() doesn't clear it and the free
map isn't re-tested. It also means steal_context() will not notice
a context freed while the lock was help, thus possibly trying to
steal a context when a free one was available.
This fixes it by always returning to the caller from steal_context
when it dropped the lock with a return value that causes the
caller to re-samble the number of free contexts, along with
properly clearing the context_mm[] array for destroyed contexts.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reworked by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This adds block-step support on powerpc, including a PTRACE_SINGLEBLOCK
request for ptrace.
The BookE implementation is tweaked to fire a single step after a
block step in order to mimmic the server behaviour.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
As subject says, add dts files for Xilinx ML510 reference design with
the PCI host bridge device.
Signed-off-by: Roderick Colenbrander <thunderbird2k@gmail.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch refactors some of the xilinx_intc interrupt controller driver
and adds support for cascading an i8259 off one of the irq lines.
This patch was based on the ML510 support work done by Roderick
Colenbrander.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch adds support for the Xilinx plbv46-pci-1.03.a PCI host
bridge IPcore.
Signed-off-by: Roderick Colenbrander <thunderbird2k@gmail.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Counter type is a frequently used value and we do a lot of
bit juggling by encoding and decoding it from attr->config.
Clean this up by creating a separate attr->type field.
Also clean up the various similarly complex user-space bits
all around counter attribute management.
The net improvement is significant, and it will be easier
to add a new major type (which is what triggered this cleanup).
(This changes the ABI, all tools are adapted.)
(PowerPC build-tested.)
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This cleans up the kilauea/halekala board ports to use the ppc40x_simple
platform support.
Tested-by: Stefan Roese <sr@denx.de>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This cleans up the makalu board port to use the ppc40x_simple platform
support.
Tested-by: Stefan Roese <sr@denx.de>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Now that the 4xx NAND driver is available again in arch/powerpc, let's
enable it on Sequoia. This patch also disables the early debug messages
(CONFIG_PPC_EARLY_DEBUG) in the Sequoia defconfig.
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Commit b23f3325 ("perf_counter: Rename various fields") fixed up
most of the uses of the renamed fields, but missed one instance
of "record_type" in powerpc-specific code which needs to be changed
to "sample_type", and a "PERF_RECORD_ADDR" in the same statement that
needs to be changed to "PERF_SAMPLE_ADDR", causing compilation
errors on powerpc. This fixes it.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18983.3111.770392.800486@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The single board defconfig files were missed during the cleanup
of CONFIG_PCI_LEGACY in the multi-board config files. This
disables the option for the single board configs, as it isn't
used by anything for these boards.
Reported-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
When using interrupting counters and limited (non-interrupting)
counters at the same time, it's possible that we get an
interrupt in write_mmcr0() after writing MMCR0 but before we
have set up the counters using limited PMCs. What happens then
is that we get into perf_counter_interrupt() with
counter->hw.idx = 0 for the limited counters, leading to the
"oops trying to read PMC0" error message being printed.
This fixes the problem by making perf_counter_interrupt()
robust against counter->hw.idx being zero (the counter is just
ignored in that case) and also by changing write_mmcr0() to
write MMCR0 initially with the counter overflow interrupt
enable bits masked (set to 0). If the MMCR0 value requested by
the caller has either of those bits set, we write MMCR0 again
with the requested value of those bits after setting up the
limited counters properly.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Cc: Stephane Eranian <eranian@googlemail.com>
LKML-Reference: <18982.17684.138182.954599@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit ef923214 ("perf_counter: powerpc: use u64 for event
codes internally") introduced a bug where the return value from
function find_alternative_bdecode gets put into a u64 variable
and later tested to see if it is < 0. The effect is that we
get extra, bogus event code alternatives on POWER5 and POWER5+,
leading to error messages such as "oops compute_mmcr failed"
being printed and counters not counting properly.
This fixes it by using s64 for the return type of
find_alternative_bdecode and for the local variable that the
caller puts the value in. It also makes the event argument a
u64 on POWER5+ for consistency.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Cc: Stephane Eranian <eranian@googlemail.com>
LKML-Reference: <18982.17586.666132.90983@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The structure isn't hw only and when I read event, I think about those
things that fall out the other end. Rename the thing.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Cc: Stephane Eranian <eranian@googlemail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
A few renames:
s/irq_period/sample_period/
s/irq_freq/sample_freq/
s/PERF_RECORD_/PERF_SAMPLE_/
s/record_type/sample_type/
And change both the new sample_type and read_format to u64.
Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It was __devinit, but it is also within a CONFIG_HOTPLUG guarded section
of code, so the __devinit does nothing but cause the following warning:
WARNING: vmlinux.o(.text+0x107a8): Section mismatch in reference from the function pcibios_finish_adding_to_bus() to the function .devinit.text:pcibios_claim_one_bus()
The function pcibios_finish_adding_to_bus() references
the function __devinit pcibios_claim_one_bus().
This is often because pcibios_finish_adding_to_bus lacks a __devinit
annotation or the annotation of pcibios_claim_one_bus is wrong.
It is also only (externally) used in arch/powerpc/kernel/of_platform.c
which cannot be built as a module so don't export it.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There's no need to wrap PPC_INST_NOP in a static inline.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These macros were used in the original port, but since commit
e4486fe316 (ftrace, use probe_kernel API to modify code) they
are unused.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use ppc_function_entry() from code-patching.h.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch updates the output from /proc/ppc64/lparcfg to display the
processor virtualization resource allocations for a shared processor
partition.
This information is already gathered via the h_get_ppp call, we just
have to make sure that the ibm,partition-performance-parameters-level
property is >= 1 to ensure that the information is valid.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
RTAS event scan has to run across all cpus. Right now we use a kernel
thread and set_cpus_allowed but in doing so we wake up the previous cpu
unnecessarily.
Some ftrace output shows this:
previous cpu (2):
[002] 7.022331: sched_switch: task swapper:0 [140] ==> rtasd:194 [120]
[002] 7.022338: sched_switch: task rtasd:194 [120] ==> migration/2:9 [0]
[002] 7.022344: sched_switch: task migration/2:9 [0] ==> swapper:0 [140]
next cpu (3):
[003] 7.022345: sched_switch: task swapper:0 [140] ==> rtasd:194 [120]
[003] 7.022371: sched_switch: task rtasd:194 [120] ==> swapper:0 [140]
We can use schedule_delayed_work_on and avoid the unnecessary wakeup.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We can compile and boot with NR_CPUS=8192, so make this the max. 1024
was an arbitrary decision anyway.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Conflicts:
arch/mips/sibyte/bcm1480/irq.c
arch/mips/sibyte/sb1250/irq.c
Merge reason: we gathered a few conflicts plus update to latest upstream fixes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge reason: merge almost-rc8 into perfcounters/core, which was -rc6
based - to pick up the latest upstream fixes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The implementation we just revived has issues, such as using a
Kconfig-defined virtual address area in kernel space that nothing
actually carves out (and thus will overlap whatever is there),
or having some dependencies on being self contained in a single
PTE page which adds unnecessary constraints on the kernel virtual
address space.
This fixes it by using more classic PTE accessors and automatically
locating the area for consistent memory, carving an appropriate hole
in the kernel virtual address space, leaving only the size of that
area as a Kconfig option. It also brings some dma-mask related fixes
from the ARM implementation which was almost identical initially but
grew its own fixes.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Make FIXADDR_TOP a compile time constant and cleanup a
couple of definitions relative to the layout of the kernel
address space on ppc32. We also print out that layout at
boot time for debugging purposes.
This is a pre-requisite for properly fixing non-coherent
DMA allocactions.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This reverts commit 33f00dcedb.
While it was a good idea to try to use the mm/vmalloc.c allocator instead
of our own (in fact, ours is itself a dup on an old variant of the vmalloc
one), unfortunately, the approach is terminally busted since
dma_alloc_coherent() can be called at interrupt time or in atomic contexts
and there's little chances we'll make the code in mm/vmalloc.c cope with\ that :-(
Until we can get the generic code to forbid that idiocy and fix all
drivers abusing it, we pretty much have no choice but revert to
our custom virtual space allocator.
There's also a problem with SMP safety since freeing such mapping
would require an IPI which cannot be done at interrupt time.
However, right now, I don't think we support any platform that is
both SMP and has non-coherent DMA (don't laugh, I know such things
do exist !) so we can sort that out later.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This implements interrupt throttling on powerpc. Since we don't have
individual count enable/disable or interrupt enable/disable controls
per counter, this simply sets the hardware counter to 0, meaning that
it will not interrupt again until it has counted 2^31 counts, which
will take at least 2^30 cycles assuming a maximum of 2 counts per
cycle. Also, we set counter->hw.period_left to the maximum possible
value (2^63 - 1), so we won't report overflows for this counter for
the forseeable future.
The unthrottle operation restores counter->hw.period_left and the
hardware counter so that we will once again report a counter overflow
after counter->hw.irq_period counts.
[ Impact: new perfcounters robustness feature on PowerPC ]
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <18971.35823.643362.446774@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The recent rework of the MMU PID handling for non-hash CPUs has a
subtle bug in the !SMP "optimized" variant of the PID stealing
function. It clears the PID in the mm context before it calls
local_flush_tlb_mm(). However, the later will not flush anything
if the PID in the context is clear...
Signed-off-by: Hideo Saito <hsaito.ppc@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add a few more mpc5200 PSC defines. More bit fields defines for mpc5200
PSC registers.
Signed-off-by: Jon Smirl <jonsmirl@gmail.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Until now we have had a 1:1 mapping between storage device physical
block size and the logical block sized used when addressing the device.
With SATA 4KB drives coming out that will no longer be the case. The
sector size will be 4KB but the logical block size will remain
512-bytes. Hence we need to distinguish between the physical block size
and the logical ditto.
This patch renames hardsect_size to logical_block_size.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Something in the HW or FW setup is busted and MSIs aren't working with
IPR on Bimini, so until we figure out exaxtly what's up, we quirk them
out
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We shouldn't directly access sysdata to get the device node. We should
be calling pci_device_to_OF_node().
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If CONFIG_PPC_EMULATED_STATS is enabled, make available counters for the
various classes of emulated instructions under
/sys/kernel/debug/powerpc/emulated_instructions/ (assumed debugfs is mounted on
/sys/kernel/debug). Optionally (controlled by
/sys/kernel/debug/powerpc/emulated_instructions/do_warn), rate-limited warnings
can be printed to the console when instructions are emulated.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Hello All,
Quite a while back Michael Ellerman had posted a patch to add support to xmon to print the contents of the console log pointed to by __log_buf.
Here's the link to that patch - http://ozlabs.org/pipermail/linuxppc64-dev/2005-March/003657.html
I've ported the patch in the above link to 2.6.30-rc5 and have tested it.
Thanks & regards,
Vinay
Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently, the 32-bit code uses sg->length instead of sg->dma_lentgh
to report sg_dma_len. However, since the default dma code for 32-bit
(the dma_direct case) sets dma_length and length to the same thing,
we should be able to use dma_length there as well. This gets rid of
some 32-vs-64-bit ifdefs, and is needed by the swiotlb code which
actually distinguishes between dma_length and length.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch removes an unnecessary double check if the dentry returned by
lookup_create() is actually non-negative. Since lookup_create() itself returns
an error in this case just remove the check.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I have been looking at sources of OS jitter and notice that after a long
NO_HZ idle period we wakeup too early:
relative time (us) event
timer irq exit
999946.405 timer irq entry
4.835 timer irq exit
21.685 timer irq entry
3.540 timer (tick_sched_timer) entry
Here we slept for just under a second then took a timer interrupt that did
nothing. 21.685 us later we wake up again and do the work.
We set a rather low shift value of 16 for the decrementer clockevent, which I
think is causing this issue. On this box we have a 207MHz decrementer and see:
clockevent: decrementer mult[3501] shift[16] cpu[0]
For calculations of large intervals this mult/shift combination could be
off by a significant amount. I notice the sparc code has a loop that iterates
to find a mult/shift combination that maximises the shift value while
keeping mult under 32bit. With the patch below we get:
clockevent: decrementer mult[35015c20] shift[32] cpu[15]
And we no longer see the spurious wakeups.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The pcnet32 driver has had in its device id table for some time a match
against the "broken" vendor ID. No need to fake out the vendor ID
with an explicit fixup.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Removed building setup-irq on ppc32, we don't use it anymore
* Remove duplicate prototype for setup_grackle() code that needs it
gets it from <asm/grackle.h>
* Removed gratuitous pci_io_size type differences between ppc32/ppc64
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There doesn't appear to be any specific reason that we need to setup the
pseries specific notifier in generic arch pci code. Move it into pseries
land.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We shouldn't directly access sysdata to get the device node to just
go get the pci_controller. We can call pci_bus_to_host() for this
purpose.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We shouldn't directly access sysdata to get the device node but call
pci_bus_to_OF_node() for this purpose.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We shouldn't directly access sysdata to get the device node but call
pci_bus_to_OF_node() for this purpose.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We shouldn't directly access sysdata to get the pci_controller. Instead
use pci_bus_to_host() for this purpose. In the future we might have
sysdata be a device_node to match ppc64 and unify the code between ppc32
& ppc64.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We shouldn't directly access sysdata to get the pci_controller. Instead
use pci_bus_to_host() for this purpose. In the future we might have
sysdata be a device_node to match ppc64 and unify the code between ppc32
& ppc64.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We shouldn't directly access sysdata to get the pci_controller. Instead
use pci_bus_to_host() for this purpose. In the future we might have
sysdata be a device_node to match ppc64 and unify the code between ppc32
& ppc64.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We shouldn't directly access sysdata to get the pci_controller. Instead
use pci_bus_to_host() for this purpose. In the future we might have
sysdata be a device_node to match ppc64 and unify the code between ppc32
& ppc64.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We shouldn't directly access sysdata to get the pci_controller. Instead
use pci_bus_to_host() for this purpose. In the future we might have
sysdata be a device_node to match ppc64 and unify the code between ppc32
& ppc64.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We shouldn't directly access sysdata to get the pci_controller. Instead
use pci_bus_to_host() for this purpose. In the future we might have
sysdata be a device_node to match ppc64 and unify the code between ppc32
& ppc64.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This adds the PowerPC 2.06 tlbie mnemonics and keeps backwards
compatibilty for CPUs before 2.06.
Only useful for bare metal systems.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc: Enable MMU feature sections for inline asm
This adds the ability to do MMU feature sections for inline asm.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cleans up the VSX load/store instructions by moving them into
ppc-opcode.h.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Make macros more braces happy.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We believe if a toolchain supports PT_GNU_STACK that it sets the proper
PHDR permissions. Therefor elf_read_implies_exec() should only be true
if we don't see PT_GNU_STACK set.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
So select GENERIC_HARDIRQS_NO__DO_IRQ to disable it.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Don't call __do_IRQ() directly in gatwick_action(), use
generic_handle_irq().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We should no longer have any irq code that needs __do_IRQ(), so
remove the fallback to __do_IRQ().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The guts of do_IRQ() isn't really the right place to be documenting
the ppc_md.get_irq() interface. So move the comment into machdep.h
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Makes do_IRQ() shorter and clearer.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rather than a giant ifdef in the body of do_IRQ(), including a
dangling else, move the irq stack logic into a separate routine and
do the ifdef there.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
ps3 has 4 ipis per cpu and can use the new smp_request_message_ipi to
reduce path length when receiving an ipi.
This has the side effect of setting IRQF_PERCPU.
Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Adds support for the "unused" page hint which can be used in shared
memory partitions to flag pages not in use, which will then be stolen
before active pages by the hypervisor when memory needs to be moved to
LPARs in need of additional memory. Failure to mark pages as 'unused'
makes the LPAR slower to give up unused memory to other partitions.
This adds the kernel parameter 'cmo_free_hint' to disable this
functionality.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
mpic_find() was overloaded to do two things, finding the mpic instance
for a given interrupt and returning if it's an IPI. Instead we introduce
mpic_is_ipi() and simplify mpic_find() to just return the mpic instance
Also silences the warning:
arch/powerpc/sysdev/mpic.c: In function 'mpic_irq_set_priority':
arch/powerpc/sysdev/mpic.c:1382: warning: 'is_ipi' may be used uninitialized in this function
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Now that leds-gpio is a proper OF platform driver, the Warp can use
the leds-gpio driver rather than the old out-of-kernel driver.
One side-effect is the leds-gpio driver always turns the leds off
while the old driver left them alone. So we have to set them back to
the correct settings.
Signed-off-by: Sean MacLennan <smaclennan@pikatech.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Previouslly we just always set the inbound window to 2G. This was
broken for systems with >2G. If a system has >=4G we will need
SWIOTLB support to handle that case.
We now allocate PCICSRBAR/PEXCSRBAR right below the lowest PCI outbound
address for MMIO or the 4G boundary (if the lowest PCI address is above
4G).
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The P2020 is a dual e500v2 core based SOC with:
* 3 PCIe controllers
* 2 General purpose DMA controllers
* 2 sRIO controllers
* 3 eTSECS
* USB 2.0
* SDHC
* SPI, I2C, DUART
* enhanced localbus
* and optional Security (P2020E) security w/XOR acceleration
The p2020 DS reference board is pretty similar to the existing MPC85xx
DS boards and has a ULI 1575 connected on one of the PCIe controllers.
Signed-off-by: Ted Peters <Ted.Peters@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
We we build with resource_size_t as a 64-bit quantity we get:
arch/powerpc/sysdev/fsl_rio.c: In function 'fsl_rio_setup':
arch/powerpc/sysdev/fsl_rio.c:1029: warning: format '%08x' expects type 'unsigned int', but argument 4 has type 'resource_size_t'
arch/powerpc/sysdev/fsl_rio.c:1029: warning: format '%08x' expects type 'unsigned int', but argument 5 has type 'resource_size_t'
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
In these configuration we expect to have large amounts of memory (> 4G)
and thus will bounce via swiotlb some region of PCI address space.
The outbound windows were wasting 512M of address space by leaving a
gap between the top of the outbound window and the 4G boundary. By
moving the top of the outbound window up to the 4G boundary we can
reclaim the vast majority of the 512M (minus space needed for PEXCSRBAR)
and thus reduces the amount of memory we have to bounce.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Its feasible based on how the PCI address map is setup that the region
of PCI address space used for MSIs differs for each PHB on the same SoC.
Instead of assuming that the address mappes to CCSRBAR 1:1 we read
PEXCSRBAR (BAR0) for the PHB that the given pci_dev is on.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
For serial flash support we need to:
- Add QE Par IO Bank E device tree node, a GPIO from this bank is
used for SPI chip-select line;
- Add serial-flash node;
- Add proper module alias into of/base.c.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Select HAS_RAPIDIO symbol and add rio nodes for MPC8568E-MDS
and MPC8569E-MDS boards.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Note that eSDHC and DUART0 are mutually exclusive on MPC8569E-MDS
boards. Default option is DUART0, so eSDHC is disabled by default.
U-Boot will fixup device tree if eSDHC should be used instead of
DUART0.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This patch fixes bogus reg = <> property in the localbus node,
and fixes interrupt property (should be "interrupts").
Also add node for NAND support.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
fsl,exec-units-mask should be 0xbfe to include SNOW unit in
MPC8569E's security engine.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This patch adds PCI IDs for MPC8569 and MPC8569E processors,
plus adds appropriate quirks for these IDs, and thus makes
PCI-E actually work on MPC8569E-MDS boards.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Between the addition of the ecm/mcm law nodes and the fact that the
get_immrbase() has been using the range property of the SoC to determine
the base address of CCSR space we no longer need the reg property at
the soc node level. It has been ill specified and varied between device
trees to cover either the {e,m}cm-law node, some odd subset of CCSR
space or all of CCSR space.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Add fsl,qe-num-riscs and fsl,qe-num-snums to all the devices trees which
have qe node.
Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The MPC8569 is similiar to the MPC8568. It doubles the number of
QUICC Engine RISC cores from 2 to 4. Removes eTSECs, TLU and adds
the eSDHC controller.
Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The latest QE chip may have more Serial Number(SNUM)s of thread to use. We
will get the number of SNUMs from device tree by reading the new property
"fsl,qe-num-snums", and set 28 as the default number of SNUMs so that it is
compatible with the old QE chips' device trees which don't have this new
property. The macro QE_NUM_OF_SNUM is defined as the maximum number in QE
snum table which is 256.
Also we update the snum_init[] array with 18 more new SNUMs which are
confirmed to be useful on new chip.
Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com>
Acked-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Change the RISC allocation to macros instead of enum, add function to read
the number of risc engines from the new property "fsl,qe-num-riscs" under
the qe node in dts. Add new property "fsl,qe-num-riscs" description in
qe.txt
Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com>
Acked-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Keep an unique machine def for the MPC8568 MDS board to handle some
subtle differences between the future MDS board. Also set the bcsrs in
setup_arch() only for mpc8568_mds because other mds has different bcsr
settings.
Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Removed the need for asm/mpc86xx.h as it was only used in mpc86xx_smp.c
and just moved the defines it cared about into there. Also fixed up
the ioremap to only map the one 4k page we need access to and to iounmap
when we are done.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Also, convert them to resource_size_t (which is unsigned long
on 64-bit, so it's not a change there).
We will be using these on fsl 32b to indicate the start and size
address of memory that the pci controller can actually reach - this
is needed to determine if an address requires bounce buffering. For
now, initialize them to a standard value; in the near future, the
value will be calculated based on how the inbound windows are
programmed.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Acked-by: Ben Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The rstcr register mapping code was written sometime ago before
of_iomap() existed. We can use it and clean up the code a bit
and get rid of one user of get_immrbase() in the process.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The new dts places most of the devices in physical address space
above 32-bits, which allows us to have more than 4GB of RAM present.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Refactor the check to determine if the quirk is applicable to the boards
into one inline function so we only have to change one place to add more
boards that the quirks might be applicable to.
Also removed a warning related to unused temp variable.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The cell-index property isn't used on PCI nodes and is ill defined.
Remove it for now and if someone comes up with a good reason and
consistent definition for it we can add it back
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
It's still in the git history if anyone wants it.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Commit 9e35ad38 ("perf_counter: Rework the perf counter
disable/enable") added code to the powerpc hw_perf_enable (renamed
from hw_perf_restore) to test cpuhw->disabled and return immediately
if it is not set (i.e. if the PMU is already enabled).
Unfortunately the test got added before cpuhw was initialized,
resulting in an oops the first time hw_perf_enable got called.
This fixes it by moving the initialization of cpuhw to before
cpuhw->disabled is tested.
[ Impact: fix oops-causing bug on powerpc ]
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <18960.56772.869734.304631@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
I don't think anything guarantees that the objects in data.page_aligned
are a multiple of PAGE_SIZE, thus the section may end on any boundary.
So the following section, .data.cacheline_aligned needs an explicit
alignment.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Refresh and set these options:
CONFIG_SYSFS_DEPRECATED_V2: y -> n
CONFIG_INPUT_JOYSTICK: y -> n
CONFIG_HID_SONY: n -> m
CONFIG_RTC_DRV_PS3: - -> m
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
After upgrading my distcc boxes from gcc 4.2.2 to 4.4.0, the function
graph tracer broke. This was discovered on my x86 boxes.
The issue is that gcc used the same register for an output as it did for
an input in an asm statement. I first thought this was a bug in gcc and
reported it. I was notified that gcc was correct and that the output had
to be flagged as an "early clobber".
I noticed that powerpc had the same issue and this patch fixes it.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
pr_debug() can now result in code being generated even when #DEBUG
is not defined. That's not really desirable in the ftrace code
which we want to be snappy.
With CONFIG_DYNAMIC_DEBUG=y:
size before:
text data bss dec hex filename
3334 672 4 4010 faa arch/powerpc/kernel/ftrace.o
size after:
text data bss dec hex filename
2616 360 4 2980 ba4 arch/powerpc/kernel/ftrace.o
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
Revert "mm: add /proc controls for pdflush threads"
viocd: needs to depend on BLOCK
block: fix the bio_vec array index out-of-bounds test
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Fix PCI ROM access
powerpc/pseries: Really fix the oprofile CPU type on pseries
serial/nwpserial: Fix wrong register read address and add interrupt acknowledge.
powerpc/cell: Make ptcal more reliable
powerpc: Allow mem=x cmdline to work with 4G+
powerpc/mpic: Fix incorrect allocation of interrupt rev-map
powerpc: Fix oprofile sampling of marked events on POWER7
powerpc/iseries: Fix pci breakage due to bad dma_data initialization
powerpc: Fix mktree build error on Mac OS X host
powerpc/virtex: Fix duplicate level irq events.
powerpc/virtex: Add uImage to the default images list
powerpc/boot: add simpleImage.* to clean-files list
powerpc/8xx: Update defconfigs
powerpc/embedded6xx: Update defconfigs
powerpc/86xx: Update defconfigs
powerpc/85xx: Update defconfigs
powerpc/83xx: Update defconfigs
powerpc/fsl_soc: Remove mpc83xx_wdt_init, again
This uses values from the MMCRA, SIAR and SDAR registers on
powerpc to supply more precise information for overflow events,
including a data address when PERF_RECORD_ADDR is specified.
Since POWER6 uses different bit positions in MMCRA from earlier
processors, this converts the struct power_pmu limited_pmc5_6
field, which only had 0/1 values, into a flags field and
defines bit values for its previous use (PPMU_LIMITED_PMC5_6)
and a new flag (PPMU_ALT_SIPR) to indicate that the processor
uses the POWER6 bit positions rather than the earlier
positions. It also adds definitions in reg.h for the new and
old positions of the bit that indicates that the SIAR and SDAR
values come from the same instruction.
For the data address, the SDAR value is supplied if we are not
doing instruction sampling. In that case there is no guarantee
that the address given in the PERF_RECORD_ADDR subrecord will
correspond to the instruction whose address is given in the
PERF_RECORD_IP subrecord.
If instruction sampling is enabled (e.g. because this counter
is counting a marked instruction event), then we only supply
the SDAR value for the PERF_RECORD_ADDR subrecord if it
corresponds to the instruction whose address is in the
PERF_RECORD_IP subrecord. Otherwise we supply 0.
[ Impact: support more PMU hardware features on PowerPC ]
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <18955.37028.48861.555309@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Although the perf_counter API allows 63-bit raw event codes,
internally in the powerpc back-end we had been using 32-bit
event codes. This expands them to 64 bits so that we can add
bits for specifying threshold start/stop events and instruction
sampling modes later.
This also corrects the return value of can_go_on_limited_pmc;
we were returning an event code rather than just a 0/1 value in
some circumstances. That didn't particularly matter while event
codes were 32-bit, but now that event codes are 64-bit it
might, so this fixes it.
[ Impact: extend PowerPC perfcounter interfaces from u32 to u64 ]
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <18955.36874.472452.353104@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of specifying the irq_period for a counter, provide a target interrupt
frequency and dynamically adapt the irq_period to match this frequency.
[ Impact: new perf-counter attribute/feature ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20090515132018.646195868@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The current disable/enable mechanism is:
token = hw_perf_save_disable();
...
/* do bits */
...
hw_perf_restore(token);
This works well, provided that the use nests properly. Except we don't.
x86 NMI/INT throttling has non-nested use of this, breaking things. Therefore
provide a reference counter disable/enable interface, where the first disable
disables the hardware, and the last enable enables the hardware again.
[ Impact: refactor, simplify the PMU disable/enable logic ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
A couple of issues crept in since about 2.6.27 related to accessing PCI
device ROMs on various powerpc machines.
First, historically, we don't allocate the ROM resource in the resource
tree. I'm not entirely certain of why, I susepct they often contained
garbage on x86 but it's hard to tell. This causes the current generic
code to always call pci_assign_resource() when trying to access the said
ROM from sysfs, which will try to re-assign some new address regardless
of what the ROM BAR was already set to at boot time. This can be a
problem on hypervisor platforms like pSeries where we aren't supposed
to move PCI devices around (and in fact probably can't).
Second, our code that generates the PCI tree from the OF device-tree
(instead of doing config space probing) which we mostly use on pseries
at the moment, didn't set the (new) flag IORESOURCE_SIZEALIGN on any
resource. That means that any attempt at re-assigning such a resource
with pci_assign_resource() would fail due to resource_alignment()
returning 0.
This fixes this by doing these two things:
- The code that calculates resource flags based on the OF device-node
is improved to set IORESOURCE_SIZEALIGN on any valid BAR, and while at
it also set IORESOURCE_READONLY for ROMs since we were lacking that too
- We now allocate ROM resources as part of the resource tree. However
to limit the chances of nasty conflicts due to busted firmwares, we
only do it on the second pass of our two-passes allocation scheme,
so that all valid and enabled BARs get precedence.
This brings pSeries back the ability to access PCI ROMs via sysfs (and
thus initialize various video cards from X etc...).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
My previous pach for fixing the oprofile CPU type got somewhat mismerged
(by my fault) when it collided with another related patch. This should
finally (fingers crossed) fix the whole thing.
We make sure we keep the -old- oprofile type and CPU type whenever
one of them was specified in the first pass through the function.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There have been a series of checkstops on QS21 related to
ptcal being set up incorrectly. On systems that only
have memory on a single node, ptcal fails when it gets
a pointer to memory on the remote node.
Moreover, agressive prefetching in memcpy and other
functions may accidentally touch the first cache line
of the page that we reserve for ptcal, which causes
an ECC checkstop.
We now allocate pages only from the specified node, moves the
ptcal area into the middle of the allocated page to avoid
potential prefetch problems and prints the address of the
ptcal area to facilitate diagnostics.
Signed-off-by: Gerhard Stenzel <gerhard.stenzel@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We're currently choking on mem=4g (and above) due to memory_limit
being specified as an unsigned long. Make memory_limit
phys_addr_t to fix this.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Before when we were setting up the irq host map for mpic we passed in
just isu_size for the size of the linear map. However, for a number of
mpic implementations we have no isu (thus pass in 0) and will end up
with a no linear map (size = 0). This causes us to always call
irq_find_mapping() from mpic_get_irq().
By moving the allocation of the host map to after we've determined the
number of sources we can actually benefit from having a linear map for
the non-isu users that covers all the interrupt sources.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Description
-----------
Change ppc64 oprofile kernel driver to use the SLOT bits (MMCRA[37:39]only on
older processors where those bits are defined.
Background
----------
The performance monitor unit of the 64-bit POWER processor family has the
ability to collect accurate instruction-level samples when profiling on marked
events (i.e., "PM_MRK_<event-name>"). In processors prior to POWER6, the MMCRA
register contained "slot information" that the oprofile kernel driver used to
adjust the value latched in the SIAR at the time of a PMU interrupt. But as of
POWER6, these slot bits in MMCRA are no longer necessary for oprofile to use,
since the SIAR itself holds the accurate sampled instruction address. With
POWER6, these MMCRA slot bits were zero'ed out by hardware so oprofile's use of
these slot bits was, in effect, a NOP. But with POWER7, these bits are no
longer zero'ed out; however, they serve some other purpose rather than slot
information. Thus, using these bits on POWER7 to adjust the SIAR value results
in samples being attributed to the wrong instructions. The attached patch
changes the oprofile kernel driver to ignore these slot bits on all newer
processors starting with POWER6.
Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
Signed-off-by: Michael Wolf <mjw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit 4fc665b88a "powerpc: Merge 32 and
64-bit dma code" made changes to the PCI initialisation code that added
an assignment to archdata.dma_data but only for 32 bit code. Commit
7eef440a54 "powerpc/pci: Cosmetic cleanups
of pci-common.c" removed the conditional compilation. Unfortunately,
the iSeries code setup the archdata.dma_data before that assignment was
done - effectively overwriting the dma_data with NULL.
Fix this up by moving the iSeries setup of dma_data into a
pci_dma_dev_setup callback.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The mktree utility defines some variables as "uint", although this is not a
standard C type, and so cross-compiling on Mac OS X fails. Change this to
"unsigned int".
Signed-off-by: Timur Tabi <timur@freescale.com>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The interrupt controller was not handling level interrupts correctly
such that duplicate interrupts were happening. This fixes the problem
and adds edge type interrupts which are needed in Xilinx hardware.
Signed-off-by: John Linn <john.linn@xilinx.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
It is common to use U-Boot on Xilinx Virtex platforms. This patch
ensures that CONFIG_DEFAULT_UIMAGE is selected for virtex
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Fix this build error when CONFIG_BLOCK is not set:
drivers/cdrom/viocd.c: In function 'viocd_blk_open':
drivers/cdrom/viocd.c:156: error: dereferencing pointer to incomplete type
drivers/cdrom/viocd.c: In function 'viocd_blk_release':
drivers/cdrom/viocd.c:162: error: dereferencing pointer to incomplete type
drivers/cdrom/viocd.c: In function 'viocd_blk_ioctl':
drivers/cdrom/viocd.c:170: error: dereferencing pointer to incomplete type
drivers/cdrom/viocd.c: In function 'viocd_blk_media_changed':
drivers/cdrom/viocd.c:176: error: dereferencing pointer to incomplete type
...
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Merge reason: both topics modify the APIC code but were able to do it in
parallel so far. An upcoming patch generates a conflict so
merge them to avoid the conflict.
Signed-off-by: Ingo Molnar <mingo@elte.hu>