The GART currently implements the iommu=[no]fullflush command line
parameters which influence its IO/TLB flushing strategy. This patch
makes these parameters generic so that they can be used by the AMD IOMMU
too.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch moves the invocation of the flushing functions to the
map/unmap helpers because its common code in all dma_ops relevant
mapping/unmapping code.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently AMD IOMMU code triggers a BUG_ON if NULL is passed as the
device. This is inconsistent with other IOMMU implementations.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
gart alloc_coherent need to do virtual mapppings only when an
allocated buffer is not DMA-capable for a device.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86's common alloc_coherent (dma_alloc_coherent in dma-mapping.h) sets
up the gfp flag according to the device dma_mask but Calgary doesn't
need it because of virtual mappings. This patch avoids unnecessary low
zone allocation.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently, GART IOMMU ingores device's dma_mask when it does virtual
mappings. So it could give a device a virtual address that the device
can't access to.
This patch fixes the above problem.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Non real IOMMU implemenations (which doesn't do virtual mappings,
e.g. swiotlb, pci-nommu, etc) need to use proper gfp flags and
dma_mask to allocate pages in their own dma_alloc_coherent()
(allocated page need to be suitable for device's coherent_dma_mask).
This patch makes dma_alloc_coherent do this job so that IOMMUs don't
need to take care of it any more.
Real IOMMU implemenataions can simply ignore the gfp flags.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We need to use __GFP_DMA for NULL device argument (fallback_dev) with
pci-nommu. It's a hack for ISA (and some old code) so we need to use
GFP_DMA.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The check to see if dev->dma_mask is NULL in pci-nommu is more
appropriate for dma_alloc_coherent().
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: cpu_init(): fix memory leak when using CPU hotplug
x86: pda_init(): fix memory leak when using CPU hotplug
x86, xen: Use native_pte_flags instead of native_pte_val for .pte_flags
x86: move mtrr cpu cap setting early in early_init_xxxx
x86: delay early cpu initialization until cpuid is done
x86: use X86_FEATURE_NOPL in alternatives
x86: add NOPL as a synthetic CPU feature bit
x86: boot: stub out unimplemented CPU feature words
Exception stacks are allocated each time a CPU is set online.
But the allocated space is never freed. Thus with one CPU hotplug
offline/online cycle there is a memory leak of 24K (6 pages) for
a CPU.
Fix is to allocate exception stacks only once -- when the CPU is
set online for the first time.
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
pda->irqstackptr is allocated whenever a CPU is set online.
But it is never freed. This results in a memory leak of 16K
for each CPU offline/online cycle.
Fix is to allocate pda->irqstackptr only once.
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Krzysztof Helt found MTRR is not detected on k6-2
root cause:
we moved mtrr_bp_init() early for mtrr trimming,
and in early_detect we only read the CPU capability from cpuid,
so some cpu doesn't have that bit in cpuid.
So we need to add early_init_xxxx to preset those bit before mtrr_bp_init
for those earlier cpus.
this patch is for v2.6.27
Reported-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move early cpu initialization after cpu early get cap so the
early cpu initialization can fix up cpu caps.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
After fixing the u32 thinko I sill had occasional hickups on ATI chipsets
with small deltas. There seems to be a delay between writing the compare
register and the transffer to the internal register which triggers the
interrupt. Reading back the value makes sure, that it hit the internal
match register befor we compare against the counter value.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We use the HPET only in 32bit mode because:
1) some HPETs are 32bit only
2) on i386 there is no way to read/write the HPET atomic 64bit wide
The HPET code unification done by the "moron of the year" did
not take into account that unsigned long is different on 32 and
64 bit.
This thinko results in a possible endless loop in the clockevents
code, when the return comparison fails due to the 64bit/332bit
unawareness.
unsigned long cnt = (u32) hpet_read() + delta can wrap over 32bit.
but the final compare will fail and return -ETIME causing endless
loops.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use X86_FEATURE_NOPL to determine if it is safe to use P6 NOPs in
alternatives. Also, replace table and loop with simple if statement.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The long noops ("NOPL") are supposed to be detected by family >= 6.
Unfortunately, several non-Intel x86 implementations, both hardware
and software, don't obey this dictum. Instead, probe for NOPL
directly by executing a NOPL instruction and see if we get #UD.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Some BIOSes (the Intel DG33BU, for example) wrongly claim to have DMAR
when they don't. Avoid the resulting crashes when it doesn't work as
expected.
I'd still be grateful if someone could test it on a DG33BU with the old
BIOS though, since I've killed mine. I tested the DMI version, but not
this one.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the detection of the northbridges in the AMD family 0x11
processors. It also fixes the magic numbers there while changing this code.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The minimum reprogramming delta was hardcoded in HPET ticks,
which is stupid as it does not work with faster running HPETs.
The C1E idle patches made this prominent on AMD/RS690 chipsets,
where the HPET runs with 25MHz. Set it to 5us which seems to be
a reasonable value and fixes the problems on the bug reporters
machines. We have a further sanity check now in the clock events,
which increases the delta when it is not sufficient.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Tested-by: Dmitry Nezhevenko <dion@inhex.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When calibration against PIT fails, the warning that we print is misleading.
In a virtualized environment the VM may get descheduled while calibration
or, the check in PIT calibration may fail due to other virtualization
overheads.
The warning message explicitly assumes that calibration failed due to SMI's
which may not be the case. Change that to something proper.
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Manually adding "io_delay=0xed" fixes system lockups in ioapic
mode on this machine.
System Information
Manufacturer: Hewlett-Packard
Product Name: Presario F700 (KA695EA#ABF)
Base Board Information
Manufacturer: Quanta
Product Name: 30D3
Reference:
https://bugzilla.redhat.com/show_bug.cgi?id=459546
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The TSC calibration function is still very complicated, but this makes
it at least a little bit less so by moving the PIT part out into a
helper function of its own.
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org>
Larry Finger reported at http://lkml.org/lkml/2008/9/1/90:
An ancient laptop of mine started throwing errors from b43legacy when
I started using 2.6.27 on it. This has been bisected to commit bfc0f59
"x86: merge tsc calibration".
The unification of the TSC code adopted mostly the 64bit code, which
prefers PMTIMER/HPET over the PIT calibration.
Larrys system has an AMD K6 CPU. Such systems are known to have
PMTIMER incarnations which run at double speed. This results in a
miscalibration of the TSC by factor 0.5. So the resulting calibrated
CPU/TSC speed is half of the real CPU speed, which means that the TSC
based delay loop will run half the time it should run. That might
explain why the b43legacy driver went berserk.
On the other hand we know about systems, where the PIT based
calibration results in random crap due to heavy SMI/SMM
disturbance. On those systems the PMTIMER/HPET based calibration logic
with SMI detection shows better results.
According to Alok also virtualized systems suffer from the PIT
calibration method.
The solution is to use a more wreckage aware aproach than the current
either/or decision.
1) reimplement the retry loop which was dropped from the 32bit code
during the merge. It repeats the calibration and selects the lowest
frequency value as this is probably the closest estimate to the real
frequency
2) Monitor the delta of the TSC values in the delay loop which waits
for the PIT counter to reach zero. If the maximum value is
significantly different from the minimum, then we have a pretty safe
indicator that the loop was disturbed by an SMI.
3) keep the pmtimer/hpet reference as a backup solution for systems
where the SMI disturbance is a permanent point of failure for PIT
based calibration
4) do the loop iteration for both methods, record the lowest value and
decide after all iterations finished.
5) Set a clear preference to PIT based calibration when the result
makes sense.
The implementation does the reference calibration based on
HPET/PMTIMER around the delay, which is necessary for the PIT anyway,
but keeps separate TSC values to ensure the "independency" of the
resulting calibration values.
Tested on various 32bit/64bit machines including Geode 266Mhz, AMD K6
(affected machine with a double speed pmtimer which I grabbed out of
the dump), Pentium class machines and AMD/Intel 64 bit boxen.
Bisected-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Return the correct return value when the CPUID driver partially
completes a request (we should return the number of bytes actually
read or written, instead of the error code.)
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Return the correct return value when the MSR driver partially
completes a request (we should return the number of bytes actually
read or written, instead of the error code.)
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Propagate error (-ENXIO) from smp_call_function_single() in the CPUID
driver. This can happen when a CPU is unplugged while the CPUID
driver is open.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Propagate error (-ENXIO) from smp_call_function_single(). These
errors can happen when a CPU is unplugged while the MSR driver is
open.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
I noticed that my sched_clock() was slow on a number of machine, so I
started looking at cpufreq.
The below seems to fix the problem for me.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: crash on non-TSC-equipped CPUs
Don't enable the TSC notifier if we *either*:
1. don't have a CPU, or
2. have a CPU with constant TSC.
In either of those cases, the notifier is either damaging (1) or useless(2).
From: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
During CPU hot-remove the sysfs directory created by
threshold_create_bank(), defined in
arch/x86/kernel/cpu/mcheck/mce_amd_64.c, has to be removed before
its parent directory, created by mce_create_device(), defined in
arch/x86/kernel/cpu/mcheck/mce_64.c . Moreover, when the CPU in
question is hotplugged again, obviously the latter has to be created
before the former. At present, the right ordering is not enforced,
because all of these operations are carried out by CPU hotplug
notifiers which are not appropriately ordered with respect to each
other. This leads to serious problems on systems with two or more
multicore AMD CPUs, among other things during suspend and hibernation.
Fix the problem by placing threshold bank CPU hotplug callbacks in
mce_cpu_callback(), so that they are invoked at the right places,
if defined. Additionally, use kobject_del() to remove the sysfs
directory associated with the kobject created by
kobject_create_and_add() in threshold_create_bank(), to prevent the
kernel from crashing during CPU hotplug operations on systems with
two or more multicore AMD CPUs.
This patch fixes bug #11337.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: work around MTRR mask setting, v2
x86: fix section mismatch warning - uv_cpu_init
x86: fix VMI for early params
x86: fix two modpost warnings in mm/init_64.c
x86: fix 1:1 mapping init on 64-bit (memory hotplug case)
x86: work around MTRR mask setting
x86: PAT Update validate_pat_support for intel CPUs
devmem, x86: PAT Change /dev/mem mmap with O_SYNC to use UC_MINUS
x86: PAT proper tracking of set_memory_uc and friends
x86: fix BUG: unable to handle kernel paging request (numaq_tsc_disable)
x86: export pv_lock_ops non-GPL
x86, mmiotrace: silence section mismatch warning - leave_uniprocessor
x86: use WARN() in arch/x86/kernel
x86: use WARN() in arch/x86/mm/ioremap.c
werror: fix pci calgary
x86: fix oprofile + hibernation badness
x86, SGI UV: hardcode the TLB flush interrupt system vector
x86: fix Xorg startup/shutdown slowdown with PAT
x86: fix "kernel won't boot on a Cyrix MediaGXm (Geode)"
x86 iommu: remove unneeded parenthesis
improve the debug printout:
- make it actually display something
- print it only once
would be nice to have a WARN_ONCE() facility, to feed such things to
kerneloops.org.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
WARNING: vmlinux.o(.cpuinit.text+0x3cc4): Section mismatch in reference from the function uv_cpu_init() to the function .init.text:uv_system_init()
The function __cpuinit uv_cpu_init() references
a function __init uv_system_init().
If uv_system_init is only used by uv_cpu_init then
annotate uv_system_init with a matching annotation.
uv_system_init was ment to be called only once, so do it from codepath
(native_smp_prepare_cpus) which is called once, right before activation
of other cpus (smp_init).
Note: old code relied on uv_node_to_blade being initialized to 0,
but it'a not initialized from anywhere.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
alloc_coherent dma_ops callback was added to GART, however, it doesn't
return a size aligned address wrt dma_alloc_coherent, as
DMA-mapping.txt defines. This patch fixes it.
This patch also removes unused gart_map_simple
(dma_mapping_ops->map_simple has gone).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
pci-dma.c doesn't use map_simple hook any more so we can remove it
from struct dma_mapping_ops now.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All the x86 DMA-API functions are defined in asm/dma-mapping.h. This patch
moves the dma_*_coherent functions also to this header file because they are
now small enough to do so.
This is done as a separate patch because it also includes some renaming and
restructuring of the dma-mapping.h file.
Signed-off-by: Joerg Roedel <joerg.roede@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All dma_ops implementations support the alloc_coherent and free_coherent
callbacks now. This allows a big simplification of the dma_alloc_coherent
function which is done with this patch. The dma_free_coherent functions is also
cleaned up and calls now the free_coherent callback of the dma_ops
implementation.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>