This implementation in x86_64 is clean and consistent, but we
sacrifice it for the sake of being equal to i386 (since the other
way around would be harder).
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Although those constants are always defined in x86_64,
and will have the effect of just including the headers
in the very way we did before, I'm doing this in a separate
patch to be conservative and avoid surprises.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The code is now the same between i386 and x86_64. We already
know what happens when it reaches this point: They go away
from the arch-specific headers, and suddenly appears in the common
header.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We provide a bogus macro for x86_64 in case CONFIG_X86_LOCAL_APIC
is not set. It will always be set for x86_64, so the effect
is just to make the code equal to i386.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
APIC_DEFINITION is not defined in x86_64, so in practice, we keep
our old code here. But as a nice side effect, the code is now
equal to smp_32.h.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The new cacheflush.h API's didn't have any comments describing
how they're to be used yet and the conventions around these functions.
This patch adds comments to this effect; in order for that to be
a logical series, some prototypes had to move around.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Using a naked parameterless macro could lead to other tokens being
unexpectedly replaced.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When compilers became generally better at optimizing code than humans, the
register keyword became mostly useless. For the floppy driver it certainly
is since it's so slow compared to the rest of the system that optimizing
access to a single variable or two isn't going to make any real difference
So let's just leave it to the compiler - it'll do a better job anyway.
This patch does away with a few register keywords in the x86 floppy driver.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On AMD SMM protected memory is part of the address map, but handled
internally like an MTRR. That leads to large pages getting split
internally which has some performance implications. Check for the
AMD TSEG MSR and split the large page mapping on that area
explicitely if it is part of the direct mapping.
There is also SMM ASEG, but it is in the first 1MB and already covered by
the earlier split first page patch.
Idea for this came from an earlier patch by Andreas Herrmann
On a RevF dual Socket Opteron system kernbench shows a clear
improvement from this:
(together with the earlier patches in this series, especially the
split first 2MB patch)
[lower is better]
no split stddev split stddev delta
Elapsed Time 87.146 (0.727516) 84.296 (1.09098) -3.2%
User Time 274.537 (4.05226) 273.692 (3.34344) -0.3%
System Time 34.907 (0.42492) 34.508 (0.26832) -1.1%
Percent CPU 322.5 (38.3007) 326.5 (44.5128) +1.2%
=> About 3.2% improvement in elapsed time for kernbench.
With GB pages on AMD Fam1h the impact of splitting is much higher of course,
since it would split two full GB pages (together with the first
1MB split patch) instead of two 2MB pages. I could not benchmark
a clear difference in kernbench on gbpages, so I kept it disabled
for that case
That was only limited benchmarking of course, so if someone
was interested in running more tests for the gbpages case
that could be revisited (contributions welcome)
I didn't bother implementing this for 32bit because it is very
unlikely the 32bit lowmem mapping overlaps into the TSEG near 4GB
and the 2MB low split is already handled for both.
[ mingo@elte.hu: do it on gbpages kernels too, there's no clear reason
why it shouldnt help there. ]
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
RDMSR for 64bit values with exception handling.
Makes it easier to deal with 64bit valued MSRs. The old 64bit code
base had that too as checking_rdmsrl(), but it got dropped somehow.
Signed-off-by: Andi Kleen <andi@firstfloor.org>
Cc: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a new function to force split large pages into 4k pages.
This is needed for some followup optimizations.
I had to add a new field to cpa_data to pass down the information
that try_preserve_large_page should not run.
Right now no set_page_4k() because I didn't need it and all the
specialized users I have in mind would be more comfortable with
pure addresses. I also didn't export it because it's unlikely
external code needs it.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When end_pfn is not aligned to 2MB (or 1GB) then the kernel might
map more memory than end_pfn. Account this in max_pfn_mapped.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Even on 32bit 2MB pages can map more memory than is in the true
max_low_pfn if end_pfn is not highmem and not aligned to 2MB.
Add a end_pfn_map similar to x86-64 that accounts for this
fact. This is important for code that really needs to know about
all mapping aliases.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: andreas.herrmann3@amd.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All of early setup runs with interrupts disabled, so there is no
need to set up early exception handlers for vectors >= 32
This saves some minor text size.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some of pde bits weren't documented, add the short description to them.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
revert:
"x86: fix breakage of vSMP irq operations"
the irqflags.h unification will solve this in a cleaner way.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
> yhlu@mpk:~/xx/xx/kernel/x86/linux-2.6> git-bisect bad
> d1c707188ad646c8094cac9afb1738e7d0196ff2 is first bad commit
> commit d1c707188ad646c8094cac9afb1738e7d0196ff2
> Author: Glauber de Oliveira Costa <gcosta@redhat.com>
> Date: Wed Mar 19 14:25:53 2008 -0300
>
> x86: include mach_apic.h in smpboot_64.c and smpboot.c
>
> After the inclusion, a lot of files needs fixing for conflicts,
> some of them in the headers themselves, to accomodate for both
> i386 and x86_64 versions.
>
> [ mingo@elte.hu: build fix ]
>
> Signed-off-by: Glauber Costa <gcosta@redhat.com>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
>
> :040000 040000 19f574e64bb8003bbe984f3a8c1315db969dfdcd
> 6ffe96588c77bc936705599fa110107856201115 M arch
> :040000 040000 61269347ad4f384ed85cc87c4f2d004ed94492ac
> 8f5c713da25579a3cdf63db3d4c2f795261d0521 M include
> yhlu@mpk:~/xx/xx/kernel/x86/linux-2.6>
>
attached patch fixes that.
Increase the maximum physical address size of x86_64 system
to 44-bits. This is in preparation for future chips that
support larger physical memory sizes.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This (simplified) piece of code didn't behave as expected due to
incorrect constraints in some of the bitops functions, when
X86_FEATURE_xxx is referring to other than the first long:
int test(struct cpuinfo_x86 *c) {
if (cpu_has(c, X86_FEATURE_xxx))
clear_cpu_cap(c, X86_FEATURE_xxx);
return cpu_has(c, X86_FEATURE_xxx);
}
I'd really like understand, though, what the policy of (not) having a
"memory" clobber in these operations is - currently, this appears to
be totally inconsistent. Also, many comments of the non-atomic
functions say those may also be re-ordered - this contradicts the use
of "asm volatile" in there, which again I'd like to understand.
As much as all of these, using 'int' for the 'nr' parameter and
'void *' for the 'addr' one is in conflict with
Documentation/atomic_ops.txt, especially because bt{,c,r,s} indeed
take the bit index as signed (which hence would really need special
precaution) and access the full 32 bits (if 'unsigned long' was used
properly here, 64 bits for x86-64) pointed at, so invalid uses like
referencing a 'char' array cannot currently be caught.
Finally, the code with and without this patch relies heavily on the
-fno-strict-aliasing compiler switch and I'm not certain this really
is a good idea.
In the light of all of this I'm sending this as RFC, as fixing the
above might warrant a much bigger patch...
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
make known_pat_cpu to think amd k8 and fam10h is ok too.
also make tom2 below to be WRBACK
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Introduce ioremap_wc for wc remap.
(generic wrapper is in a later patch)
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a set_memory_wc interface(), similar to set_memory_uc interface.
Callers has to call set_memory_uc, set_memory_wb and
set_memory_wc, set_memory_wb as pairs.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use reserve_memtype and free_memtype interfaces in set_memory_uc/set_memory_wb
interfaces to avoid aliasing.
Usage model of set_memory_uc and set_memory_wb is for RAM memory and users
will first call set_memory_uc and call set_memory_wb after use to reset the
attribute.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make ioremap_change_attr() non-static and use prot_val in place of ioremap_mode.
This interface is used in subsequent PAT patches.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Sets up pat_init() infrastructure.
PAT MSR has following setting.
PAT
|PCD
||PWT
|||
000 WB _PAGE_CACHE_WB
001 WC _PAGE_CACHE_WC
010 UC- _PAGE_CACHE_UC_MINUS
011 UC _PAGE_CACHE_UC
We are effectively changing WT from boot time setting to WC.
UC_MINUS is used to provide backward compatibility to existing /dev/mem
users(X).
reserve_memtype and free_memtype are new interfaces for maintaining alias-free
mapping. It is currently implemented in a simple way with a linked list and
not optimized. reserve and free tracks the effective memory type, as a result
of PAT and MTRR setting rather than what is actually requested in PAT.
pat_init piggy backs on mtrr_init as the rules for setting both pat and mtrr
are same.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
do simple memtest after init_memory_mapping
use find_e820_area_size to find all ram range that is not reserved.
and do some simple bits test to find some bad ram.
if find some bad ram, use reserve_early to exclude that range.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch replaces numeric constant with an appropriate macro
Also 0x800000000000UL is changed to bit shifting which is complement
to the code comment (thanks hpa for notice)
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There really is no need for a redundant implementation here, just keep
the alternative name for allowing consumers to use consistent naming.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch removes the write-only timer_uses_ioapic_pin_0
(gsi can't be <= 15 in the line of it's fake usage in mpparse_32.c).
Spotted by the GNU C compiler.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- Fix the the build breakage when PARAVIRT is defined
but PCI is not
This fixes problem reported at:
http://marc.info/?l=linux-kernel&m=120525966600698&w=2
- Make is_vsmp_box() available even when PARAVIRT is not defined.
This is needed to determine if tsc's are reliable as a time source
even when PARAVIRT is not defined.
- split vsmp_init to use is_vsmp_box() and set_vsmp_pv_ops()
set_vsmp_pv_ops will do nothing if PCI is not enabled in the config.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86_64 has two nr_ioapics = 0 statements. In 32-bit, it can be done
too. We do it through the smpboot_clear_io_apic() inline function,
to cope with subarchitectures (visws) that does not compile mpparse in
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
change smpboot_setup_io_apic() by to match x86_64 behaviour
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is a very large patch, because it depends on a lot
of auxiliary static functions. But they all have been modified
to the point that they're sufficiently close now. So they're just
merged in smpboot.c
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is to match i386. The former name was cuter,
but the current is more meaningful and more general,
since cpu_id can be a logical id.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
voyager would conflict with it, but the types are ultimately
compatible. So remove the extern definition from voyager_smp.c
in favour of the common one
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
After the inclusion, a lot of files needs fixing for conflicts,
some of them in the headers themselves, to accomodate for both
i386 and x86_64 versions.
[ mingo@elte.hu: build fix ]
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Two more files goes away. nmi_64.h and nmi_32.h gives birth
to nmi.h
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of declaring them inside of X86_64 ifdef, do it
unconditionally
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This mapping already exists in x86_64, just provide it for
i386
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
now that it is the same between arches, put it into smpboot.c
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It is used to match i386. The definition for the non-paravirt
case is moved to smp.h instead of smp_32.h
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When acpi=off or there is no SRAT defined, apicid_to_node is got from K8
Northbridge PCI configuration space in k8_scan_nodes() in
arch/x86_64/mm/k8toplogy.c.
The problem is that it assumes bsp apic id is 0 at that point.
For four socket system with Quad core cpus installed, all cpus apic id
is offset by 4, and bsp apic id is 4.
For eight socket system with dual core cpus installed, all cpus apic id
is offset by 2, and bsp apic id is 2.
We need get boot_cpu_id --- bsp apic id, before k8_scan_nodes by called.
So create early_acpi_boot_init and early_get_smp_config for get boot_cpu_id.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>