linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-02 02:01:29 +00:00

Author	SHA1	Message	Date
Alok Kataria	2dbe06faf3	x86: merge the TSC cpu-freq code Unify the TSC cpufreq code. Signed-off-by: Alok N Kataria <akataria@vmware.com> Signed-off-by: Dan Hecht <dhecht@vmware.com> Cc: Dan Hecht <dhecht@vmware.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-09 07:43:26 +02:00
Alok Kataria	bfc0f5947a	x86: merge tsc calibration Merge the tsc calibration code for the 32bit and 64bit kernel. The paravirtualized calculate_cpu_khz for 64bit now points to the correct tsc_calibrate code as in 32bit. Original native_calculate_cpu_khz for 64 bit is now called as calibrate_cpu. Also moved the recalibrate_cpu_khz function in the common file. Note that this function is called only from powernow K7 cpu freq driver. Signed-off-by: Alok N Kataria <akataria@vmware.com> Signed-off-by: Dan Hecht <dhecht@vmware.com> Cc: Dan Hecht <dhecht@vmware.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-09 07:43:25 +02:00
Alok Kataria	0ef9553332	x86: merge sched_clock handling Move the basic global variable definitions and sched_clock handling in the common "tsc.c" file. - Unify notsc kernel command line handling for 32 bit and 64bit. - Functional changes for 64bit. - "tsc_disabled" is updated if "notsc" is passed at boottime. - Fallback to jiffies for sched_clock, in case notsc is passed on commandline. Signed-off-by: Alok N Kataria <akataria@vmware.com> Signed-off-by: Dan Hecht <dhecht@vmware.com> Cc: Dan Hecht <dhecht@vmware.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-09 07:43:25 +02:00
Cyrill Gorcunov	746f2eb790	x86: apic_32.c - add lapic resource Add lapic resource into kernel resource map and mark it as busy Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: "Maciej W. Rozycki" <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> CC: Maciej W. Rozycki <macro@linux-mips.org>	2008-07-09 07:43:24 +02:00
Jack Steiner	83f5d894ca	x86: map UV chipset space - UV support Create page table entries to map the SGI UV chipset GRU, local MMR & global MMR ranges. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-09 07:43:23 +02:00
Jack Steiner	3a9e189d69	x86: map UV chipset space - pagetable Add boot-time function for creating additional 2MB page table entries for mapping chipset specific cached/uncached ranges. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-09 07:43:23 +02:00
Yinghai Lu	fc9036ea1a	x86: let early_reserve_e820 update e820_saved too so when it is called after early_param, e820_saved get updated too. esp for mpc update. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-09 07:43:22 +02:00
Yinghai Lu	a0a0becd2d	x86: make e820_saved have update from setup_data separate reserve_setup_data into e820_reserved_setup_data, and reserve_early_setup_data. So could use e820_reserved_setup_data to backup e820 with setup_data. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Bernhard Walle <bwalle@suse.de> Cc: Ying Huang <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-09 07:43:22 +02:00
Yinghai Lu	0be15526be	x86: move saving e820_saved to setup_memory_map so other path that will override memory_setup or machine_specific_memory_setup could have e820_saved too. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-09 07:43:21 +02:00
Vitaly Bordug	ba0fc709e1	powerpc: Add missing reference to coherent_dma_mask There is dma_mask in of_device upon of_platform_device_create() but we don't actually set coherent_dma_mask. This may cause weird behavior of USB subsystem using of_device USB host drivers. Signed-off-by: Vitaly Bordug <vitb@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-08 21:06:35 -07:00
Thomas Bogendoerfer	14defd90f5	[MIPS] Fix 32bit kernels on R4k with 128 byte cache line size The generated copy_page for R4k CPU with a 128 byte cache line size used Create Dirty Exclusive cache line operations even if only part of the cache line was filled. This change avoids generating cache operations, if only part of the cache line size is copied in one loop. It also increases the maximum loop size, because the generated code even fits into the available space for r4k CPUs with 128 byte cache line size. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2008-07-08 19:33:46 +01:00
Shane McDonald	b32dfbb9c5	[MIPS] Atlas, decstation: Fix section mismatches triggered by defconfigs Resolve these mismatches by defining affected functions with the __cpuinit attribute, rather than __init. Signed-off-by: Shane McDonald <mcdonald.shane@gmail.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2008-07-08 19:33:46 +01:00
Bernhard Walle	5dfcf14d5b	x86: use FIRMWARE_MEMMAP on x86/E820 This patch uses the /sys/firmware/memmap interface provided in the last patch on the x86 architecture when E820 is used. The patch copies the E820 memory map very early, and registers the E820 map afterwards via firmware_map_add_early(). Signed-off-by: Bernhard Walle <bwalle@suse.de> Acked-by: Greg KH <gregkh@suse.de> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: kexec@lists.infradead.org Cc: yhlu.kernel@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 17:55:42 +02:00
Yinghai Lu	6247943d8a	x86: remove acpi_srat config v2 use ACPI_NUMA directly and move srat_32.c to mm/ Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 15:49:08 +02:00
Yinghai Lu	698839fe04	x86: remove have_arch_parse_srat -v2 we already have the same srat handling interface for 32bit. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 15:49:01 +02:00
Jeremy Fitzhardinge	5a654ba7a8	x86/cpa: use an undefined PTE bit for testing CPA Rather than using _PAGE_GLOBAL - which not all CPUs support - to test CPA, use one of the reserved-for-software-use PTE flags instead. This allows CPA testing to work on CPUs which don't support PGD. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:30 +02:00
Jeremy Fitzhardinge	ef5e94af16	x86_32: remove __PAGE_KERNEL(_EXEC) From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Older x86-32 processors do not support global mappings (PGD), so must only use it if the processor supports it. The _PAGE_KERNEL* flags always have _PAGE_GLOBAL set, since logically we always want it set. This is OK even on processors which do not support PGD, since all _PAGE flags are masked with __supported_pte_mask before being turned into a real in-pagetable pte. On 32-bit systems, __supported_pte_mask is initialized to not contain _PAGE_GLOBAL, and it is then added if the CPU is found to support it. The x86-32 code used to use __PAGE_KERNEL/__PAGE_KERNEL_EXEC for this purpose, but they're now redundant and can be removed. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:29 +02:00
Jeremy Fitzhardinge	8490638cf0	x86: always set _PAGE_GLOBAL in _PAGE_KERNEL* flags Consistently set _PAGE_GLOBAL in _PAGE_KERNEL flags. This makes 32- and 64-bit code consistent, and removes some special cases where __PAGE_KERNEL* did not have _PAGE_GLOBAL set, causing confusion as a result of the inconsistencies. This patch only affects x86-64, which generally always supports PGD. The x86-32 patch is next. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:28 +02:00
Jeremy Fitzhardinge	574977a2ed	x86_64/setup: unconditionally populate the pgd When allocating a new pud, unconditionally populate the pgd (why did we bother to create a new pud if we weren't going to populate it?). This will only happen if the pgd slot was empty, since any existing pud will be reused. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:27 +02:00
Ingo Molnar	aea5f9f89b	x86: fix "x86: let setup_arch call init_apic_mappings for 32bit" add back this line lost from trap_init(): set_trap_gate(0, &divide_error); Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:27 +02:00
Yinghai Lu	329513a35d	x86: move prefill_possible_map calling early call it right after we are done with MADT/mptable handling, instead of doing that in setup_per_cpu_areas() later on... this way for_possible_cpu() can be used early. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:24 +02:00
Yinghai Lu	5f4765f96e	x86: move init_cpu_to_node after get_smp_config when acpi=off, cpu_to_apicid is ready after get_smp_config so need to move init_cpu_to_node after it. otherwise, we will get wrong cpu->node mapping, and it will rely on amd_detect_cmp() to correct it - but that is too late as setup_per_cpu_data is already called before that so we will get per_cpu_data on the wrong node. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:23 +02:00
Yinghai Lu	cb95a13a8a	x86: merge zones_sizes_init for numa and non numa on 32-bit move out e820_register_active_regions from non numa zones_sizes_init() and remove numa version zones_sizes_init(). and let 32 bit call remove_all_active_ranges() in setup_arch() directly like 64-bit Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:22 +02:00
Yinghai Lu	d9a81b4411	x86: do not printout if we do not find setup_data Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:22 +02:00
Yinghai Lu	4fcc545a74	x86: make early_res_to_bootmem print out less 80 width chars Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:20 +02:00
Yinghai Lu	dc8e8120ad	x86: change copy_e820_map to append_e820_map so it has a more meaningful name. also change it to static. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:19 +02:00
Bernhard Walle	32105f7fd8	x86: find offset for crashkernel reservation automatically This patch removes the need of the crashkernel=...@offset parameter to define a fixed offset for crashkernel reservation. That feature can be used together with a relocatable kernel where the kexec-tools relocate the kernel and get the actual offset from /proc/iomem. The use case is a kernel where the .text+.data+.bss is after 16M physical memory (debug kernel with lockdep on x86_64 can cause that) which caused a major pain in autoconfiguration in our distribution. Also, that patch unifies crashdump architectures a bit since IA64 has that semantics from the very beginning of the kdump port. Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: vgoyal@redhat.com Cc: Bernhard Walle <bwalle@suse.de> Cc: kexec@lists.infradead.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:18 +02:00
Alok Kataria	fd6493e166	x86: cleanup e820_setup_gap(), v2 e820_search_gap also take a end_addr parameter to limit search from start_addr to end_addr. Signed-off-by: AloK N Kataria <akataria@vmware.com> Acked-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: "lenb@kernel.org" <lenb@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:17 +02:00
Mike Travis	6a2f47ca27	x86: add check for node passed to node_to_cpumask, v3 * When CONFIG_DEBUG_PER_CPU_MAPS is set, the node passed to node_to_cpumask and node_to_cpumask_ptr should be validated. If invalid, then a dump_stack is performed and a zero cpumask is returned. v2: Slightly different version to remove a compiler warning. v3: Redone to reflect moving setup.c -> setup_percpu.c Signed-off-by: Mike Travis <travis@sgi.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:16 +02:00
Jeremy Fitzhardinge	cd5dce2fb0	x86: fix CPA self-test for "x86/paravirt: groundwork for 64-bit Xen support" Ingo Molnar wrote: > -tip auto-testing found pagetable corruption (CPA self-test failure): > > [ 32.956015] CPA self-test: > [ 32.958822] 4k 2048 large 508 gb 0 x 2556[ffff880000000000-ffff88003fe00000] miss 0 > [ 32.964000] CPA ffff88001d54e000: bad pte 1d4000e3 > [ 32.968000] CPA ffff88001d54e000: unexpected level 2 > [ 32.972000] CPA ffff880022c5d000: bad pte 22c000e3 > [ 32.976000] CPA ffff880022c5d000: unexpected level 2 > [ 32.980000] CPA ffff8800200ce000: bad pte 200000e3 > [ 32.984000] CPA ffff8800200ce000: unexpected level 2 > [ 32.988000] CPA ffff8800210f0000: bad pte 210000e3 > > config and full log can be found at: > > http://redhat.com/~mingo/misc/config-Mon_Jun_30_11_11_51_CEST_2008.bad > http://redhat.com/~mingo/misc/log-Mon_Jun_30_11_11_51_CEST_2008.bad Phew. OK, I've worked this out. Short version is that it's a false alarm, and there was no real failure here. Long version: * I changed the code to create the physical mapping pagetables to reuse any existing mapping rather than replace it. Specifically, reusing a pud pointed to by the pgd caused this symptom to appear. * The specific PUD being reused is the one created statically in head_64.S, which creates an initial 1GB mapping. * That mapping doesn't have _PAGE_GLOBAL set on it, due to the inconsistency between __PAGE_* and PAGE_. The CPA test attempts to clear _PAGE_GLOBAL, and then checks to see that the resulting range is 1) shattered into 4k pages, and 2) has no _PAGE_GLOBAL. * However, since it didn't have _PAGE_GLOBAL on that range to start with, change_page_attr_clear() had nothing to do, and didn't bother shattering the range, * resulting in the reported messages. The simple fix is to set _PAGE_GLOBAL in level2_ident_pgt. An additional fix is to make CPA testing more robust by using some other pagetable bit (one of the unused available-to-software ones). This would solve spurious CPA test warnings under Xen which uses _PAGE_GLOBAL for its own purposes (ie, not under guest control). Also, we should revisit the use of _PAGE_GLOBAL in asm-x86/pgtable.h, and use it consistently, and drop MAKE_GLOBAL. The first time I proposed it it caused breakages in the very early CPA code; with luck that's all fixed now. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Mark McLoughlin <markmc@redhat.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:15 +02:00
Yinghai Lu	996cf4438f	x86: don't reallocate pgt for node0 kva ram already mapped right after away, so don't need to get that for low ram. avoid wasting one copy of pgdat. also add node id in early_res name in case we get it from find_e820_area. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:15 +02:00
Yinghai Lu	28bb223795	x86: move reserve_setup_data to setup.c Ying Huang would like setup_data to be reserved, but not included in the no save range. Here we try to modify the e820 table to reserve that range early. also add that in early_res in case bootloader messes up with the ramdisk. other solution would be 1. add early_res_to_highmem... 2. early_res_to_e820... but they could reserve another type memory wrongly, if early_res has some resource reserved early, and not needed later, but it is not removed from early_res in time. Like the RAMDISK (already handled). Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: andi@firstfloor.org Tested-by: Huang, Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:14 +02:00
Jeremy Fitzhardinge	102d0a4b56	x86, paravirt, 64-bit: fix compile errors with IA32_EMULATION off Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:13 +02:00
Jeremy Fitzhardinge	1a98fd14f4	x86: setup_arch() && early_ioremap_init() Looks like the setup.c unification missed the early_ioremap init from the early_ioremap unification. Unconditionally call early_ioremap_init(). needed for "x86/paravirt: groundwork for 64-bit Xen support". Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Mark McLoughlin <markmc@redhat.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:11 +02:00
Yinghai Lu	914bebfad4	x86: use disable_apic in 32bit change the enable_local_apic to static force_enable_local_apic for 32bit Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:08 +02:00
Yinghai Lu	a04ad82d0b	x86: fix init_memory_mapping over boundary, v4 use PMD_SHIFT to calculate boundary also adjust size for pre-allocated table size Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:07 +02:00
Yinghai Lu	b4df32f4ae	x86: fix warning in e820_reserve_resources with 32bit when 64bit resource is not enabled, we get: arch/x86/kernel/e820.c: In function ‘e820_reserve_resources’: arch/x86/kernel/e820.c:1217: warning: comparison is always false due to limited range of data type because res->start/end is resource_t aka u32. it will overflow. fix it with temp end of u64 Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:07 +02:00
Yinghai Lu	7482b0e962	x86: fix init_memory_mapping over boundary v3 some ram-end boundary only has page alignment, instead of 2M alignment. v2: make init_memory_mapping more solid: start could be any value other than 0 v3: fix NON PAE by handling left over in kernel_physical_mapping Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:06 +02:00
Yinghai Lu	f3294a33e7	x86: let setup_arch call init_apic_mappings for 32bit instead of calling it from trap_init() also move init ioapic mapping out of apic_32.c so 32 bit do same as 64 bit Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:04 +02:00
Yinghai Lu	ab67715c72	x86: early res print out alignment v2 v2: fix print info to cont Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:03 +02:00
Jeremy Fitzhardinge	22b45144f6	x86/paravirt: groundwork for 64-bit Xen support, fix #2 Ingo Molnar wrote: > that fixed the build but now we've got a boot crash with this config: > > time.c: Detected 2010.304 MHz processor. > spurious 8259A interrupt: IRQ7. > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 > IP: [<0000000000000000>] > PGD 0 > Thread overran stack, or stack corrupted > Oops: 0010 [1] SMP > CPU 0 > I don't know if this will fix this bug, but it's definitely a bugfix. It was trashing random pages by overwriting them with pagetables... Don't trash a large pmd's data when mapping physical memory. This is a bugfix for "x86_64: adjust mapping of physical pagetables to work with Xen". Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:02 +02:00
Jeremy Fitzhardinge	457da70ec0	x86/paravirt: groundwork for 64-bit Xen support, fix Ingo Molnar wrote: > * Jeremy Fitzhardinge <jeremy@goop.org> wrote: > > >>> It quickly broke the build in testing: >>> >>> include/asm/pgalloc.h: In function ‘paravirt_pgd_free': >>> include/asm/pgalloc.h:14: error: parameter name omitted >>> arch/x86/kernel/entry_64.S: In file included from >>> arch/x86/kernel/traps_64.c:51:include/asm/pgalloc.h: In function >>> ‘paravirt_pgd_free': >>> include/asm/pgalloc.h:14: error: parameter name omitted >>> >>> >> No, looks like my fault. The non-PARAVIRT version of >> paravirt_pgd_free() is: >> >> static inline void paravirt_pgd_free(struct mm_struct *mm, pgd_t *) {} >> >> but C doesn't like missing parameter names, even if unused. >> >> This should fix it: >> > > that fixed the build but now we've got a boot crash with this config: > > time.c: Detected 2010.304 MHz processor. > spurious 8259A interrupt: IRQ7. > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 > IP: [<0000000000000000>] > PGD 0 > Thread overran stack, or stack corrupted > Oops: 0010 [1] SMP > CPU 0 > > with: > > http://redhat.com/~mingo/misc/config-Thu_Jun_26_12_46_46_CEST_2008.bad > Use SWAPGS_UNSAFE_STACK in ia32entry.S in the places where the active stack is the usermode stack. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:02 +02:00
Yinghai Lu	e7b3789524	x86: move fix mapping page table range early do that in init_memory_mapping also remove one init_ohci1394_dma_on_all_controllers Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:01 +02:00
Yinghai Lu	042623bbab	x86: clean up ARCH_SETUP asm-x86/paravirt.h already have protection with CONFIG_PARAVIRT inside Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:00 +02:00
Bernhard Walle	611dfd7819	x86: limit E820 map when a user-defined memory map is specified This patch brings back limiting of the E820 map when a user-defined E820 map is specified. While the behaviour of i386 (32 bit) was to limit the E820 map (and /proc/iomem), the behaviour of x86-64 (64 bit) was not to limit. That patch limits the E820 map again for both x86 architectures. Code was tested for compilation and booting on a 32 bit and 64 bit system. Signed-off-by: Bernhard Walle <bwalle@suse.de> Acked-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: kexec@lists.infradead.org Cc: vgoyal@redhat.com Cc: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:15:59 +02:00
Jeremy Fitzhardinge	8207c2570a	x86: fix pte allocation in "x86: introduce init_memory_mapping for 32bit" The patch "x86: introduce init_memory_mapping for 32bit" does not allocate enough space for PTEs if the CPU does not implement PSE. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:15:58 +02:00
Jeremy Fitzhardinge	9f9d489a3e	x86/paravirt, 64-bit: make load_gs_index() a paravirt operation Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:15:58 +02:00
Jeremy Fitzhardinge	fab58420ac	x86/paravirt, 64-bit: add adjust_exception_frame 64-bit Xen pushes a couple of extra words onto an exception frame. Add a hook to deal with them. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:15:57 +02:00
Jeremy Fitzhardinge	6680415481	x86, 64-bit: ia32entry: replace privileged instructions with pvops Replace privileged instructions with the corresponding pvops in ia32entry.S. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:15:55 +02:00
Jeremy Fitzhardinge	2be29982a0	x86/paravirt: add sysret/sysexit pvops for returning to 32-bit compatibility userspace In a 64-bit system, we need separate sysret/sysexit operations to return to a 32-bit userspace. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citirx.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:15:52 +02:00
Jeremy Fitzhardinge	c7245da6ae	x86/paravirt, 64-bit: don't restore user rsp within sysret There's no need to combine restoring the user rsp within the sysret pvop, so split it out. This makes the pvop's semantics closer to the machine instruction. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citirx.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:13:37 +02:00
Jeremy Fitzhardinge	d75cd22fdd	x86/paravirt: split sysret and sysexit Don't conflate sysret and sysexit; they're different instructions with different semantics, and may be in use at the same time (at least within the same kernel, depending on whether it's an Intel or AMD system). sysexit - just return to userspace, does no register restoration of any kind; must explicitly atomically enable interrupts. sysret - reloads flags from r11, so no need to explicitly enable interrupts on 64-bit, responsible for restoring usermode %gs Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citirx.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:13:15 +02:00
Jeremy Fitzhardinge	e04e0a630d	x86: use __KERNEL_DS as SS when returning to a kernel thread This is needed when the kernel is running on RING3, such as under Xen. x86_64 has a weird feature that makes it #GP on iret when SS is a null descriptor. This needs to be tested on bare metal to make sure it doesn't cause any problems. AMD specs say SS is always ignored (except on iret?). Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:12 +02:00
Jeremy Fitzhardinge	478de5a9d6	x86: save %fs and %gs before load_TLS() and arch_leave_lazy_cpu_mode() We must do this because load_TLS() may need to clear %fs and %gs. (e.g. under Xen). Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:11 +02:00
Jeremy Fitzhardinge	3fe0a63efd	x86, 64-bit: __switch_to(): move arch_leave_lazy_cpu_mode() to the right place We must leave lazy mode before switching the %fs and %gs selectors. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:10 +02:00
Eduardo Habkost	0814e0bace	x86, 64-bit: split set_pte_vaddr() We will need to set a pte on l3_user_pgt. Extract set_pte_vaddr_pud() from set_pte_vaddr(), that will accept the l3 page table as parameter. This change should be a no-op for existing code. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:09 +02:00
Jeremy Fitzhardinge	7c934d3990	x86, 64-bit: create small vmemmap mappings if PSE not available If PSE is not available, then fall back to 4k page mappings for the vmemmap area. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:08 +02:00
Jeremy Fitzhardinge	4f9c11dd49	x86, 64-bit: adjust mapping of physical pagetables to work with Xen This makes a few changes to the construction of the initial pagetables to work better with paravirt_ops/Xen. The main areas are: 1. Support non-PSE mapping of memory, since Xen doesn't currently allow 2M pages to be mapped in guests. 2. Make sure that the ioremap alias of all pages are dropped before attaching the new page to the pagetable. This avoids having writable aliases of pagetable pages. 3. Preserve existing pagetable entries, rather than overwriting. It's possible that a fair amount of pagetable has already been constructed, so reuse what's already in place rather than ignoring and overwriting it. The algorithm relies on the invariant that any page which is part of the kernel pagetable is itself mapped in the linear memory area. This way, it can avoid using ioremap on a pagetable page. The invariant holds because it maps memory from low to high addresses, and also allocates memory from low to high. Each allocated page can map at least 2M of address space, so the mapped area will always progress much faster than the allocated area. It relies on the early boot code mapping enough pages to get started. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:07 +02:00
Jeremy Fitzhardinge	f97013fd8f	x86, 64-bit: split x86_64_start_kernel Split x86_64_start_kernel() into two pieces: The first essentially cleans up after head_64.S. It clears the bss, zaps low identity mappings, sets up some early exception handlers. The second part preserves the boot data, reserves the kernel's text/data/bss, pagetables and ramdisk, and then starts the kernel proper. This split is so that Xen can call the second part to do the set up it needs done. It doesn't need any of the first part setups, because it doesn't boot via head_64.S, and it's redundant or actively damaging. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:06 +02:00
Eduardo Habkost	a6523748bd	paravirt/x86, 64-bit: move __PAGE_OFFSET to leave a space for hypervisor Set __PAGE_OFFSET to the most negative possible address + 16*PGDIR_SIZE. The gap is to allow a space for a hypervisor to fit. The gap is more or less arbitrary, but it's what Xen needs. When booting native, kernel/head_64.S has a set of compile-time generated pagetables used at boot time. This patch removes their absolutely hard-coded layout, and makes it parameterised on __PAGE_OFFSET (and __START_KERNEL_map). Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:04 +02:00
Jeremy Fitzhardinge	97349135fe	x86/paravirt: add debugging for missing operations Rather than just jumping to 0 when there's a missing operation, raise a BUG. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:03 +02:00
Jeremy Fitzhardinge	d8d5900ef8	x86: preallocate and prepopulate separately Jan Beulich points out that vmalloc_sync_all() assumes that the kernel's pmd is always expected to be present in the pgd. The current pgd construction code will add the pgd to the pgd_list before its pmds have been pre-populated, thereby making it visible to vmalloc_sync_all(). However, because pgd_prepopulate_pmd also does the allocation, it may block and cannot be done under spinlock. The solution is to preallocate the pmds out of the spinlock, then populate them while holding the pgd_list lock. This patch also pulls the pmd preallocation and mop-up functions out to be common, assuming that the compiler will generate no code for them when PREALLOCTED_PMDS is 0. Also, there's no need for pgd_ctor to clear the pgd again, since it's allocated as a zeroed page. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:02 +02:00
Jeremy Fitzhardinge	eba0045ff8	x86/paravirt: add pgd_alloc/free hooks Add hooks which are called at pgd_alloc/free time. The pgd_alloc hook may return an error code, which if non-zero, causes the pgd allocation to fail. The hooks may be used to allocate/free auxiliary per-pgd information. also fix: > * Ingo Molnar <mingo@elte.hu> wrote: > > include/asm/pgalloc.h: In function ‘paravirt_pgd_free’: > include/asm/pgalloc.h:14: error: parameter name omitted > arch/x86/kernel/entry_64.S: In file included from > arch/x86/kernel/traps_64.c:51:include/asm/pgalloc.h: In function ‘paravirt_pgd_free’: > include/asm/pgalloc.h:14: error: parameter name omitted Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:01 +02:00
Jeremy Fitzhardinge	67350a5c45	x86: simplify vmalloc_sync_all vmalloc_sync_all() is only called from register_die_notifier and alloc_vm_area. Neither is on any performance-critical paths, so vmalloc_sync_all() itself is not on any hot paths. Given that the optimisations in vmalloc_sync_all add a fair amount of code and complexity, and are fairly hard to evaluate for correctness, it's better to just remove them to simplify the code rather than worry about its absolute performance. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:59 +02:00
Ingo Molnar	330ddd2089	x86: build fix fix: In file included from arch/x86/kernel/setup.c:118: include/asm/highmem.h:64: error: expected identifier or ‘(’ before ‘do’ include/asm/highmem.h:64: error: expected identifier or ‘(’ before ‘while’ include/asm/highmem.h:67: error: expected identifier or ‘(’ before ‘do’ include/asm/highmem.h:67: error: expected identifier or ‘(’ before ‘while’ Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:57 +02:00
Ingo Molnar	3442682a54	x86: remove extra newline from setup.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:56 +02:00
Yinghai Lu	5092301c72	x86: we only have init_pg_tables_end for 32bit Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:55 +02:00
Yinghai Lu	29f784e369	x86: change some functions in setup.c to static Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:54 +02:00
Yinghai Lu	d1b20afec3	x86: make x86_find_smp_config depends on 64 bit too Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:54 +02:00
Yinghai Lu	0196bcbb15	x86: move parse elfcorehdr back to setup.c Signed-off-by: Yinghai <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:53 +02:00
Yinghai Lu	bdba0e700c	x86: move reserve_standard_io_resources back to setup.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:52 +02:00
Yinghai Lu	ccb4defa71	x86: move crashkernel back to setup.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:51 +02:00
Yinghai Lu	257b0fde99	x86: move parse_setup_data back to setup.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:50 +02:00
Yinghai Lu	217b8ce890	x86: move boot_params back to setup.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:50 +02:00
Yinghai Lu	55f262391a	x86: rename setup_32.c to setup.c and let 64 bit use that instead of setup_64.c [ mingo@elte.hu ] x86: build fix fix: arch/x86/kernel/setup.c: In function ‘setup_arch’: arch/x86/kernel/setup.c:561: error: implicit declaration of function ‘efi_reserve_early’ and: arch/x86/kernel/setup.c:766: error: implicit declaration of function 'init_cpu_to_node' and: arch/x86/kernel/setup.c:676: warning: operation on 'max_pfn_mapped' may be undefined Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:49 +02:00
Yinghai Lu	f2f865fe6e	x86: space to tab in setup_arch Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:48 +02:00
Yinghai Lu	76934ed4b3	x86: merge 64bit setup_arch into setup_32 Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:47 +02:00
Yinghai Lu	46d671b525	x86: add extra includes for 64bit support Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:46 +02:00
Yinghai Lu	7dea23ecd1	x86: put global variable for 32bit all together those variables are not needed by 64 bit. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:46 +02:00
Yinghai Lu	eb1379cb29	x86: update reserve_initrd to support 64bit Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:45 +02:00
Yinghai Lu	08afc7c0dd	x86: we can use full bootmem after have init_memory_mapping So remove outdated comments Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:44 +02:00
Yinghai Lu	378b39a4f9	x86: rename setup.c to setup_percpu.c some functions need to be moved to setup_numa.c after we merge setup32/64.c, some funcs need to be moved back to setup.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:43 +02:00
Yinghai Lu	b9d19f4a51	x86: fix memory setup bug interesting... [ 0.000000] mapped low ram: 0 - 20000000 [ 0.000000] low ram: 00000000 - 1fff0000 [ 0.000000] bootmap 00002000 - 00006000 max_pfn_mapped > max_low_pfn? it seems init_memory_mapping reveals an old bug. please check attached test patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:42 +02:00
Bernhard Walle	383bc5cecc	x86, crashdump, /proc/vmcore: remove CONFIG_EXPERIMENTAL from kdump I would suggest removing the "experimental" status from Kdump. Kdump has been in the kernel for a long time and is used by enterprise distributions. I don't think that "experimental" is true any more. Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: vgoyal@redhat.com Cc: kexec@lists.infradead.org Cc: Bernhard Walle <bwalle@suse.de> Cc: akpm@linux-foundation.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:41 +02:00
Paul Jackson	200001eb14	x86 boot: only pick up additional EFI memmap if add_efi_memmap flag Applies on top of the previous patch: x86 boot: add code to add BIOS provided EFI memory entries to kernel Instead of always adding EFI memory map entries (if present) to the memory map after initially finding either E820 BIOS memory map entries and/or kernel command line memmap entries, -instead- only add such additional EFI memory map entries if the kernel boot option: add_efi_memmap is specified. Requiring this 'add_efi_memmap' option is backward compatible with kernels that didn't load such additional EFI memory map entries in the first place, and it doesn't override a configuration that tries to replace all E820 or EFI BIOS memory map entries with ones given entirely on the kernel command line. Signed-off-by: Paul Jackson <pj@sgi.com> Cc: "Yinghai Lu" <yhlu.kernel@gmail.com> Cc: "Jack Steiner" <steiner@sgi.com> Cc: "Mike Travis" <travis@sgi.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: "Andi Kleen" <andi@firstfloor.org> Cc: "Andrew Morton" <akpm@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:41 +02:00
Paul Jackson	5dab8ec139	mm, generic, x86 boot: more tweaks to hex prints of some pfn addresses Fix some problems with (and applies on top of) a previous patch: x86 boot: show pfn addresses in hex not decimal in some kernel info printks Primarily change "0x%8lx" format, which displays with a right aligned space filled hex number (spaces between the "0x" prefix and the number), into "%0#10lx" format, which zero fills instead of space fills, and which uses the printf flag '#' to request the "0x" prefix instead of hard coding it. Also replace some other "0x%lx" formats with "%#lx", making use of the '#' printf flag again. Signed-off-by: Paul Jackson <pj@sgi.com> Cc: "Yinghai Lu" <yhlu.kernel@gmail.com> Cc: "Jack Steiner" <steiner@sgi.com> Cc: "Mike Travis" <travis@sgi.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: "Andi Kleen" <andi@firstfloor.org> Cc: "Andrew Morton" <akpm@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:40 +02:00
Alok Kataria	3381959da5	x86: cleanup e820_setup_gap(), add e820_search_gap(), v2 This is a preparatory patch for the next patch in series. Moves some code from e820_setup_gap to a new function e820_search_gap. This patch is a part of a bug fix where we walk the ACPI table to calculate a gap for PCI optional devices. v1->v2: Patch on top of tip/master. Fixes a bug introduced in the last patch about the typeof "last". Also the new function e820_search_gap now returns if we found a gap in e820_map. Signed-off-by: Alok N Kataria <akataria@vmware.com> Cc: lenb@kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:39 +02:00
Yinghai Lu	c987d12f84	x86: remove end_pfn in 64bit and use max_pfn directly. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:38 +02:00
Yinghai Lu	232b957ae9	x86: change size if e820_update/remove_range in case someone using crazy parameter while calling them. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:36 +02:00
Yinghai Lu	d86623a0d5	x86: add table_top check for alloc_low_page in 64 bit that range is from find_e820_area, so don't try to use end_pfn to see if out of boundary...use table_top instead to avoid possible strange result while cross the boundary... also change early_printk to printk, because init_memory_mapping is after early param parsing, and console=uart8250 already working at that time. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:36 +02:00
Yinghai Lu	1a0db38e5f	x86: get max_pfn_mapped in init_memory_mapping so don't shift that in the loop Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:35 +02:00
Yinghai Lu	976dd4dc99	x86: fix e820_update_range size when overlapping Before this, we relied on sanitize_e820_map to remove the overlap, but e820_update_range(,,E820_RESERVED, E820_RAM) would not work. This patch fixes that. Who is going to use this? Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:34 +02:00
Yinghai Lu	3a58a2a6c8	x86: introduce init_memory_mapping for 32bit #3 move kva related early back to initmem_init for numa32 Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:33 +02:00
Yinghai Lu	cfb0e53b05	x86: introduce init_memory_mapping for 32bit #2 moving relocate_initrd early Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:32 +02:00
Yinghai Lu	4e29684c40	x86: introduce init_memory_mapping for 32bit #1 ... so can we use mem below max_low_pfn earlier. this allows us to move several functions more early instead of waiting to after paging_init. That includes moving relocate_initrd() earlier in the bootup, and kva related early setup done in initmem_init. (in followup patches) Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:32 +02:00
Jeremy Fitzhardinge	4583ed514e	x86, 64-bit: unify early_ioremap The 32-bit early_ioremap will work equally well for 64-bit, so just use it. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:28 +02:00
Jeremy Fitzhardinge	bb23e403e5	x86, 64-bit: use p??_populate() to attach pages to pagetable Use the _populate() functions to attach new pages to a pagetable, to make sure the right paravirt_ops calls get called. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:27 +02:00
Jeremy Fitzhardinge	fc8b8a60ff	x86, 64-bit: use write_gdt_entry in vsyscall_set_cpu Use write_gdt_entry to generate the special vgetcpu descriptor in the vsyscall page. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:26 +02:00
Jeremy Fitzhardinge	ada8570823	x86: remove open-coded save/load segment operations This removes a pile of buggy open-coded implementations of savesegment and loadsegment. (They are buggy because they don't have memory barriers to prevent them from being reordered with respect to memory accesses.) Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:25 +02:00
Cyrill Gorcunov	4de0043617	x86: nmi_watchdog - introduce nmi_watchdog_active() helper Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: macro@linux-mips.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:51:42 +02:00
Cyrill Gorcunov	c376d45432	x86: nmi_watchdog - use NMI_NONE by default There is no need to keep the NMI_DISABLED definition and use it for nmi_watchdog by default. Here is why: - IO-APIC and APIC chips are programmed for nmi_watchdog support at a very early stage of kernel booting, and not specifying nmi_watchdog as a boot option only leads to nmi_watchdog becoming NMI_NONE anyway - enabling nmi_watchdog through /proc/sys/kernel/nmi is not possible either if it was not specified at boot (even with this sysfs entry present) Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: macro@linux-mips.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:51:41 +02:00
Cyrill Gorcunov	2b6addad2d	x86: nmi_watchdog - remove useless check Since nmi_watchdog is unsigned variable we may safely remove the check for negative value. Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: macro@linux-mips.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:51:40 +02:00
Cyrill Gorcunov	116f570e5d	x86: nmi_watchdog - use nmi_watchdog variable for printing Since it is possible NMI_ definitions could be changed one day we better print out real nmi_watchdog value instead of constant string. Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: macro@linux-mips.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:51:39 +02:00
Cyrill Gorcunov	47a486cc11	x86: perfctr-watchdog.c - coding style cleanup Just some code beautification. Nothing else. Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: macro@linux-mips.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:51:39 +02:00
Paul Jackson	e2fc252e0c	x86 boot: show pfn addresses in hex not decimal in some kernel info printks Page frame numbers (the portion of physical addresses above the low order page offsets) are displayed in several kernel debug and info prints in decimal, not hex. Decimal addresses are unreadable. Use hex. Signed-off-by: Paul Jackson <pj@sgi.com> Cc: "Yinghai Lu" <yhlu.kernel@gmail.com> Cc: "Jack Steiner" <steiner@sgi.com> Cc: "Mike Travis" <travis@sgi.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: "Andi Kleen" <andi@firstfloor.org> Cc: "Andrew Morton" <akpm@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:51:37 +02:00
Paul Jackson	c4ba1320b7	x86 boot: allow overlapping early reserve memory ranges Add support for overlapping early memory reservations. In general, they still can't overlap, and will panic with "Overlapping early reservations" if they do overlap. But if a memory range is reserved with the new call: reserve_early_overlap_ok() rather than with the usual call: reserve_early() then subsequent early reservations are allowed to overlap. This new reserve_early_overlap_ok() call is only used in one place so far, which is the "BIOS reserved" reservation for the EBDA region, which out of paranoia reserves more than what the BIOS might have specified, and which thus might overlap with another legitimate early memory reservation (such as, perhaps, the EFI memmap.) Signed-off-by: Paul Jackson <pj@sgi.com> Cc: "Yinghai Lu" <yhlu.kernel@gmail.com> Cc: "Jack Steiner" <steiner@sgi.com> Cc: "Mike Travis" <travis@sgi.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: "Andi Kleen" <andi@firstfloor.org> Cc: "Andrew Morton" <akpm@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:51:26 +02:00
Paul Jackson	05486fa7e6	x86 boot: x86_64 efi compiler warning fix Fix a compiler warning. Rather than always casting a u32 to a pointer (which generates a warning on x86_64 systems) instead separate the x86_32 and x86_64 assignments entirely with ifdefs. Until other recent changes to this code, it used to have x86_64 separated like this. Signed-off-by: Paul Jackson <pj@sgi.com> Cc: "Yinghai Lu" <yhlu.kernel@gmail.com> Cc: "Jack Steiner" <steiner@sgi.com> Cc: "Mike Travis" <travis@sgi.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: "Andi Kleen" <andi@firstfloor.org> Cc: "Andrew Morton" <akpm@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:29 +02:00
Paul Jackson	157fabf095	x86 boot: e820 code indentation fix Fix indentation. An earlier code merge got the indentation of four lines of code off by a tab. Signed-off-by: Paul Jackson <pj@sgi.com> Cc: "Yinghai Lu" <yhlu.kernel@gmail.com> Cc: "Jack Steiner" <steiner@sgi.com> Cc: "Mike Travis" <travis@sgi.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: "Andi Kleen" <andi@firstfloor.org> Cc: "Andrew Morton" <akpm@linux-foundation.org> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:28 +02:00
Yinghai Lu	295deae401	x86: setup_arch 32bit move kvm_guest_init later Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:27 +02:00
Yinghai Lu	9a2e593026	x86: setup_arch 32bit move command line copying early Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:26 +02:00
Yinghai Lu	7465252ea0	x86: setup_arch 32bit move efi check later Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:25 +02:00
Yinghai Lu	11cd0bc140	x86: move some func calling from setup_arch to paging_init those function depend on paging setup pgtable, so they could access the ram in bootmem region but just get mapped. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:24 +02:00
Yinghai Lu	c09434571d	x86: numa32 pfn print out using hex instead Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:23 +02:00
Yinghai Lu	6a07a0edac	x86: fix compile warning in init_64.c len is long and ret is only for NUMA Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:23 +02:00
Ingo Molnar	3eb11edc13	x86: build fix fix: arch/x86/kernel/setup_32.c:409: error: 'enable_local_apic' undeclared (first use in this function) arch/x86/kernel/setup_32.c:409: error: (Each undeclared identifier is reported only once arch/x86/kernel/setup_32.c:409: error: for each function it appears in.) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:22 +02:00
Yinghai Lu	346cafecde	x86: clean up min_low_pfn for 32bit we already had early_res support, so don't need to track min_low_pfn. keep it to 0 always. also use init_bootmem_node instead of init_bootmem, so don't touch min_low_pfn. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:21 +02:00
Yinghai Lu	2ec65f8b89	x86: clean up using max_low_pfn on 32-bit so that max_low_pfn is not changed after it is set. so we can move that early and out of initmem_init. could call find_low_pfn_range just after max_pfn is set. also could move reserve_initrd out of setup_bootmem_allocator so 32bit is more like 64bit. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:20 +02:00
Yinghai Lu	bef1568d97	x86: move reservetop and vmalloc parsing to pgtable_32.c also change reserve_top_address to __init attribute Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:19 +02:00
Yinghai Lu	90d967e0ef	x86: move find_max_low_pfn to init_32.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:18 +02:00
Yinghai Lu	7f0be02c5e	x86: move boot_params declaring to setup.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:17 +02:00
Yinghai Lu	225c37d71b	x86: introduce reserve_initrd Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:16 +02:00
Yinghai Lu	b2ac82a090	x86: introduce initmem_init for 32 bit Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:15 +02:00
Yinghai Lu	1f75d7e32e	x86: introduce initmem_init for 64 bit Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:14 +02:00
Yinghai Lu	17b4cceb1f	x86: move elfcorehdr parsing to setup.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:13 +02:00
Yinghai Lu	ce97c40e28	x86: move reserve_standard_io_resource to setup.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:12 +02:00
Yinghai Lu	f81be876ea	x86: remove two duplicated funcs in setup_32.c early_cpu_init is declared in processor.h memory_setup is defined in e820.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:11 +02:00
Yinghai Lu	0f0124fa74	x86: merge setup64.c into common_64.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:10 +02:00
Yinghai Lu	a9c1182fbd	x86: separate probe_roms into another file it is only needed for 32bit Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:05 +02:00
Yinghai Lu	7a1fd9866c	x86: add e820_remove_range ... so could add real hole in e820 agp check is using request_mem_region, and could fail if e820 is reserved... Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:37 +02:00
Yinghai Lu	9a25034759	x86: change identify_cpu to static Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:35 +02:00
Yinghai Lu	f580366f77	x86: separate funcs from setup_64 to cpu common_64.c Signed-off-by: Yinghai Lu <yhlu.kernel@mail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:34 +02:00
Yinghai Lu	04606618bb	x86: remove some acpi ifdefs in setup_32/64 Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:33 +02:00
Yinghai Lu	ce38cc7996	x86: clean up init_amd() 1. move out calling of check_enable_amd_mmconf_dmi out of setup_64.c put it into init_amd(), so don't need to make extra dmi check for system with other cpus. 2. 15 --> 0xf Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:32 +02:00
Yinghai Lu	3c999f1426	x86: check command line when CONFIG_X86_MPPARSE is not set, v2 if acpi=off, acpi=noirq and pci=noacpi, we need to disable apic. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Maciej W. Rozycki" <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:31 +02:00
Jeremy Fitzhardinge	88a6846c70	xen: set max_pfn_mapped Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:30 +02:00
Jeremy Fitzhardinge	b792c75590	xen: reserve ISA space in e820 map [ TODO: release the underlying memory back to Xen. ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:29 +02:00
Jeremy Fitzhardinge	be5bf9fa1c	xen: reserve Xen-specific memory in e820 map Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:28 +02:00
Yinghai Lu	d52d53b8a5	RFC x86: try to remove arch_get_ram_range want to remove arch_get_ram_range, and use early_node_map instead. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:27 +02:00
Ingo Molnar	1ea598c297	x86: fix sleep.c build error fix: arch/x86/kernel/acpi/sleep.c: In function ‘acpi_save_state_mem’: arch/x86/kernel/acpi/sleep.c:75: error: ‘stack_start’ undeclared (first use in this function) arch/x86/kernel/acpi/sleep.c:75: error: (Each undeclared identifier is reported only once arch/x86/kernel/acpi/sleep.c:75: error: for each function it appears in.) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:26 +02:00
Glauber Costa	7f6cbc905e	x86: take load_sp0 out of smpboot.c there's no particular reason to do load_sp0 in different places for i386 and x86_64. They should all be in cpu_init. Right now, cpu_init itself is not integrated, but with this patch, the code becomes closer to each other, making it easier to integrate when the time comes. Furthermore, although doing it in do_boot_cpu for x86_64 is fine, since it's only a copy, load_sp0 should be executed in the cpu it refers to anyway. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:25 +02:00
Glauber Costa	1481a3dd42	x86: move cpu_exit_clear to process_32.c Take it out of smpboot.c, and move it to process_32.c, closer to its only user. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:24 +02:00
Glauber Costa	b553a1e0ff	x86: remove cpu from maps during cpu disable, take cpus out of all maps in i386, instead of just the online map. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:23 +02:00
Glauber Costa	78e622705c	x86: change naming to match x86_64 Change unmap_cpu_to_logical_apicid to numa_remove_cpu. Besides being shorter, it is the same name x86_64 uses. We can save an ifdef in the code this way. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:23 +02:00
Glauber Costa	b5841765a2	x86: provide connect_bsp_APIC for x86_64 Although it is not really needed, we provide it to get closer to i386. ifdefs around it are removed in smpboot.c Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:22 +02:00
Glauber Costa	3fde690011	x86: change __setup_vector_irq with setup_vector_irq We create a version of it for i386, and then take the CONFIG_X86_64 ifdef out of the game. We could create a __setup_vector_irq for i386, but it would incur in an unnecessary lock taking. Moreover, it is better practice to only export setup_vector_irq anyway. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:21 +02:00
Glauber Costa	86e430edf4	x86: remove ifdef from stepping The stepping won't affect x86_64, since there are not x86_64 k7's or pentiums. So, although it adds to the binary size, remove the ifdef for smoother integration Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:20 +02:00
Glauber Costa	0f385d1ddd	x86: clearing io_apic harmless for x86_64 so remove ifdef. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:19 +02:00
Glauber Costa	3e9704739d	x86: boot secondary cpus through initial_code remove "initialize_secondary". Boot both architectures via initial_code. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:18 +02:00
Glauber Costa	e3f77edfc1	x86: use initial_code for i386 x86_64 jumps to whatever is written in "initial_code" symbol, instead of a fixed address. Do it for i386 too. It will allow us to integrate more of the smp boot code. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:17 +02:00
Glauber Costa	a939098afc	x86: move x86_64 gdt closer to i386 i386 and x86_64 used two different schemes for maintaining the gdt. With this patch, x86_64 initial gdt table is defined in a .c file, same way as i386 is now. Also, we call it "gdt_page", and the descriptor, "early_gdt_descr". This way we achieve common naming, which can allow for more code integration. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:16 +02:00
Glauber Costa	736f12bff9	x86: don't use gdt_page openly. There's a macro available for that. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:15 +02:00
Glauber Costa	9cf4f298e2	x86: use stack_start in x86_64 call x86_64's init_rsp stack_start, just as i386 does. Put a zeroed stack segment for consistency. With this, we can eliminate one ugly ifdef in smpboot.c. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:14 +02:00
Jeremy Fitzhardinge	a7bf0bd5e6	build: add __page_aligned_data and __page_aligned_bss Making a variable page-aligned by using __attribute__((section(".data.page_aligned"))) is fragile because if sizeof(variable) is not also a multiple of page size, it leaves variables in the remainder of the section unaligned. This patch introduces two new qualifiers, __page_aligned_data and __page_aligned_bss to set the section and the alignment of variables. This makes page-aligned variables more robust because the linker will make sure they're aligned properly. Unfortunately it requires all page-aligned data to use these macros... Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:48:13 +02:00
Bernhard Walle	1ecd27657b	x86: unify crashkernel reservation for 32 and 64 bit This patch moves the reserve_crashkernel() to setup.c and removes the architecture-specific version. Both versions were more or less the same. I tested it on both x86-64 and i386, with CONFIG_KEXEC on and off (so that it compiles). Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: yhlu.kernel@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:45:44 +02:00
Ingo Molnar	6236af82d8	Merge branch 'x86/fixmap' into x86/devel Conflicts: arch/x86/mm/init_64.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:24:29 +02:00
Ingo Molnar	e3ae0acf59	Merge branch 'x86/uv' into x86/devel	2008-07-08 12:24:13 +02:00
Cliff Wickman	e7eb8726d0	x86, SGI UV: uv_ptc_proc_write fix Someone could write 0 bytes to /proc/sgi_uv/ptc_statistics, causing optstr[count - 1] = '\0'; to write to who-knows-where. (Andi Kleen noticed this need from a patch I sent for similar code in the ia64 world (sn2_ptc_proc_write()).) (count less than zero is not possible here, as count is unsigned) Signed-off-by: Cliff Wickman <cpw@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:23:31 +02:00
Cliff Wickman	cef5327868	x86, SGI UV: TLB shootdown using broadcast assist unit, v6 v6: 6/19 close the security hole in uv_ptc_proc_write()) > Found a potential security hole while doing that: > static ssize_t uv_ptc_proc_write(struct file *file, const char __user *user, > size_t count, loff_t *data) > if (copy_from_user(optstr, user, count)) > return -EFAULT; > > is count guaranteed to never be larger than 64? is fixed below. It adds tlb_uv.o to the Makefile. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: mingo@elte.hu Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:23:30 +02:00
Jack Steiner	b6df1b8bc1	x86: fix stack overflow for large values of MAX_APICS physid_mask_of_physid() causes a huge stack (12k) to be created if the number of APICS is large. Replace physid_mask_of_physid() with a new function that does not create large stacks. This is a problem only on large x86_64 systems. this paves the way to increase MAX_APICS. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: linux-mm@kvack.org Cc: mingo@elte.hu Cc: tglx@linutronix.de Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:23:28 +02:00
Ingo Molnar	d400524aff	SGI UV: TLB shootdown using broadcast assist unit, fix fix: arch/x86/kernel/tlb_uv.c: In function ‘uv_table_bases_init’: arch/x86/kernel/tlb_uv.c:612: error: ‘bau_tabsp’ undeclared (first use in this function) arch/x86/kernel/tlb_uv.c:612: error: (Each undeclared identifier is reported only once arch/x86/kernel/tlb_uv.c:612: error: for each function it appears in.) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:23:27 +02:00
Ingo Molnar	b4c286e6af	SGI UV: clean up arch/x86/kernel/tlb_uv.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:23:26 +02:00
Ingo Molnar	dc163a41ff	SGI UV: TLB shootdown using broadcast assist unit TLB shootdown for SGI UV. v5: 6/12 corrections/improvements per Ingo's second review Signed-off-by: Cliff Wickman <cpw@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:23:25 +02:00
Cliff Wickman	b194b12050	SGI UV: TLB shootdown using broadcast assist unit, cleanups TLB shootdown for SGI UV. v1: 6/2 original v2: 6/3 corrections/improvements per Ingo's review v3: 6/4 split atomic operations off to a separate patch (Jeremy's review) v4: 6/12 include <mach_apic.h> rather than <asm/mach-bigsmp/mach_apic.h> (fixes a !SMP build problem that Ingo found) fix the index on uv_table_bases[blade] Signed-off-by: Cliff Wickman <cpw@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:23:24 +02:00
Cliff Wickman	1812924bb1	x86, SGI UV: TLB shootdown using broadcast assist unit TLB shootdown for SGI UV. Depends on patch (in tip/x86/irq): x86-update-macros-used-by-uv-platform.patch Jack Steiner May 29 This patch provides the ability to flush TLB's in cpu's that are not on the local node. The hardware mechanism for distributing the flush messages is the UV's "broadcast assist unit". The hook to intercept TLB shootdown requests is a 2-line change to native_flush_tlb_others() (arch/x86/kernel/tlb_64.c). This code has been tested on a hardware simulator. The real hardware is not yet available. The shootdown statistics are provided through /proc/sgi_uv/ptc_statistics. The use of /sys was considered, but would have required the use of many /sys files. The debugfs was also considered, but these statistics should be available on an ongoing basis, not just for debugging. Issues to be fixed later: - The IRQ for the messaging interrupt is currently hardcoded as 200 (see UV_BAU_MESSAGE). It should be dynamically assigned in the future. - The use of appropriate udelay()'s is untested, as they are a problem in the simulator. Signed-off-by: Cliff Wickman <cpw@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:23:22 +02:00
Ingo Molnar	d98b940ab2	Merge branch 'linus' into x86/irq	2008-07-08 12:23:00 +02:00
Ingo Molnar	4b62ac9a2b	Merge branch 'x86/nmi' into x86/devel Conflicts: arch/x86/kernel/nmi.c arch/x86/kernel/nmi_32.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:17:08 +02:00
Ingo Molnar	2b4fa851b2	Merge branch 'x86/numa' into x86/devel Conflicts: arch/x86/Kconfig arch/x86/kernel/e820.c arch/x86/kernel/efi_64.c arch/x86/kernel/mpparse.c arch/x86/kernel/setup.c arch/x86/kernel/setup_32.c arch/x86/mm/init_64.c include/asm-x86/proto.h Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:59:23 +02:00
Bernhard Walle	46f68e1c6b	x86: use reserve_bootmem_generic() to reserve crashkernel memory on x86_64 This patch uses reserve_bootmem_generic() instead of reserve_bootmem() to reserve the crashkernel memory on x86_64. That's necessary for NUMA machines, see `00212fef81`: [PATCH] Fix kdump Crash Kernel boot memory reservation for NUMA machines This patch will fix a boot memory reservation bug that trashes memory on the ES7000 when loading the kdump crash kernel. The code in arch/x86_64/kernel/setup.c to reserve boot memory for the crash kernel uses the non-numa aware "reserve_bootmem" function instead of the NUMA aware "reserve_bootmem_generic". I checked to make sure that no other function was using "reserve_bootmem" and found none, except the ones that had NUMA ifdef'ed out. I have tested this patch only on an ES7000 with NUMA on and off (numa=off) in a single (non-NUMA) and multi-cell (NUMA) configurations. Signed-off-by: Amul Shah <amul.shah@unisys.com> Looks-good-to: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> The switch-back to reserve_bootmem() was accidentally introduced in `5c3391f9f7` when adding the BOOTMEM_EXCLUSIVE parameter. Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:49:52 +02:00
Bernhard Walle	3fd052b1b4	x86: add flags parameter to reserve_bootmem_generic() This patch adds a 'flags' parameter to reserve_bootmem_generic() like it already has been added in reserve_bootmem() with commit `72a7fe3967`. It also changes all users to use BOOTMEM_DEFAULT, which doesn't effectively change the behaviour. Since the change is x86-specific, I don't think it's necessary to add a new API for migration. There are only 4 users of that function. The change is necessary for the next patch, using reserve_bootmem_generic() for crashkernel reservation. Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:49:49 +02:00
Randy Dunlap	053713f574	x86: fix setup.c printk format warning Fix setup.c printk format warning: linux-next-20080605/arch/x86/kernel/setup.c: In function 'setup_per_cpu_areas': linux-next-20080605/arch/x86/kernel/setup.c:173: warning: format '%lu' expects type 'long unsigned int', but argument 2 has type 'ssize_t' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:31:32 +02:00
Vegard Nossum	03db1f74a7	x86: don't return invalid pointers from node_to_cpumask() Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:31:31 +02:00
Thomas Gleixner	886533a3e3	x86: numa_64.c fix shadowed variable sparse mutters: arch/x86/mm/numa_64.c:195:27: warning: symbol 'end_pfn' shadows an earlier one Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:31:29 +02:00
Thomas Gleixner	864fc31ea5	x86: numa_64.c make local variables static plat_node_bdata, cmdline, nodemap_addr, nodemap_size are local to numa_64.c. Make them static Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:31:28 +02:00
Jeremy Fitzhardinge	f307d25e63	x86: compile error fix for smpboot.c Without this patch, my link fails with: arch/x86/kernel/built-in.o(.cpuinit.text+0x3c6e): In function `get_local_pda': : undefined reference to `_cpu_pda' arch/x86/kernel/built-in.o(.cpuinit.text+0x3cd1): In function `get_local_pda': : undefined reference to `after_bootmem' arch/x86/kernel/built-in.o(.cpuinit.text+0x3cec): In function `get_local_pda': : undefined reference to `_cpu_pda' make[2]: *** [.tmp_vmlinux1] Error 1 Caused by commit 766da892634694f795b18b9538407816896fc470 x86: remove static boot_cpu_pda array v2 Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 11:31:27 +02:00
Mike Travis	5deb0b2a25	x86: leave initial __cpu_pda array in place until cpus are booted Ingo Molnar wrote: ... > they crashed after about 3 randconfig iterations with: > > early res: 4 [8000-afff] PGTABLE > early res: 5 [b000-b87f] MEMNODEMAP > PANIC: early exception 0e rip 10:ffffffff8077a150 error 2 cr2 37 > Pid: 0, comm: swapper Not tainted 2.6.25-sched-devel.git-x86-latest.git #14 > > Call Trace: > [<ffffffff81466196>] early_idt_handler+0x56/0x6a > [<ffffffff8077a150>] ? numa_set_node+0x30/0x60 > [<ffffffff8077a129>] ? numa_set_node+0x9/0x60 > [<ffffffff8147a543>] numa_init_array+0x93/0xf0 > [<ffffffff8147b039>] acpi_scan_nodes+0x3b9/0x3f0 > [<ffffffff8147a496>] numa_initmem_init+0x136/0x150 > [<ffffffff8146da5f>] setup_arch+0x48f/0x700 > [<ffffffff802566ea>] ? clockevents_register_notifier+0x3a/0x50 > [<ffffffff81466a87>] start_kernel+0xd7/0x440 > [<ffffffff81466422>] x86_64_start_kernel+0x222/0x280 ... Here's the fixup... This one should follow the previous patches. Thanks, Mike Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 11:31:26 +02:00
Mike Travis	3461b0af02	x86: remove static boot_cpu_pda array v2 * Remove the boot_cpu_pda array and pointer table from the data section. Allocate the pointer table and array during init. do_boot_cpu() will reallocate the pda in node local memory and if the cpu is being brought up before the bootmem array is released (after_bootmem = 0), then it will free the initial pda. This will happen for all cpus present at system startup. This removes 512k + 32k bytes from the data section. For inclusion into sched-devel/latest tree. Based on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + sched-devel/latest .../mingo/linux-2.6-sched-devel.git Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 11:31:25 +02:00
Mike Travis	9f248bde9d	x86: remove the static 256k node_to_cpumask_map * Consolidate node_to_cpumask operations and remove the 256k byte node_to_cpumask_map. This is done by allocating the node_to_cpumask_map array after the number of possible nodes (nr_node_ids) is known. * Debug printouts when CONFIG_DEBUG_PER_CPU_MAPS is active have been increased. It now shows faults when calling node_to_cpumask() and node_to_cpumask_ptr(). For inclusion into sched-devel/latest tree. Based on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + sched-devel/latest .../mingo/linux-2.6-sched-devel.git Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 11:31:24 +02:00
Mike Travis	7891a24e1e	x86: restore pda nodenumber field * Restore the nodenumber field in the x86_64 pda. This field is slightly different than the x86_cpu_to_node_map mainly because it's a static indication of which node the cpu is on while the cpu to node map is a dynamic mapping that may get reset if the cpu goes offline. This also simplifies the numa_node_id() macro. For inclusion into sched-devel/latest tree. Based on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + sched-devel/latest .../mingo/linux-2.6-sched-devel.git Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 11:31:23 +02:00
Mike Travis	23ca4bba3e	x86: cleanup early per cpu variables/accesses v4 * Introduce a new PER_CPU macro called "EARLY_PER_CPU". This is used by some per_cpu variables that are initialized and accessed before there are per_cpu areas allocated. ["Early" in respect to per_cpu variables is "earlier than the per_cpu areas have been setup".] This patchset adds these new macros: DEFINE_EARLY_PER_CPU(_type, _name, _initvalue) EXPORT_EARLY_PER_CPU_SYMBOL(_name) DECLARE_EARLY_PER_CPU(_type, _name) early_per_cpu_ptr(_name) early_per_cpu_map(_name, _idx) early_per_cpu(_name, _cpu) The DEFINE macro defines the per_cpu variable as well as the early map and pointer. It also initializes the per_cpu variable and map elements to "_initvalue". The early_* macros provide access to the initial map (usually setup during system init) and the early pointer. This pointer is initialized to point to the early map but is then NULL'ed when the actual per_cpu areas are setup. After that the per_cpu variable is the correct access to the variable. The early_per_cpu() macro is not very efficient but does show how to access the variable if you have a function that can be called both "early" and "late". It tests the early ptr to be NULL, and if not then it's still valid. Otherwise, the per_cpu variable is used instead: #define early_per_cpu(_name, _cpu) \ (early_per_cpu_ptr(_name) ? \ early_per_cpu_ptr(_name)[_cpu] : \ per_cpu(_name, _cpu)) A better method is to actually check the pointer manually. In the case below, numa_set_node can be called both "early" and "late": void __cpuinit numa_set_node(int cpu, int node) { int *cpu_to_node_map = early_per_cpu_ptr(x86_cpu_to_node_map); if (cpu_to_node_map) cpu_to_node_map[cpu] = node; else per_cpu(x86_cpu_to_node_map, cpu) = node; } Add a flag "arch_provides_topology_pointers" that indicates pointers to topology cpumask_t maps are available. Otherwise, use the function returning the cpumask_t value. This is useful if cpumask_t set size is very large to avoid copying data on to/off of the stack. * The coverage of CONFIG_DEBUG_PER_CPU_MAPS has been increased while the non-debug case has been optimized a bit. * Remove an unreferenced compiler warning in drivers/base/topology.c * Clean up #ifdef in setup.c For inclusion into sched-devel/latest tree. Based on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + sched-devel/latest .../mingo/linux-2.6-sched-devel.git Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 11:31:20 +02:00
Mike Travis	1184dc2ffe	x86: modify Kconfig to allow up to 4096 cpus * Increase the limit of NR_CPUS to 4096 and introduce a boolean called "MAXSMP" which when set (e.g. "allyesconfig"), will set NR_CPUS = 4096 and NODES_SHIFT = 9 (512). * Changed max setting for NODES_SHIFT from 15 to 9 to accurately reflect the real limit. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 11:30:42 +02:00
Mike Travis	7496b60654	x86: fix remove cpu_pda table patch Mike Travis wrote: > Ingo Molnar wrote: >> * Mike Travis <travis@sgi.com> wrote: >> >>> [Ingo - please replace "PATCH 07/11" with this one.] >>> >>> * Remove 544k bytes from the kernel by removing the boot_cpu_pda >>> array from the data section and allocating it during startup. >>> >>> Fixed panic in setup_per_cpu_areas when HOTPLUG_CPU not set. >>> >>> For inclusion into sched-devel/latest tree. >> sched-devel.git randconfig testing found another crash with your queue: >> >> [ 0.111060] Brought up 1 CPUs >> [ 0.111986] Total of 1 processors activated (4022.73 BogoMIPS). >> [ 0.112987] Testing NMI watchdog ... <1>BUG: unable to handle kernel NULL pointer dereference at 0000000000000040 >> [ 0.114982] IP: [<ffffffff8180d4a0>] check_nmi_watchdog+0xb0/0x210 >> [ 0.114982] PGD 0 >> [ 0.114982] Oops: 0000 [1] SMP >> [ 0.114982] CPU 0 >> [............] >> >> http://redhat.com/~mingo/misc/config-Mon_Apr_28_23_25_25_CEST_2008.bad >> http://redhat.com/~mingo/misc/log-Mon_Apr_28_23_25_25_CEST_2008.bad >> >> Ingo > > Hi Ingo, > > I need a bit more information on your hardware configuration. Building a > kernel with the above config file started up fine on both the Intel and AMD > boxes. > > Based on the above output it looks like it might be a UP machine? ... Ok, I think I found it. In check_nmi_watchdog(): for (cpu = 0; cpu < NR_CPUS; cpu++) prev_nmi_count[cpu] = cpu_pda(cpu)->__nmi_count; As I mentioned it works fine on both of my systems so could you try it out? Thanks! Mike -- * Change function check_nmi_watchdog() to use nr_cpu_ids instead of NR_CPUS. Based on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + sched-devel/latest .../mingo/linux-2.6-sched-devel.git Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 11:28:47 +02:00
Yinghai Lu	dbb6152e6f	x86: don't call pxm_to_node again also make bus_numa work even if ACPI_NUMA is not defined. don't call pxm_to_node again, and use node directly. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:28:46 +02:00
Yinghai Lu	b755de8dfd	x86: make dev_to_node return online node a numa system (with multi HT chains) may return node without ram. Aka it is not online. Try to get an online node, otherwise return -1. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 11:28:43 +02:00
Ingo Molnar	3de352bbd8	Merge branch 'x86/mpparse' into x86/devel Conflicts: arch/x86/Kconfig arch/x86/kernel/io_apic_32.c arch/x86/kernel/setup_64.c arch/x86/mm/init_32.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:14:58 +02:00
Matthew Garrett	9340e1ccdf	x86, ioapic, acpi quirk: disable IRQ 0 through I/O APIC for some HP systems Some HP laptops have a problem with their DSDT reporting as HP/SB400/10000, which includes some code which overrides all temperature trip points to 16C if the INTIN2 input of the I/O APIC is enabled. This input is incorrectly designated the ISA IRQ 0 via an interrupt source override even though it is wired to the output of the master 8259A and INTIN0 is not connected at all. So far two models have been identified, namely nx6125 and nx6325. Use a knob provided by the I/O APIC interrupt registration code to abandon any attempts to route IRQ 0 through the I/O APIC for these systems. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Cc: Len Brown <lenb@kernel.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:10:41 +02:00
Maciej W. Rozycki	471694ea6c	x86, ioapic, acpi: add a knob to disable IRQ 0 through I/O APIC As discovered recently some systems exhibit problems when the 8254 timer IRQ is routed through the I/O APIC. These problems do not affect the timer IRQ itself and therefore cannot be detected when the correctness of operation of the interrupt is verified in check_timer(). Therefore the I/O APIC path of the timer IRQ has to be disabled entirely. This is a change that lets platforms ask for the timer IRQ not to be registered in the I/O APIC interrupt tables. The local APIC and ExtINTA paths are unaffected. This request is only taken into account for ACPI platforms as MP table systems seem unaffected so far. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Cc: Len Brown <lenb@kernel.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:10:32 +02:00
Yinghai Lu	bad48f4b31	x86: simplify x86_mpparse dependency check "Maciej W. Rozycki" <macro@linux-mips.org> said: > Given X86_64 selects X86_LOCAL_APIC I am not sure the redundancy seen >above does not actually obscure the logic behind... I think: > > depends on X86_LOCAL_APIC && !X86_VISWS > >would be clearer and get the same. Suggested-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:10:25 +02:00
Yinghai Lu	a4caa18efe	x86: fix compiling when CONFIG_X86_MPPARSE is not set Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:39:19 +02:00
Yinghai Lu	6695c85b2e	x86: let MPS support be selectable, v2 v2: separate "fix for compiling when MPPARSE is not set" to another patch make X86_MPPARSE to be selectable only when acpi is set and X86_MPPARSE will be set if acpi is not set. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:39:14 +02:00
Yinghai Lu	fcfa146e41	x86: update mptable fix with no ioapic v2 if the system doesn't have ioapic, we don't need to store entries for mptable update also let mp_config_acpi_gsi not call func in mpparse so later could decouple mpparse with acpi more easily Reported-by: Daniel Exner <dex@dragonslave.de> Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Daniel Exner <dex@dragonslave.de> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:39:07 +02:00
Yinghai Lu	95a71a45c2	x86: cleanup machine_specific_memory_setup, v2 1. let 64bit support 88 and e801 too 2. introduce default_machine_specific_memory_setup, and reuse it for voyager v2: fix 64 bit compiling Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:39:01 +02:00
Yinghai Lu	1c6e55032e	x86: use acpi_numa_init to parse on 32-bit numa separate SRAT finding and parsing from get_memcfg_from_srat, and let getmemcfg_from_srat only handle array from previous step. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:38:47 +02:00
Yinghai Lu	0699eae140	x86: Kconfig cleanup with genericarch we already have summit and etc depends on genericarch, so use genericarch only. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:38:41 +02:00
Yinghai Lu	593a0cc390	x86: move some function out of setup_bootmem_alloc ... to make it more like 64-bit. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:38:34 +02:00
Yinghai Lu	064d25f120	x86: merge setup_memory_map with e820 ... and kill e820_32/64.c and e820_32/64.h Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:38:25 +02:00
Yinghai Lu	cc9f7a0ccf	x86: kill bad_ppro so don't punish all other cpus without that problem when init highmem Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:38:19 +02:00
Yinghai Lu	41c094fd3c	x86: move e820_resource_resources to e820.c and make 32-bit resource registration more like 64 bit. also move probe_roms back to setup_32.c Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:38:14 +02:00
Huang, Ying	8c5beb50d3	x86 boot: pass E820 memory map entries more than 128 via linked list of setup data Because of the size limits of struct boot_params (zero page), the maximum number of E820 memory map entries can be passed to kernel is 128. As pointed by Paul Jackson, there is some machine produced by SGI with so many nodes that the number of E820 memory map entries is more than 128. To enabling Linux kernel on these system, a new setup data type named SETUP_E820_EXT is defined to pass additional memory map entries to Linux kernel. This patch is based on x86/auto-latest branch of git-x86 tree and has been tested on x86_64 and i386 platform. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:37:39 +02:00
Yinghai Lu	b5bc6c0e55	x86, mm: use add_highpages_with_active_regions() for high pages init v2 use early_node_map to init high pages, so we can remove page_is_ram() and page_is_reserved_early() in the big loop with add_one_highpage also remove page_is_reserved_early(), it is not needed anymore. v2: fix the build of other platforms Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:37:25 +02:00
Yinghai Lu	d0be6bdea1	x86: rename two e820 related functions rename update_memory_range to e820_update_range rename add_memory_region to e820_add_region to make it more clear that they are about e820 map operations. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:37:01 +02:00
Yinghai Lu	6df8809bbd	x86: use dstapic in mp_config_acpi_legacy_irqs so we don't get the same value multiple times. also make mp_config_acpi_legacy_irqs more readable by moving assignments together. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:36:55 +02:00
Yinghai Lu	d867e5310b	x86: keep MP_intsrc_info untouched if we do not update mptable Daniel Exner reported IO-APIC enumeration breakage in linux-next. Alexey Starikovskiy found out that it might be related to commit `2944e16b25` "x86: update mptable". use enable_update_mptable to decide if need check before add mp_irqs array. Reported-by: Daniel Exner <webmaster@dragonslave.de> Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:36:40 +02:00
Yinghai Lu	9a27f5c516	x86: clean up relocate_initrd 1. move that before zone_sizes_init ... 2. add free_early for one old one, otherwise it will be reserved again when we init highmem. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:36:34 +02:00
Yinghai Lu	cc1050bafe	x86: replace shrink_active_range() with remove_active_range() in case we have kva before ramdisk on a node, we still need to use those ranges. v2: reserve_early kva ram area, in case there are holes in highmem, to avoid those area could be treat as free high pages. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:36:29 +02:00
Yinghai Lu	d2dbf34332	x86: clean up reserve_bootmem_generic() and port it to 32-bit 1. add reserve_bootmem_generic for 32bit 2. change len to unsigned long 3. make early_res_to_bootmem to use it Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:36:17 +02:00
Yinghai Lu	b1f006b65c	x86: make generic arch support NUMAQ, fix #2 we are checking mptable early for numaq, so don't need to reserve_bootmem for it. bootmem is not there yet. do the same thing as 64-bit. found it on 64g above system from 64-bit kernel kexec to 32 bit kernel with numaq support. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:36:04 +02:00
Yinghai Lu	b20d70b70e	x86: make generic arch support NUMAQ, fix fix typo in bigsmp switching. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:35:45 +02:00
Yinghai Lu	ab4a465e96	x86: e820 merge parsing of the mem=/memmap= boot parameters since we now have 32-bit support for e820_register_active_regions(), we can merge the parsing of the mem=/memmap= boot parameters. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:35:38 +02:00
Ingo Molnar	df5f6c212c	x86: unify the reserve_bootmem() behavior of early_res_to_bootmem() Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:35:31 +02:00
Bernhard Walle	12448e3ef3	x86: use reserve_bootmem_generic() to reserve crashkernel memory on x86_64 This patch uses reserve_bootmem_generic() instead of reserve_bootmem() to reserve the crashkernel memory on x86_64. That's necessary for NUMA machines, see `00212fef81`: [PATCH] Fix kdump Crash Kernel boot memory reservation for NUMA machines This patch will fix a boot memory reservation bug that trashes memory on the ES7000 when loading the kdump crash kernel. The code in arch/x86_64/kernel/setup.c to reserve boot memory for the crash kernel uses the non-numa aware "reserve_bootmem" function instead of the NUMA aware "reserve_bootmem_generic". I checked to make sure that no other function was using "reserve_bootmem" and found none, except the ones that had NUMA ifdef'ed out. I have tested this patch only on an ES7000 with NUMA on and off (numa=off) in a single (non-NUMA) and multi-cell (NUMA) configurations. Signed-off-by: Amul Shah <amul.shah@unisys.com> Looks-good-to: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> The switch-back to reserve_bootmem() was accidentally introduced in `5c3391f9f7` when adding the BOOTMEM_EXCLUSIVE parameter. Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:35:13 +02:00
Bernhard Walle	8b2ef1d728	x86: add flags parameter to reserve_bootmem_generic() This patch adds a 'flags' parameter to reserve_bootmem_generic() like it already has been added in reserve_bootmem() with commit `72a7fe3967`. It also changes all users to use BOOTMEM_DEFAULT, which doesn't effectively change the behaviour. Since the change is x86-specific, I don't think it's necessary to add a new API for migration. There are only 4 users of that function. The change is necessary for the next patch, using reserve_bootmem_generic() for crashkernel reservation. Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:34:54 +02:00
Ingo Molnar	896395c290	Merge branch 'linus' into tmp.x86.mpparse.new	2008-07-08 10:32:56 +02:00
Ingo Molnar	1b8ba39a3f	Merge branch 'x86/irq' into x86/devel Conflicts: arch/x86/kernel/i8259.c arch/x86/kernel/irqinit_64.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:53:57 +02:00
Ingo Molnar	58cf35228f	Merge branches 'x86/mmio', 'x86/delay', 'x86/idle', 'x86/oprofile', 'x86/debug', 'x86/ptrace' and 'x86/amd-iommu' into x86/devel	2008-07-08 09:46:15 +02:00
Ingo Molnar	3c1ca43faf	Merge branch 'x86/setup' into x86/devel	2008-07-08 09:43:01 +02:00
Ingo Molnar	6924d1ab8b	Merge branches 'x86/numa-fixes', 'x86/apic', 'x86/apm', 'x86/bitops', 'x86/build', 'x86/cleanups', 'x86/cpa', 'x86/cpu', 'x86/defconfig', 'x86/gart', 'x86/i8259', 'x86/intel', 'x86/irqstats', 'x86/kconfig', 'x86/ldt', 'x86/mce', 'x86/memtest', 'x86/pat', 'x86/ptemask', 'x86/resumetrace', 'x86/threadinfo', 'x86/timers', 'x86/vdso' and 'x86/xen' into x86/devel	2008-07-08 09:16:56 +02:00
Christophe Jaillet	25556c1699	x86, arch/x86/kernel/io_apic_32.c: use kzalloc instead of kmalloc/memset 1) replace kmalloc/memset with equivalent kzalloc. Signed-off-by: Christophe Jaillet <jaillet.christophe@wanadoo.fr> Cc: cj <jaillet.christophe@wanadoo.fr> Cc: petero2@telia.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:25 +02:00
Maciej W. Rozycki	7f0dbbc08d	x86: fix IO APIC breakage on HP nx6325, v2 > That helped a lot, the system seems to work normally now. > > Here's the relevant snippet from dmesg: > > [ 0.108006] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 > [ 0.108006] ..MP-BIOS bug: 8254 timer not connected to IO-APIC > [ 0.108006] ...trying to set up timer (IRQ0) through the 8259A ... <3> > [ 0.108006] ..... (found apic 0 pin 2) ...<3> failed. > [ 0.108006] ...trying to set up timer as Virtual Wire IRQ...<3> works. > > and the whole thing is at: http://www.sisk.pl/kernel/debug/20080618/dmesg-2.log Hmm, that only proved the 8259A is indeed wired to the pin #2 of the I/O APIC. > I, personally, don't have any and AMD only has SB600 documentation on its > web page (it's still marked as "AMD confidential" ;-)). Well, the IC block is most likely the same as that's not rocket science and once done there is no need to fiddle with that. That written, I am afraid there is nothing useful about the IC in the document, except that it's there and consists of an I/O APIC providing 24 inputs and the usual pair of 8259A cores. Thanks for the reference anyway. > There is an interrupt controller in there, but I'm not sure if there's any > 8259A. The northbridge is on the CPU, actually. I will praise the day someone ships an x86 machine without an 8259A core! As expressed in another mail I suspect there may actually be a direct route from the 8254 to INTIN0 in the southbridge -- this is what other bootstrap logs seen in the Internet suggest. This would mean this particular BIOS is buggy (is it the latest version?) and provides an incorrect IRQ override in its ACPI tables, for example because the responsible block has been blindly copied from a machine using a commoner wiring. This could be moderately easily fixed up with a quirk based on the PCI ID (after checking it again, we actually used to have a quirk for ATI in this area, but the way it was done suggests the issue was not understood well enough). Could you please remove the hack sent yesterday and test the patch provided below? I do hope it builds, but I have no immediate means to check it. Please report the output. The intent is to test INTIN0 directly before testing INTIN2 through the 8259A. Thanks. Aside of that, what I have gathered from your reports (please correct me if I have got it wrong) is that when the through-8259A mode is used, then after a while 8254 timer interrupts stop arriving. What's interesting, the "Virtual Wire IRQ" seems to work for you correctly (that's quite an odd setup where a local APIC input is used in the native mode -- please post /proc/interrupts for confirmation), which in turn implies the master 8259A drives its INT output as we expect. Why would the I/O APIC input have problems then? Hmm... [ mingo@elte.hu: revert the "x86: fix IO APIC breakage on HP nx6325" version. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:24 +02:00
Maciej W. Rozycki	cd08d0754e	x86: fix IO APIC breakage on HP nx6325 On Thu, 19 Jun 2008, Rafael J. Wysocki wrote: > > With such a configuration the "x86: I/O APIC: timer through 8259A > > second-chance" patch should not matter, because the only change it > > introduces is an attempt to try the same I/O APIC pin again, but with the > > IRQ0 line of the master 8259A enabled. That's not a terribly unusual > > configuration and nothing should get confused in the system. > > But it _does_ get confused, really. Something certainly gets confused, but so far I am not sure which bit exactly it is, are you? > > Barring the unlikely possibility of the 8259A actually being wired to > > INTIN2 of the I/O APIC I can see two possible explanations: > > > > 1. The 8259A interrupt actually escapes to the CPU somehow and is handled > > as an ExtINTA interrupt. This would make the code in check_timer() > > decide it has found a working configuration, while actually it has been > > fooled. [...] > Here you go: > > [ 0.108006] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 > [ 0.108006] ..MP-BIOS bug: 8254 timer not connected to IO-APIC > [ 0.108006] ...trying to set up timer (IRQ0) through the 8259A ... <3> > [ 0.108006] ..... (found apic 0 pin 2) ...<3> works. > > The full dmesg is at: http://www.sisk.pl/kernel/debug/20080618/dmesg-1.log Thanks. In this case I suspect the case #1 quoted above happens, that is the 8259A manages to deliver its interrupt somehow. Note at this stage it is meant to be in the AEOI mode, so it can happily resubmit the interrupt indefinitely with no additional handling as long as it receives INTA cycles. Can you please try the patch below on top of "x86: I/O APIC: timer through 8259A second-chance" to see whether my hypothesis is true? It modifies the through-8259A setup path so that the APIC input gets masked, but the 8259A has the timer interrupt still enabled. Let me know how the timer interrupt is routed in this case. Bisected-by: "Rafael J. Wysocki" <rjw@sisk.pl> Tested-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:24 +02:00
Paolo Ciarrocchi	360624484c	x86: coding style fixes to arch/x86/kernel/io_apic_32.c Before: total: 91 errors, 73 warnings, 2850 lines checked After: total: 1 errors, 47 warnings, 2848 lines checked Compile tested: paolo@paolo-desktop:/tmp$ size io* text data bss dec hex filename 13836 1756 11104 26696 6848 io_apic_32.o.after 13836 1756 11104 26696 6848 io_apic_32.o.before Signed-off-by: Paolo Ciarrocchi <paolo.ciarrocchi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:23 +02:00
Cyrill Gorcunov	46b3b4ef1e	x86, io-apic: use predefined names instead of numeric constants This patch replaces some hard-coded numbers with predefined names. Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:22 +02:00
Maciej W. Rozycki	d54db1ac9e	x86: APIC/SMP: Downgrade the NMI watchdog for "nosmp" If configured to use the I/O APIC, the NMI watchdog is deemed to fail if the chip has been deactivated as a result of "nosmp". Downgrade to the local APIC watchdog similarly to what is done for the UP case. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:19 +02:00
Maciej W. Rozycki	1966202748	x86: APIC/UP: Remove redundant NMI watchdog downgrade For the UP case the NMI watchdog downgrade is done consistently in APIC_init_uniprocessor() now. Remove redundant code used only when BIOS-disabled local APIC is activated. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:18 +02:00
Maciej W. Rozycki	acae7d906f	x86: APIC/UP: Downgrade the NMI watchdog for no I/O APIC If configured to use the I/O APIC, the NMI watchdog is deemed to fail if the chip will not be used in the UP configuration, because "noapic" has been specified or the chip is simply not there. Downgrade to the local APIC watchdog to rectify. The new #ifdef is ugly, I know. A proper solution is to provide suitable definitions of smp_found_config, etc. for !CONFIG_X86_IO_APIC in a header. Likewise the whole if () condition should be moved to a static inline function. Such clean-ups are beyond the scope of this change and can be done once the whole issue of the timer has been sorted out. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:17 +02:00
Ingo Molnar	6fe9fe8756	Revert "x86: APIC/SMP: downgrade the NMI watchdog for "nosmp"" This reverts commit 791b93d3dfaf16c23e978bec0cc0a3dd9d855d63. A better fix from Maciej will be merged.	2008-07-08 09:13:15 +02:00
Ingo Molnar	ab5a5be099	Revert "x86, io-apic: fix nmi_watchdog=1 bootup hang" This reverts commit 2229ff84f01746d02fb6b79e156fb5cce48c908f. A better fix from Maciej will be merged.	2008-07-08 09:13:14 +02:00
Ingo Molnar	ff11571b25	x86, io-apic: fix nmi_watchdog=1 bootup hang nmi_watchdog=1 hangs on 64-bit: [ 0.250000] Detected 12.564 MHz APIC timer. [ 0.254178] APIC timer registered as dummy, due to nmi_watchdog=1! [ 0.260366] Testing NMI watchdog ... <4>WARNING: CPU#0: NMI appears to be stuck (0->0)! [ ... ] [ 0.470003] calling genl_init+0x0/0xd0 [ hard hang ] bisected it down to: git-bisect start git-bisect good `1beee8dc8c` git-bisect bad 11582ece0aaa2d0f94f345c08a4ab9997078a083 git-bisect bad 5479c623bb44089844022c03d4c0eb16d5b7a15f git-bisect bad cfb4c7fabeb499e1c29f9d1878968e37a938e28a git-bisect good `246dd412d3` git-bisect bad 3f8237eaff7dc1e35fa791dae095574fd974e671 git-bisect good 90e23b13ab849e2a11f00c655eb3a2011b4623be git-bisect bad 833526a34eeefc117df3191a594c3c3a4f15a9ac git-bisect good 791b93d3dfaf16c23e978bec0cc0a3dd9d855d63 git-bisect bad 65767c64068f2c93e56a1accfed5c78230ac12d7 git-bisect bad 2abc5c05dd82c188e3bdf6641a274f013348d14b git-bisect bad 317e1f2597ffb4d4db940577bbe56dc6e881ef07 \| 317e1f2597ffb4d4db940577bbe56dc6e881ef07 is first bad commit \| commit 317e1f2597ffb4d4db940577bbe56dc6e881ef07 \| Author: Maciej W. Rozycki <macro@linux-mips.org> \| Date: Wed May 21 22:10:22 2008 +0100 \| x86: I/O APIC: clean up the 8259A on a NMI watchdog failure the problem is that in the dummy-lapic branch we rely on the i8259A but if the NMI watchdog fails we turn off IRQ 0 - which doesnt work too well ;-) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:13 +02:00
Cyrill Gorcunov	067fa0ff0c	x86: IO-APIC - use NMI_NONE instead of numeric constant Not sure but maybe it is better to use NMI_DISABLED, will take a look. But for now this patch is not change anything in logic so it will not hurt/broke the kernel. For most cases nmi_watchdog assignment is by one of NMI_* macro so I think there it make sense too. Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:12 +02:00
Ingo Molnar	b1b57ee135	x86 build fix: arch/x86/kernel/io_apic_64.c: In function 'check_timer': arch/x86/kernel/io_apic_64.c:1688: error: 'vector' undeclared (first use in this function) arch/x86/kernel/io_apic_64.c:1688: error: (Each undeclared identifier is reported only once arch/x86/kernel/io_apic_64.c:1688: error: for each function it appears in.)	2008-07-08 09:13:11 +02:00
Thomas Gleixner	431ee79db0	x86: apic_64.c fix sparse warnings about shadowed variables Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:10 +02:00
Thomas Gleixner	7223daf5e1	x86: make irq_cfg static Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:09 +02:00
Thomas Gleixner	0715650958	x86: move pci_routirq declaration to pci.h Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:08 +02:00
Maciej W. Rozycki	691874fa96	x86: I/O APIC: timer through 8259A second-chance Some systems incorrectly report the ExtINTA pin of the I/O APIC as the genuine target of the timer interrupt. Here is a change that copies timer pin information found to the other pin if one has been found only. This way both a direct and a through-8259A route is tested with the pin letting these problematic systems work well enough. If no timer pin information has been found for the I/O APIC, then local APIC variations are tried only, similarly to what is done without the change (except without the misleading messages). Obviously if we try the first-chance path without being told by the BIOS to do so, we should not complain either, so do not print the message in this case. The 64-bit variation should be updated with a call to replace_pin_at_irq() which can be done with the upcoming merge. Since add_pin_to_irq() is now always called in the first-chance path, the condition to require it in the second-chance path no longer happens. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:07 +02:00
Maciej W. Rozycki	03be750559	x86: I/O APIC: keep the timer IRQ masked during set-up Keep the timer interrupt line masked when reconfiguring its interrupt redirection entry in the I/O APIC. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:06 +02:00
Maciej W. Rozycki	24742ece8e	x86: I/O APIC: unmask the second-chance timer interrupt Unmask the timer interrupt line set up in the through-8259A mode explicitly after setup_timer_IRQ0_pin() has set up the I/O APIC interrupt redirection entry to let the two operations be unbound from each other. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:05 +02:00
Maciej W. Rozycki	f7633ce55b	x86: I/O APIC: rename setup_ExtINT_IRQ0_pin() Rename setup_ExtINT_IRQ0_pin() to setup_timer_IRQ0_pin() to better reflect the upcoming role of a function setting up a (semi-)arbitrary I/O APIC pin appropriately for the 8254 timer. By "appropriate" the following settings are meant: edge-triggered, active-high, all the other settings per-architecture. Adjust comments to reflect code appropriately. No functional changes. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:04 +02:00
Maciej W. Rozycki	6b4722a777	x86: I/O APIC: remove redundant LVT0 masking The LINT0 line of the local APIC is masked in the LVT0 entry in check_timer() before this function is ever called. Removed the redundant unmasking for better control. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:03 +02:00
Maciej W. Rozycki	80d16bace6	x86: I/O APIC: remove redundant 8259A {,un}masking For a better control the masking and unmasking of the timer interrupt line in the 8259A operating in the 'Virtual Wire' mode has been moved out of setup_ExtINT_IRQ0_pin() now, so remove the redundant calls from the function. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:02 +02:00
Maciej W. Rozycki	f08252623c	x86: I/O APIC: fix the name of the through-8259A handler When the through-8259A mode is used for the timer, the call to set_irq_handler() will register a NULL handler name, resulting in "IO-APIC-<NULL>" reported. Fix by calling ioapic_register_intr() as done for all the other I/O APIC interrupts. The 64-bit variation calls set_irq_chip_and_handler_name() here needlessly and should get fixed with the upcoming merge. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:01 +02:00
Maciej W. Rozycki	9a1c619291	x86: I/O APIC: fix the name of the L-APIC IRQ handler The local APIC interrupt handler gets registered with set_irq_chip_and_handler_name(), which results in "local-APIC-edge-fasteoi" reported as the name of the handler. Fix by removing the type of the handler left over from before the generic handlers were introduced. The 64-bit variation should get fixed with the upcoming merge. NB It should really use the "edge" handler and not the "fasteoi" one, but that's a separate issue. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:13:00 +02:00
Maciej W. Rozycki	35542c5ebc	x86: I/O APIC: clean up the 8259A on a NMI watchdog failure There is no point in keeping the 8259A enabled if the I/O APIC NMI watchdog has failed and the 8259A is not used to pass through regular timer interrupts. This fixes problems with some systems where some logic gets confused. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:12:59 +02:00
Maciej W. Rozycki	a1133d8e4f	x86: APIC/SMP: downgrade the NMI watchdog for "nosmp" If configured to use the I/O APIC, the NMI watchdog is deemed to fail if the chip has been deactivated as a result of "nosmp". Downgrade to the local APIC watchdog similarly to what is done for the UP case. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:12:58 +02:00
Maciej W. Rozycki	73d08e6360	x86: APIC/SMP: correct the message for "nosmp" The local APIC is no longer forced off when "nosmp" has been specified. Correct the message printed. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:12:57 +02:00
Maciej W. Rozycki	60134ebe79	x86: I/O APIC: keep IRQ off when changing LVT registers Disable the 8259A acting in the "virtual wire" mode to keep the interrupt line inactive while fiddling with local APIC interrupt vector registers associated with its destination inputs. To be on the safe side, especially concerning flipping the trigger mode. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:12:56 +02:00
Maciej W. Rozycki	e67465f129	x86: I/O APIC: clean up after a fasteoi failure Disable the 8259A when routing of the timer interrupt through the chip to the local APIC of the primary processor has failed. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:12:55 +02:00
Maciej W. Rozycki	ecd29476ae	x86: I/O APIC: remove parameters to fiddle with the 8259A Remove the "disable_8254_timer" and "enable_8254_timer" kernel parameters. Now that AEOI acknowledgements are no longer needed for correct timer operation, the 8259A can be kept disabled unconditionally unless interrupts, either timer or watchdog ones, are actually passed through it. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:12:54 +02:00
Maciej W. Rozycki	d11d5794e0	x86: I/O APIC: AEOI timer acknowledgement clean-ups The code that used to be in do_slow_gettimeoffset() that relied on the IRR bit of the master 8259A PIC for IRQ0 to check the state of the output timer 0 of the PIT is no longer there. As a result, there is no need to use the POLL command to acknowledge the timer interrupt in the "8259A Virtual Wire", except for the NMI watchdog when the i82489DX APIC is used (this is because this particular APIC treats NMIs as level-triggered and keeping the input asserted would keep motherboard NMI sources held off for too long). Remove the unneeded bits and adjust comments accordingly. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:12:53 +02:00
Ingo Molnar	a0176e2485	Revert "Revert "x86: fix ioapic bug again"" This reverts commit `0b6a39f7eb`. The changes in tip/x86/apic solve this better. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 09:12:49 +02:00
Jiri Slaby	684eb0163a	x86_64: use PAGE_OFFSET in dump_pagetables Use PAGE_OFFSET macro instead of using 0xffff810000000000UL directly. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: hpa@zytor.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-08 08:12:06 +02:00
Thomas Gleixner	6e92a5a615	x86: add sparse annotations to ioremap arch/x86/mm/ioremap.c:308:11: error: incompatible types in comparison expression (different address spaces) Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 08:12:05 +02:00

23901 Commits