linux/arch/x86
Lukasz Anaczkowski d81056b527 x86, ACPI: Handle apic/x2apic entries in MADT in correct order
ACPI specifies the following rules when listing APIC IDs:
(1) Boot processor is listed first
(2) For multi-threaded processors, BIOS should list the first logical
    processor of each of the individual multi-threaded processors in MADT
    before listing any of the second logical processors.
(3) APIC IDs < 0xFF should be listed in APIC subtable, APIC IDs >= 0xFF
    should be listed in X2APIC subtable

Because of above, when there's more than 0xFF logical CPUs, BIOS
interleaves APIC/X2APIC subtables.

Assuming, there's 72 cores, 72 hyper-threads each, 288 CPUs total,
listing is like this:

APIC (0,4,8, .., 252)
X2APIC (258,260,264, .. 284)
APIC (1,5,9,...,253)
X2APIC (259,261,265,...,285)
APIC (2,6,10,...,254)
X2APIC (260,262,266,..,286)
APIC (3,7,11,...,251)
X2APIC (255,261,262,266,..,287)

Now, before this patch, due to how ACPI MADT subtables were parsed (BSP
then X2APIC then APIC), kernel enumerated CPUs in reverted order (i.e.
high APIC IDs were getting low logical IDs, and low APIC IDs were
getting high logical IDs).
This is wrong for the following reasons:
() it's hard to predict how cores and threads are enumerated
() when it's hard to predict, s/w threads cannot be properly affinitized
   causing significant performance impact due to e.g. inproper cache
   sharing
() enumeration is inconsistent with how threads are enumerated on
   other Intel Xeon processors

So, order in which MADT APIC/X2APIC handlers are passed is
reverse and both handlers are passed to be called during same MADT
table to walk to achieve correct CPU enumeration.

In scenario when someone boots kernel with options 'maxcpus=72 nox2apic',
in result less cores may be booted, since some of the CPUs the kernel
will try to use will have APIC ID >= 0xFF. In such case, one
should not pass 'nox2apic'.

Disclimer: code parsing MADT APIC/X2APIC has not been touched since 2009,
when X2APIC support was initially added. I do not know why MADT parsing
code was added in the reversed order in the first place.
I guess it didn't matter at that time since nobody cared about cores
with APIC IDs >= 0xFF, right?

This patch is based on work of "Yinghai Lu <yinghai@kernel.org>"
previously published at https://lkml.org/lkml/2013/1/21/563

Here's the explanation why parsing interface needs to be changed
and why simpler approach will not work https://lkml.org/lkml/2015/9/7/285

Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de> (commit message)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-10-15 01:31:06 +02:00
..
boot lib/decompressors: use real out buf size for gunzip with kernel 2015-09-10 13:29:01 -07:00
configs Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2015-09-04 15:49:32 -07:00
crypto crypto: ghash-clmulni: specify context size for ghash async algorithm 2015-09-04 23:21:07 +08:00
entry sys_membarrier(): system-wide memory barrier (generic, x86) 2015-09-11 15:21:34 -07:00
ia32 x86/compat: Move copy_siginfo_*_user32() to signal_compat.c 2015-07-06 15:28:55 +02:00
include Merge branch 'akpm' (patches from Andrew) 2015-09-10 18:19:42 -07:00
kernel x86, ACPI: Handle apic/x2apic entries in MADT in correct order 2015-10-15 01:31:06 +02:00
kvm Merge branch 'akpm' (patches from Andrew) 2015-09-10 18:19:42 -07:00
lguest Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-09-01 15:20:51 -07:00
lib x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer 2015-08-22 14:52:16 +02:00
math-emu Merge branch 'x86/urgent' into x86/asm to fix up conflicts and to pick up fixes 2015-08-18 09:39:47 +02:00
mm mm, mpx: add "vm_flags_t vm_flags" arg to do_mmap_pgoff() 2015-09-10 13:29:01 -07:00
net bpf: Make the bpf_prog_array_map more generic 2015-08-09 22:50:05 -07:00
oprofile
pci Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-09-01 14:33:35 -07:00
platform kexec: split kexec_load syscall from kexec core code 2015-09-10 13:29:01 -07:00
power x86/ldt: Make modify_ldt synchronous 2015-07-31 10:23:23 +02:00
purgatory
ras x86/ras: Move AMD MCE injector to arch/x86/ras/ 2015-08-13 10:12:54 +02:00
realmode
tools
um x86/asm/tsc: Remove rdtsc_barrier() 2015-07-06 15:23:30 +02:00
video
xen xen: MFN/GFN/BFN terminology changes for 4.3-rc0 2015-09-10 16:21:11 -07:00
.gitignore
Kbuild x86/asm/entry, x86/vdso: Move the vDSO code to arch/x86/entry/vdso/ 2015-06-03 18:51:37 +02:00
Kconfig kexec: split kexec_load syscall from kexec core code 2015-09-10 13:29:01 -07:00
Kconfig.cpu
Kconfig.debug x86/entry/64, x86/nmi/64: Add CONFIG_DEBUG_ENTRY NMI testing code 2015-07-17 12:50:14 +02:00
Makefile Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-09-01 08:40:25 -07:00
Makefile_32.cpu
Makefile.um kbuild: use relative path more to include Makefile 2015-04-02 16:42:08 +02:00