Add SMP support for metag. This allows Linux to take control of multiple
hardware threads on a single Meta core, treating them as separate Linux
CPUs.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Add header files to implement Meta hardware thread locks (used by some
other atomic operations), atomics, spinlocks, and bitops.
There are 2 main types of atomic primitives for metag (in addition to
IRQs off on UP):
- LOCK instructions provide locking between hardware threads.
- LNKGET/LNKSET instructions provide load-linked/store-conditional
operations allowing for lighter weight atomics on Meta2
LOCK instructions allow for hardware threads to acquire voluntary or
exclusive hardware thread locks:
- LOCK0 releases exclusive and voluntary lock from the running hardware
thread.
- LOCK1 acquires the voluntary hardware lock, blocking until it becomes
available.
- LOCK2 implies LOCK1, and additionally acquires the exclusive hardware
lock, blocking all other hardware threads from executing.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Add metag system call and gateway page interfaces. The metag
architecture port uses the generic system call numbers from
asm-generic/unistd.h, as well as a user gateway page mapped at
0x6ffff000 which contains fast atomic primitives (depending on SMP) and
a fast method of accessing TLS data.
System calls use the SWITCH instruction with the immediate 0x440001 to
signal a system call.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Meta core internal interrupts (from HWSTATMETA and friends) are vectored
onto the TR1 core trigger for the current thread. This is demultiplexed
in irq-metag.c to individual Linux IRQs for each internal interrupt.
External SoC interrupts (from HWSTATEXT and friends) are vectored onto
the TR2 core trigger for the current thread. This is demultiplexed in
irq-metag-ext.c to individual Linux IRQs for each external SoC interrupt.
The external irqchip has devicetree bindings for configuring the number
of irq banks and the type of masking available.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Rob Landley <rob@landley.net>
Cc: Dom Cobley <popcornmix@gmail.com>
Cc: Simon Arlott <simon@fire.lp0.eu>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: devicetree-discuss@lists.ozlabs.org
Cc: linux-doc@vger.kernel.org
Add trap code for metag. At the lowest level Meta traps (and return from
interrupt instruction - RTI) simply swap the PC and PCX registers and
optionally toggle the interrupt status bit (ISTAT). Low level TBX code
in tbipcx.S handles the core context save, determine the TBX signal
number based on the core trigger that fired (using the TXSTATI status
register), and call TBX signal handlers (mostly in traps.c) via a vector
table.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Add time keeping code for metag. Meta hardware threads have 2 timers.
The background timer (TXTIMER) is used as a free-running time base, and
the interrupt timer (TXTIMERI) is used for the timer interrupt. Both
counters traditionally count at approximately 1MHz.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
The ptrace interface for metag provides access to some core register
sets using the PTRACE_GETREGSET and PTRACE_SETREGSET operations. The
details of the internal context structures is abstracted into user API
structures to both ease use and allow flexibility to change the internal
context layouts. Copyin and copyout functions for these register sets
are exposed to allow signal handling code to use them to copy to and
from the signal context.
struct user_gp_regs (NT_PRSTATUS) provides access to the core general
purpose register context.
struct user_cb_regs (NT_METAG_CBUF) provides access to the TXCATCH*
registers which contains information abuot a memory fault, unaligned
access error or watchpoint. This can be modified to alter the way the
fault is replayed on resume ("catch replay"), or to prevent the replay
taking place.
struct user_rp_state (NT_METAG_RPIPE) provides access to the state of
the Meta read pipeline which can be used to hide memory latencies in
hand optimised data loops.
Extended DSP register state, DSP RAM, and hardware breakpoint registers
aren't yet exposed through ptrace.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Tony Lindgren <tony@atomide.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Meta has instructions for accessing:
- bytes - GETB (1 byte)
- words - GETW (2 bytes)
- doublewords - GETD (4 bytes)
- longwords - GETL (8 bytes)
All accesses must be aligned. Unaligned accesses can be detected and
made to fault on Meta2, however it isn't possible to fix up unaligned
writes so we don't bother fixing up reads either.
This patch adds metag memory handling code including:
- I/O memory (io.h, ioremap.c): Actually any virtual memory can be
accessed with these helpers. A part of the non-MMUable address space
is used for memory mapped I/O. The ioremap() function is implemented
one to one for non-MMUable addresses.
- User memory (uaccess.h, usercopy.c): User memory is directly
accessible from privileged code.
- Kernel memory (maccess.c): probe_kernel_write() needs to be
overwridden to use the I/O functions when doing a simple aligned
write to non-writecombined memory, otherwise the write may be split
by the generic version.
Note that due to the fact that a portion of the virtual address space is
non-MMUable, and therefore always maps directly to the physical address
space, metag specific I/O functions are made available (metag_in32,
metag_out32 etc). These cast the address argument to a pointer so that
they can be used with raw physical addresses. These accessors are only
to be used for accessing fixed core Meta architecture registers in the
non-MMU region, and not for any SoC/peripheral registers.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Add memory management files for metag.
Meta's 32bit virtual address space is split into two halves:
- local (0x08000000-0x7fffffff): traditionally local to a hardware
thread and incoherent between hardware threads. Each hardware thread
has it's own local MMU table. On Meta2 the local space can be
globally coherent (GCOn) if the cache partitions coincide.
- global (0x88000000-0xffff0000): coherent and traditionally global
between hardware threads. On Meta2, each hardware thread has it's own
global MMU table.
The low 128MiB of each half is non-MMUable and maps directly to the
physical address space:
- 0x00010000-0x07ffffff: contains Meta core registers and maps SoC bus
- 0x80000000-0x87ffffff: contains low latency global core memories
Linux usually further splits the local virtual address space like this:
- 0x08000000-0x3fffffff: user mappings
- 0x40000000-0x7fffffff: kernel mappings
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Add cache and TLB handling code for metag, including the required
callbacks used by MM switches and DMA operations. Caches can be
partitioned between the hardware threads and the global space, however
this is usually configured by the bootloader so Linux doesn't make any
changes to this configuration. TLBs aren't configurable, so only need
consideration to flush them.
On Meta1 the L1 cache was VIVT which required a full flush on MM switch.
Meta2 has a VIPT L1 cache so it doesn't require the full flush on MM
switch. Meta2 can also have a writeback L2 with hardware prefetch which
requires some special handling. Support is optional, and the L2 can be
detected and initialised by Linux.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Add source files from the Thread Binary Interface (TBI) library which
provides useful low level operations and traps/context management.
Among other things it handles interrupt/exception/syscall entry (in
tbipcx.S).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Add the main header for the Thread Binary Interface (TBI) library which
provides useful low level operations and trap/context management.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Add boot code for metag. Due to the multi-threaded nature of Meta it is
not uncommon for an RTOS or bare metal application to be started on
other hardware threads by the bootloader. Since there is a single MMU
switch which affects all threads, the MMU is traditionally configured by
the bootloader prior to starting Linux. The bootloader passes a
structure to Linux which among other things contains information about
memory regions which have been mapped. Linux then assumes control of the
local heap memory region.
A kernel arguments string pointer or a flattened device tree pointer can
be provided in the third argument.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Add the header <asm/metag_mem.h> describing addresses, fields, and bits
of various core memory mapped registers in the low non-MMU region.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Add a couple of header files containing core architecture constants.
The first (<asm/metag_isa.h>) contains some constants relating to the
instruction set, such as values to give to the CACHEW and CACHER
instructions.
The second (<asm/metag_regs.h>) contains constants for the core register
units directly accessible to various instructions, and for the
registers, fields, and bits in those units. The main units described are
the control unit (CT.*), the trigger unit (TR.*), and the run-time trace
unit (TT.*).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
On 64 bit architectures with no efficient unaligned access, padding and
explicit alignment must be added in various places to prevent unaligned
64bit accesses (such as taskstats and trace ring buffer).
However this also needs to apply to 32 bit architectures with 64 bit
accesses requiring alignment such as metag.
This is solved by adding a new Kconfig symbol HAVE_64BIT_ALIGNED_ACCESS
which defaults to 64BIT && !HAVE_EFFICIENT_UNALIGNED_ACCESS, and can be
explicitly selected by METAG and any other relevant architectures. This
can be used in various places to determine whether 64bit alignment is
required.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: Will Drewry <wad@chromium.org>
with older hypervisor stacks, such as Xen 4.1.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQEcBAABAgAGBQJRHZ7eAAoJEFjIrFwIi8fJZ+sH/ieMkzdBB6aqbFMcNr7mkfBo
i3swjO2JQI7REYIHfKEVoR3IgHfqKEuABdeEQrceE0XqDepFh84YiKGI2QpPRWEA
903vUV4DXVdcBrypbL45tSFZ1Jxsrzx+F7WfV/f9WHyeiwOyaZTGVQH0VuOzpcum
RvPTT7MmC7g8MJDi66SDYBaX/pBQzifQ81nMWWjXNw0w4CwWX7le1cScZEP42MR6
jTEHzYMLDojdO+2aQM5pt/0CGI5tzBHtX5nNRl6tovlPI3ckknYYx6a7RfxkfZzF
IkMIuGS32yLfsswPPIiMs47/Qgiq3BN6eSTJXMZKUwQokL9yEs8LodcnRDYfgyQ=
=fqcJ
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.8-rc7-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull xen fixes from Konrad Rzeszutek Wilk:
"Two fixes:
- A simple bug-fix for redundant NULL check.
- CVE-2013-0228/XSA-42: x86/xen: don't assume %ds is usable in
xen_iret for 32-bit PVOPS
and two reverts:
- Revert the PVonHVM kexec. The patch introduces a regression with
older hypervisor stacks, such as Xen 4.1."
* tag 'stable/for-linus-3.8-rc7-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
Revert "xen PVonHVM: use E820_Reserved area for shared_info"
Revert "xen/PVonHVM: fix compile warning in init_hvm_pv_info"
xen: remove redundant NULL check before unregister_and_remove_pcpu().
x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS.
Pull sparc fixes from David Miller:
"A couple small fixes for sparc including some THP brown-paper-bag
material:
1) During the merging of all the THP support for various
architectures, sparc missed adding a
HAVE_ARCH_TRANSPARENT_HUGEPAGE to it's Kconfig, oops.
2) Sparc needs to be mindful of hugepages in get_user_pages_fast().
3) Fix memory leak in SBUS probe, from Cong Ding.
4) The sunvdc virtual disk client driver has a test of the bitmask of
vdisk server supported operations which was off by one bit"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sunvdc: Fix off-by-one in generic_request().
sparc64: Fix get_user_pages_fast() wrt. THP.
sparc64: Add missing HAVE_ARCH_TRANSPARENT_HUGEPAGE.
sparc: kernel/sbus.c: fix memory leakage
This reverts commit 9d02b43dee.
We are doing this b/c on 32-bit PVonHVM with older hypervisors
(Xen 4.1) it ends up bothing up the start_info. This is bad b/c
we use it for the time keeping, and the timekeeping code loops
forever - as the version field never changes. Olaf says to
revert it, so lets do that.
Acked-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
There was a serious problem in samsung-laptop that its platform driver is
designed to run under BIOS and running under EFI can cause the machine to
become bricked or can cause Machine Check Exceptions.
Discussion about this problem:
https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557https://bugzilla.kernel.org/show_bug.cgi?id=47121
The patches to fix this problem:
efi: Make 'efi_enabled' a function to query EFI facilities
83e6818974
samsung-laptop: Disable on EFI hardware
e0094244e4
Unfortunately this problem comes back again if users specify "noefi" option.
This parameter clears EFI_BOOT and that driver continues to run even if running
under EFI. Refer to the document, this parameter should clear
EFI_RUNTIME_SERVICES instead.
Documentation/kernel-parameters.txt:
===============================================================================
...
noefi [X86] Disable EFI runtime services support.
...
===============================================================================
Documentation/x86/x86_64/uefi.txt:
===============================================================================
...
- If some or all EFI runtime services don't work, you can try following
kernel command line parameters to turn off some or all EFI runtime
services.
noefi turn off all EFI runtime services
...
===============================================================================
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Link: http://lkml.kernel.org/r/511C2C04.2070108@jp.fujitsu.com
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This fixes CVE-2013-0228 / XSA-42
Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user
in 32bit PV guest can use to crash the > guest with the panic like this:
-------------
general protection fault: 0000 [#1] SMP
last sysfs file: /sys/devices/vbd-51712/block/xvda/dev
Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4
mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last
unloaded: scsi_wait_scan]
Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1
EIP: 0061:[<c0407462>] EFLAGS: 00010086 CPU: 0
EIP is at xen_iret+0x12/0x2b
EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010
ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0
DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069
Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000)
Stack:
00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000
Call Trace:
Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00
8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40
10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02
EIP: [<c0407462>] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0
general protection fault: 0000 [#2]
---[ end trace ab0d29a492dcd330 ]---
Kernel panic - not syncing: Fatal exception
Pid: 1250, comm: r Tainted: G D ---------------
2.6.32-356.el6.i686 #1
Call Trace:
[<c08476df>] ? panic+0x6e/0x122
[<c084b63c>] ? oops_end+0xbc/0xd0
[<c084b260>] ? do_general_protection+0x0/0x210
[<c084a9b7>] ? error_code+0x73/
-------------
Petr says: "
I've analysed the bug and I think that xen_iret() cannot cope with
mangled DS, in this case zeroed out (null selector/descriptor) by either
xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT
entry was invalidated by the reproducer. "
Jan took a look at the preliminary patch and came up a fix that solves
this problem:
"This code gets called after all registers other than those handled by
IRET got already restored, hence a null selector in %ds or a non-null
one that got loaded from a code or read-only data descriptor would
cause a kernel mode fault (with the potential of crashing the kernel
as a whole, if panic_on_oops is set)."
The way to fix this is to realize that the we can only relay on the
registers that IRET restores. The two that are guaranteed are the
%cs and %ss as they are always fixed GDT selectors. Also they are
inaccessible from user mode - so they cannot be altered. This is
the approach taken in this patch.
Another alternative option suggested by Jan would be to relay on
the subtle realization that using the %ebp or %esp relative references uses
the %ss segment. In which case we could switch from using %eax to %ebp and
would not need the %ss over-rides. That would also require one extra
instruction to compensate for the one place where the register is used
as scaled index. However Andrew pointed out that is too subtle and if
further work was to be done in this code-path it could escape folks attention
and lead to accidents.
Reviewed-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Petr Matousek <pmatouse@redhat.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Mostly mirrors the s390 logic, as unlike x86 we don't need the
SetPageReferenced() bits.
On sparc64 we also lack a user/privileged bit in the huge PMDs.
In order to make this work for THP and non-THP builds, some header
file adjustments were necessary. Namely, provide the PMD_HUGE_* bit
defines and the pmd_large() inline unconditionally rather than
protected by TRANSPARENT_HUGEPAGE.
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This got missed in the cleanups done for the S390 THP
support.
CC: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull x86 fixes from Peter Anvin:
"One (hopefully) last batch of x86 fixes. You asked for the patch by
patch justifications, so here they are:
x86, MCE: Retract most UAPI exports
This one unexports from userspace a bunch of definitions which should
never have been exported. We really don't want to create an
accidental legacy here.
x86, doc: Add a bootloader ID for OVMF
This is a documentation-only patch, just recording the official
assignment of a boot loader ID.
x86: Do not leak kernel page mapping locations
Security: avoid making it needlessly easy for user space to probe the
kernel memory layout.
x86/mm: Check if PUD is large when validating a kernel address
Prevent failures using /proc/kcore when using 1G pages.
x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systems
Works around a BIOS problem causing boot failures on affected hardware."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Check if PUD is large when validating a kernel address
x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systems
x86, doc: Add a bootloader ID for OVMF
x86: Do not leak kernel page mapping locations
x86, MCE: Retract most UAPI exports
A user reported the following oops when a backup process reads
/proc/kcore:
BUG: unable to handle kernel paging request at ffffbb00ff33b000
IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
[...]
Call Trace:
[<ffffffff811b8aaa>] read_kcore+0x17a/0x370
[<ffffffff811ad847>] proc_reg_read+0x77/0xc0
[<ffffffff81151687>] vfs_read+0xc7/0x130
[<ffffffff811517f3>] sys_read+0x53/0xa0
[<ffffffff81449692>] system_call_fastpath+0x16/0x1b
Investigation determined that the bug triggered when reading
system RAM at the 4G mark. On this system, that was the first
address using 1G pages for the virt->phys direct mapping so the
PUD is pointing to a physical address, not a PMD page.
The problem is that the page table walker in kern_addr_valid() is
not checking pud_large() and treats the physical address as if
it was a PMD. If it happens to look like pmd_none then it'll
silently fail, probably returning zeros instead of real data. If
the data happens to look like a present PMD though, it will be
walked resulting in the oops above.
This patch adds the necessary pud_large() check.
Unfortunately the problem was not readily reproducible and now
they are running the backup program without accessing
/proc/kcore so the patch has not been validated but I think it
makes sense.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.coM>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: stable@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull s390 regression fix from Martin Schwidefsky:
"The recent fix for the s390 sched_clock() function uncovered yet
another bug in s390_next_ktime which causes an endless loop in KVM.
This regression should be fixed before v3.8.
I keep the fingers crossed that this is the last one for v3.8."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/timer: avoid overflow when programming clock comparator
Pull m68knommu fix from Greg Ungerer:
"This contains a single critical fix for the non-MMU m68k platforms.
The change of the kernel exec code path has revealed a problem in the
start thread code that causes crashing on boot. This is the fix for
it."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68knommu: fix trap on execing /bin/init
Pull tile bugfixes from Chris Metcalf:
"This includes a variety of minor bug fixes, mostly to do with testing
"make allyesconfig", "make allmodconfig", "make allnoconfig", inspired
to Tejun Heo's observation about Kconfig.freezer not being included.
The largest changes are just syntax changes removing the tile-specific
use of a macro named INT_MASK, which is way too commonly redefined
throughout driver code"
* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: tag some code with #ifdef CONFIG_COMPAT
tile: fix memcpy_*io functions for allnoconfig
tile: export a handful of symbols appropriately
drm: fix compile failure by including <linux/swiotlb.h>
tile: avoid defining INT_MASK macro in <arch/interrupts.h>
tile: provide "screen_info" when enabling VT
drivers/input/joystick/analog.c: enable precise timer
tile: include kernel/Kconfig.freezer in tile Kconfig
tile: remove an unused variable in copy_thread()
When a HP ProLiant DL980 G7 Server boots a regular kernel,
there will be intermittent lost interrupts which could
result in a hang or (in extreme cases) data loss.
The reason is that this system only supports x2apic physical
mode, while the kernel boots with a logical-cluster default
setting.
This bug can be worked around by specifying the "x2apic_phys" or
"nox2apic" boot option, but we want to handle this system
without requiring manual workarounds.
The BIOS sets ACPI_FADT_APIC_PHYSICAL in FADT table.
As all apicids are smaller than 255, BIOS need to pass the
control to the OS with xapic mode, according to x2apic-spec,
chapter 2.9.
Current code handle x2apic when BIOS pass with xapic mode
enabled:
When user specifies x2apic_phys, or FADT indicates PHYSICAL:
1. During madt oem check, apic driver is set with xapic logical
or xapic phys driver at first.
2. enable_IR_x2apic() will enable x2apic_mode.
3. if user specifies x2apic_phys on the boot line, x2apic_phys_probe()
will install the correct x2apic phys driver and use x2apic phys mode.
Otherwise it will skip the driver will let x2apic_cluster_probe to
take over to install x2apic cluster driver (wrong one) even though FADT
indicates PHYSICAL, because x2apic_phys_probe does not check
FADT PHYSICAL.
Add checking x2apic_fadt_phys in x2apic_phys_probe() to fix the
problem.
Signed-off-by: Stoney Wang <song-bo.wang@hp.com>
[ updated the changelog and simplified the code ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1360263182-16226-1-git-send-email-yinghai@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-Compile fix for !SMP
-More cpu cluster id related fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJRDSkpAAoJEMhvYp4jgsXiSWYH/jy92oAHpKoNBZQ0wzOVXoa7
Lo9vkPaTexXY1SLNohrTiGDxWwm7LNfi6Lag3OtzyzAblcUAqd6ISyx2Nh7xN3Uu
L+3Y7CIBw9CqCAsrjOunMmKcawo/aiscAlhqOoYsmKGcbopQHEomui+yfDIX6NY+
v59xDXp1IyJHqDc9M3t2VJijfeCHnREIddt33gUeEcSfL7nyuvGf2DXuarAGN5N7
bHd7NxqZS2TNGxTbxuNZIlYuVUgqcVV16LLNlTGrBb27iiPJuIg3uHa6GVVO4n0U
czd388xl5Rca8LeD7rAvWPiHX2rIfyEyvWZO4KvRBC9H6+VgGvlyKtnoUPyNhn8=
=mUYd
-----END PGP SIGNATURE-----
Merge tag 'highbank-fixes-for-3.8' of git://sources.calxeda.com/kernel/linux into fixes
From Rob Herring:
highbank fixes for 3.8
-Compile fix for !SMP
-More cpu cluster id related fixes
* tag 'highbank-fixes-for-3.8' of git://sources.calxeda.com/kernel/linux:
ARM: highbank: mask cluster id from cpu_logical_map
ARM: scu: mask cluster id from cpu_logical_map
ARM: scu: add empty scu_enable for !CONFIG_SMP
Pull ARM fixes from Russell King:
"I was going to hold these off until v3.8 was out, and send them with a
stable tag, but as everyone else is pushing much bigger fixes which
Linus is accepting, let's save people from the hastle of having to
patch v3.8 back into working or use a stable kernel.
Looking at the diffstat, this really is high value for its size; this
is miniscule compared to how the -rc6 to tip diffstat currently looks.
So, four patches in this set:
- Punit Agrawal reports that the kernel no longer boots on MPCore due
to a new assumption made in the GIC code which isn't true of
earlier GIC designs. This is the biggest change in this set.
- Punit's boot log also revealed a bunch of WARN_ON() dumps caused by
the DT-ification of the GIC support without fixing up non-DT
Realview - which now sees a greater number of interrupts than it
did before.
- A fix for the DMA coherent code from Marek which uses the wrong
check for atomic allocations; this can result in spinlock lockups
or other nasty effects.
- A fix from Will, which will affect all Android based platforms if
not applied (which use the 2G:2G VM split) - this causes
particularly 'make' to misbehave unless this bug is fixed."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7641/1: memory: fix broken mmap by ensuring TASK_UNMAPPED_BASE is aligned
ARM: DMA mapping: fix bad atomic test
ARM: realview: ensure that we have sufficient IRQs available
ARM: GIC: fix GIC cpumask initialization
On tilepro without CONFIG_PCI, we can't provide inlines of these
functions, as we don't have readl/writel.
In addition, fix memset_io() signature to take a volatile void *.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This was shown up by running with "allmodconfig". I used
EXPORT_SYMBOL() to match existing conventions in files that
were already exporting symbols, or that were exported that way
by other architectures, and otherwise EXPORT_SYMBOL_GPL().
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
We have received multiple reports of mmap failures when running with a
2:2 vm split. These manifest as either -EINVAL with a non page-aligned
address (ending 0xaaa) or a SEGV, depending on the application. The
issue is commonly observed in children of make, which appears to use
bottom-up mmap (assumedly because it changes the stack rlimit).
Further investigation reveals that this regression was triggered by
394ef6403a ("mm: use vm_unmapped_area() on arm architecture"), whereby
TASK_UNMAPPED_BASE is no longer page-aligned for bottom-up mmap, causing
get_unmapped_area to choke on misaligned addressed.
This patch fixes the problem by defining TASK_UNMAPPED_BASE in terms of
TASK_SIZE and explicitly aligns the result to 16M, matching the other
end of the heap.
Acked-by: Nicolas Pitre <nico@linaro.org>
Reported-by: Steve Capper <steve.capper@arm.com>
Reported-by: Jean-Francois Moine <moinejf@free.fr>
Reported-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Realview fails to boot with this warning:
BUG: spinlock lockup suspected on CPU#0, init/1
lock: 0xcf8bde10, .magic: dead4ead, .owner: init/1, .owner_cpu: 0
Backtrace:
[<c00185d8>] (dump_backtrace+0x0/0x10c) from [<c03294e8>] (dump_stack+0x18/0x1c) r6:cf8bde10 r5:cf83d1c0 r4:cf8bde10 r3:cf83d1c0
[<c03294d0>] (dump_stack+0x0/0x1c) from [<c018926c>] (spin_dump+0x84/0x98)
[<c01891e8>] (spin_dump+0x0/0x98) from [<c0189460>] (do_raw_spin_lock+0x100/0x198)
[<c0189360>] (do_raw_spin_lock+0x0/0x198) from [<c032cbac>] (_raw_spin_lock+0x3c/0x44)
[<c032cb70>] (_raw_spin_lock+0x0/0x44) from [<c01c9224>] (pl011_console_write+0xe8/0x11c)
[<c01c913c>] (pl011_console_write+0x0/0x11c) from [<c002aea8>] (call_console_drivers.clone.7+0xdc/0x104)
[<c002adcc>] (call_console_drivers.clone.7+0x0/0x104) from [<c002b320>] (console_unlock+0x2e8/0x454)
[<c002b038>] (console_unlock+0x0/0x454) from [<c002b8b4>] (vprintk_emit+0x2d8/0x594)
[<c002b5dc>] (vprintk_emit+0x0/0x594) from [<c0329718>] (printk+0x3c/0x44)
[<c03296dc>] (printk+0x0/0x44) from [<c002929c>] (warn_slowpath_common+0x28/0x6c)
[<c0029274>] (warn_slowpath_common+0x0/0x6c) from [<c0029304>] (warn_slowpath_null+0x24/0x2c)
[<c00292e0>] (warn_slowpath_null+0x0/0x2c) from [<c0070ab0>] (lockdep_trace_alloc+0xd8/0xf0)
[<c00709d8>] (lockdep_trace_alloc+0x0/0xf0) from [<c00c0850>] (kmem_cache_alloc+0x24/0x11c)
[<c00c082c>] (kmem_cache_alloc+0x0/0x11c) from [<c00bb044>] (__get_vm_area_node.clone.24+0x7c/0x16c)
[<c00bafc8>] (__get_vm_area_node.clone.24+0x0/0x16c) from [<c00bb7b8>] (get_vm_area_caller+0x48/0x54)
[<c00bb770>] (get_vm_area_caller+0x0/0x54) from [<c0020064>] (__alloc_remap_buffer.clone.15+0x38/0xb8)
[<c002002c>] (__alloc_remap_buffer.clone.15+0x0/0xb8) from [<c0020244>] (__dma_alloc+0x160/0x2c8)
[<c00200e4>] (__dma_alloc+0x0/0x2c8) from [<c00204d8>] (arm_dma_alloc+0x88/0xa0)[<c0020450>] (arm_dma_alloc+0x0/0xa0) from [<c00beb00>] (dma_pool_alloc+0xcc/0x1a8)
[<c00bea34>] (dma_pool_alloc+0x0/0x1a8) from [<c01a9d14>] (pl08x_fill_llis_for_desc+0x28/0x568)
[<c01a9cec>] (pl08x_fill_llis_for_desc+0x0/0x568) from [<c01aab8c>] (pl08x_prep_slave_sg+0x258/0x3b0)
[<c01aa934>] (pl08x_prep_slave_sg+0x0/0x3b0) from [<c01c9f74>] (pl011_dma_tx_refill+0x140/0x288)
[<c01c9e34>] (pl011_dma_tx_refill+0x0/0x288) from [<c01ca748>] (pl011_start_tx+0xe4/0x120)
[<c01ca664>] (pl011_start_tx+0x0/0x120) from [<c01c54a4>] (__uart_start+0x48/0x4c)
[<c01c545c>] (__uart_start+0x0/0x4c) from [<c01c632c>] (uart_start+0x2c/0x3c)
[<c01c6300>] (uart_start+0x0/0x3c) from [<c01c795c>] (uart_write+0xcc/0xf4)
[<c01c7890>] (uart_write+0x0/0xf4) from [<c01b0384>] (n_tty_write+0x1c0/0x3e4)
[<c01b01c4>] (n_tty_write+0x0/0x3e4) from [<c01acfe8>] (tty_write+0x144/0x240)
[<c01acea4>] (tty_write+0x0/0x240) from [<c01ad17c>] (redirected_tty_write+0x98/0xac)
[<c01ad0e4>] (redirected_tty_write+0x0/0xac) from [<c00c371c>] (vfs_write+0xbc/0x150)
[<c00c3660>] (vfs_write+0x0/0x150) from [<c00c39c0>] (sys_write+0x4c/0x78)
[<c00c3974>] (sys_write+0x0/0x78) from [<c0014460>] (ret_fast_syscall+0x0/0x3c)
This happens because the DMA allocation code is not respecting atomic
allocations correctly.
GFP flags should not be tested for GFP_ATOMIC to determine if an
atomic allocation is being requested. GFP_ATOMIC is not a flag but
a value. The GFP bitmask flags are all prefixed with __GFP_.
The rest of the kernel tests for __GFP_WAIT not being set to indicate
an atomic allocation. We need to do the same.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Realview EB with a rev B MPcore tile results in lots of warnings at
boot because it can't allocate enough IRQs. Fix this by increasing
the number of available IRQs.
WARNING: at /home/rmk/git/linux-rmk/arch/arm/common/gic.c:757 gic_init_bases+0x12c/0x2ec()
Cannot allocate irq_descs @ IRQ96, assuming pre-allocated
Modules linked in:
Backtrace:
[<c00185d8>] (dump_backtrace+0x0/0x10c) from [<c03294e8>] (dump_stack+0x18/0x1c) r6:000002f5 r5:c042c62c r4:c044ff40 r3:c045f240
[<c03294d0>] (dump_stack+0x0/0x1c) from [<c00292c8>] (warn_slowpath_common+0x54/0x6c)
[<c0029274>] (warn_slowpath_common+0x0/0x6c) from [<c0029384>] (warn_slowpath_fmt+0x38/0x40)
[<c002934c>] (warn_slowpath_fmt+0x0/0x40) from [<c042c62c>] (gic_init_bases+0x12c/0x2ec)
[<c042c500>] (gic_init_bases+0x0/0x2ec) from [<c042cdc8>] (gic_init_irq+0x8c/0xd8)
[<c042cd3c>] (gic_init_irq+0x0/0xd8) from [<c042827c>] (init_IRQ+0x1c/0x24)
[<c0428260>] (init_IRQ+0x0/0x24) from [<c04256c8>] (start_kernel+0x1a4/0x300)
[<c0425524>] (start_kernel+0x0/0x300) from [<70008070>] (0x70008070)
---[ end trace 1b75b31a2719ed1c ]---
------------[ cut here ]------------
WARNING: at /home/rmk/git/linux-rmk/kernel/irq/irqdomain.c:234 irq_domain_add_legacy+0x80/0x140()
Modules linked in:
Backtrace:
[<c00185d8>] (dump_backtrace+0x0/0x10c) from [<c03294e8>] (dump_stack+0x18/0x1c) r6:000000ea r5:c0081a38 r4:00000000 r3:c045f240
[<c03294d0>] (dump_stack+0x0/0x1c) from [<c00292c8>] (warn_slowpath_common+0x54/0x6c)
[<c0029274>] (warn_slowpath_common+0x0/0x6c) from [<c0029304>] (warn_slowpath_null+0x24/0x2c)
[<c00292e0>] (warn_slowpath_null+0x0/0x2c) from [<c0081a38>] (irq_domain_add_legacy+0x80/0x140)
[<c00819b8>] (irq_domain_add_legacy+0x0/0x140) from [<c042c64c>] (gic_init_bases+0x14c/0x2ec)
[<c042c500>] (gic_init_bases+0x0/0x2ec) from [<c042cdc8>] (gic_init_irq+0x8c/0xd8)
[<c042cd3c>] (gic_init_irq+0x0/0xd8) from [<c042827c>] (init_IRQ+0x1c/0x24)
[<c0428260>] (init_IRQ+0x0/0x24) from [<c04256c8>] (start_kernel+0x1a4/0x300)
[<c0425524>] (start_kernel+0x0/0x300) from [<70008070>] (0x70008070)
---[ end trace 1b75b31a2719ed1d ]---
------------[ cut here ]------------
WARNING: at /home/rmk/git/linux-rmk/arch/arm/common/gic.c:762 gic_init_bases+0x170/0x2ec()
Modules linked in:
Backtrace:
[<c00185d8>] (dump_backtrace+0x0/0x10c) from [<c03294e8>] (dump_stack+0x18/0x1c) r6:000002fa r5:c042c670 r4:00000000 r3:c045f240
[<c03294d0>] (dump_stack+0x0/0x1c) from [<c00292c8>] (warn_slowpath_common+0x54/0x6c)
[<c0029274>] (warn_slowpath_common+0x0/0x6c) from [<c0029304>] (warn_slowpath_null+0x24/0x2c)
[<c00292e0>] (warn_slowpath_null+0x0/0x2c) from [<c042c670>] (gic_init_bases+0x170/0x2ec)
[<c042c500>] (gic_init_bases+0x0/0x2ec) from [<c042cdc8>] (gic_init_irq+0x8c/0xd8)
[<c042cd3c>] (gic_init_irq+0x0/0xd8) from [<c042827c>] (init_IRQ+0x1c/0x24)
[<c0428260>] (init_IRQ+0x0/0x24) from [<c04256c8>] (start_kernel+0x1a4/0x300)
[<c0425524>] (start_kernel+0x0/0x300) from [<70008070>] (0x70008070)
---[ end trace 1b75b31a2719ed1e ]---
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>