Problem: An application violating the architectural rules regarding
operation dependencies and having specific Register Stack Engine (RSE)
state at the time of the violation, may result in an illegal operation
fault and invalid RSE state. Such faults may initiate a cascade of
repeated illegal operation faults within OS interruption handlers.
The specific behavior is OS dependent.
Implication: An application causing an illegal operation fault with
specific RSE state may result in a series of illegal operation faults
and an eventual OS stack overflow condition.
Workaround: OS interruption handlers that switch to kernel backing
store implement a check for invalid RSE state to avoid the series
of illegal operation faults.
The core of the workaround is the RSE_WORKAROUND code sequence
inserted into each invocation of the SAVE_MIN_WITH_COVER and
SAVE_MIN_WITH_COVER_R19 macros. This sequence includes hard-coded
constants that depend on the number of stacked physical registers
being 96. The rest of this patch consists of code to disable this
workaround should this not be the case (with the presumption that
if a future Itanium processor increases the number of registers, it
would also remove the need for this patch).
Move the start of the RBS up to a mod32 boundary to avoid some
corner cases.
The dispatch_illegal_op_fault code outgrew the spot it was
squatting in when built with this patch and CONFIG_VIRT_CPU_ACCOUNTING=y
Move it out to the end of the ivt.
Signed-off-by: Tony Luck <tony.luck@intel.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
[PATCH] return to old errno choice in mkdir() et.al.
[Patch] fs/binfmt_elf.c: fix wrong return values
[PATCH] get rid of leak in compat_execve()
[Patch] fs/binfmt_elf.c: fix a wrong free
[PATCH] avoid multiplication overflows and signedness issues for max_fds
[PATCH] dup_fd() part 4 - race fix
[PATCH] dup_fd() - part 3
[PATCH] dup_fd() part 2
[PATCH] dup_fd() fixes, part 1
[PATCH] take init_files to fs/file.c
The GVMM module is position independent since it is relocated to the guest
address space.
Commit ea696f9cf ("ia64 kvm fixes for O=... builds") broke this by linking
GVMM with non-PIC objects.
Fix by creating two files: memset.S and memcpy.S which just include the files
under arch/ia64/lib/{memset.S, memcpy.S} respectively.
[akpm: don't delete files which we need]
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
The patch aims to fix a performance issue for the syscall
personality(PER_LINUX32).
On IA-64 box, the syscall personality (PER_LINUX32) has poor performance
because it failed to find the Linux/x86 execution domain. Then it tried
to load the kernel module however it failed always and it used the default
execution domain PER_LINUX instead. Requesting kernel modules is very
expensive. It caused the performance issue. (see the function
lookup_exec_domain in kernel/exec_domain.c).
To resolve the issue, execution domain Linux/x86 is always registered in
initialization time for IA-64 architecture.
Signed-off-by: Xiaolan Huang <xiaolan.huang@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
acpi_unregister_gsi() should "undo" what acpi_register_gsi() does.
On systems that have legacy interrupts, acpi_unregister_gsi erroneously calls
iosapci_unregister_intr() which is wrong to do and causes a loud warning.
acpi_unregister_gsi() should just return in these cases.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
There is only palinfo_handle_smp as (indirect) user of palinfo_smp_call (by
way of smp_call_function_single) and surely palinfo_handle_smp never pass
NULL as parameter for info.
Signed-off-by: Simon Holm Thøgersen <odie@cs.aau.dk>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Fix a typo, and coding style cleanups for pfm_handle_work().
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch does:
- make comment at next to resched check more robust
- move "re-check" comments to next to where change predicate regs
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
[Bug-fix for "[BUG?][2.6.25-mm1] sleeping during IRQ disabled"]
This patch does:
- enable interrupts before calling schedule() as same as others, ex. x86
- enable interrupts during ia64_do_signal() and ia64_sync_krbs()
- do_notify_resume_user() is still called with interrupts disabled, since
we can take short path of fsys_mode if-statement quickly.
- pfm_handle_work() is also called with interrupts disabled, since
it can deal interrupt mask within itself.
- fix/add some comments/notes
Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The sequence executed in check_sal_cache_flush:
- pend a timer interrupt
- call SAL_CACHE_FLUSH
- see if interrupt is still pending
can hang HP machines with buggy SAL_CACHE_FLUSH implementations.
Provide a kernel command-line argument to allow users skip this
check if desired. Using this parameter will force ia64_sal_cache_flush
to call ia64_pal_cache_flush() instead of SAL_CACHE_FLUSH.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Some IA64 machines map all cell-local memory above 4 GB (32 bit limit).
However, in most cases, the kernel needs some memory below that limit that is
DMA-capable. So in this machine configuration, the crashkernel will be reserved
above 4 GB.
For machines that use SWIOTLB implementation because they lack an I/O MMU
the low memory is required by the SWIOTLB implementation. In that case,
it doesn't make sense to reserve the crashkernel at all because it's unusable
for kdump.
A special case is the "hpzx1" machine vector. In theory, it has a I/O MMU, so
it can be booted above 4 GB. However, in the kdump case that is not possible
because of changeset 51b58e3e26:
On HP zx1 machines, the 'machvec=dig' parameter is needed for the kdump
kernel to avoid problems with the HP sba iommu. The problem is that during
the boot of the kdump kernel, the iommu is re-initialized, so in-flight DMA
from improperly shutdown drivers causes an IOTLB miss which leads to an
MCA. With kdump, the idea is to get into the kdump kernel with as little
code as we can, so shutting down drivers properly is not an option.
The workaround is to add 'machvec=dig' to the kdump kernel boot parameters.
This makes the kdump kernel avoid using the sba iommu altogether, leaving
the IOTLB intact. Any ongoing DMA falls harmlessly outside the kdump
kernel. After the kdump kernel reboots, all devices will have been
shutdown properly and DMA stopped.
This patch pushes that functionality into the sba iommu initialization
code, so that users won't have to find the obscure documentation telling
them about 'machvec=dig'.
This means that also for hpzx1 it's not possible to boot when all
memory is above the 4 GB limit. So the only machine vectors that can handle
this case are "sn2" and "uv".
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch adds the basic IA64 machvec infrastructure to support
the SGI "UV" platform.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Races galore... General rule: as soon as it's in descriptor table,
it's over; another thread might have started IO on it/dup2() it
elsewhere/dup2() something *over* it/etc. fd_install() is the very
last step one should take - it's a point of no return.
Besides, the damn thing leaked on failure exits...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Replace TIF_RESTORE_SIGMASK with TS_RESTORE_SIGMASK and define
our own set_restore_sigmask() function. This saves the costly
SMP-safe set_bit operation, which we do not need for the sigmask
flag since TIF_SIGPENDING always has to be set too.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Fix indenting of switch statement to follow CodingStyle, and
pull out handling of call_data into an inlined function.
I confirmed that applying this fix doesn't affect assembled code.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Rename div64_64 to div64_u64 to make it consistent with the other divide
functions, so it clearly includes the type of the divide. Move its definition
to math64.h as currently no architecture overrides the generic implementation.
They can still override it of course, but the duplicated declarations are
avoided.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch silences:
WARNING: vmlinux.o(.text+0x44672): Section mismatch in
reference from the function arch_register_cpu() to the
function .cpuinit.text:register_cpu()
Changes are based on codes in arch/x86/kernel/topology.c
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch removes following warning:
WARNING: vmlinux.o(.exit.text+0xb1): Section mismatch in
reference from the function palinfo_exit() to the variable
.cpuinit.data:palinfo_cpu_notifier
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch shuts up the following:
WARNING: vmlinux.o(.text+0x7102): Section mismatch in
reference from the function fixup_irqs() to the function
.devinit.text:ia64_disable_timer()
Removing ia64_disable_timer() is safe because there are no functions
calling it other than the fixup_irqs(),
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch kills:
WARNING: vmlinux.o(.text+0x1702): Section mismatch in
reference from the function acpi_register_ioapic() to the
function .devinit.text:iosapic_init()
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
- Operations are now a shared const function block as with most other Linux
objects
- Introduce wrappers for some optional functions to get consistent behaviour
- Wrap put_char which used to be patched by the tty layer
- Document which functions are needed/optional
- Make put_char report success/fail
- Cache the driver->ops pointer in the tty as tty->ops
- Remove various surplus lock calls we no longer need
- Remove proc_write method as noted by Alexey Dobriyan
- Introduce some missing sanity checks where certain driver/ldisc
combinations would oops as they didn't check needed methods were present
[akpm@linux-foundation.org: fix fs/compat_ioctl.c build]
[akpm@linux-foundation.org: fix isicom]
[akpm@linux-foundation.org: fix arch/ia64/hp/sim/simserial.c build]
[akpm@linux-foundation.org: fix kgdb]
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
TIF_RESTORE_SIGMASK no longer needs to be in the _TIF_WORK_* masks.
Those low bits are scarce. Renumber TIF_RESTORE_SIGMASK to free one up.
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Legacy HP ia64 platforms currently cannot provide
/proc/cpuinfo/physical_id due to legacy SAL/PAL implementations.
However, that physical topology information can be obtained
via ACPI.
Provide an interface that gives ACPI one last chance to provide
physical_id for these legacy platforms. This logic only comes
into play iff:
- ACPI actually provides slot information for the CPU
- we lack a valid socket_id
Otherwise, we don't do anything.
Since x86 uses the ACPI processor driver as well, we provide a nop
stub function for arch_fix_phys_package_id() in asm-x86/topology.h
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Commit 113134fcbc changed the flow of
control when calling PAL_LOGICAL_TO_PHYSICAL and SAL_PHYSICAL_ID_INFO.
With the change, if a platform did not implement the latter, a useless
printk would appear in the boot log:
ia64_sal_pltid failed with -1
So let's check the return code and only printk on a true error, and do
not print anything in the unimplemented case. While we're in there,
clean up some stylistic issues too.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Enable the uncached allocator to allocate multiple pages of contiguous
uncached memory.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
If "max_purges" from PAL is 0, it actually means 1.
However it was not handled later when a hot-added cpu pass the
max_purges from PAL. This makes systems easy to go BUG_ON().
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change all ia64 machvecs to use the new dma_*map*_attrs() interfaces.
Implement the old dma_*map_*() interfaces in terms of the corresponding new
interfaces. For ia64/sn, make use of one dma attribute,
DMA_ATTR_WRITE_BARRIER. Introduce swiotlb_*map*_attrs() functions.
Signed-off-by: Arthur Kepner <akepner@sgi.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Jes Sorensen <jes@sgi.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce new interfaces, dma_*map*_attrs(), for passing architecture-specific
attributes when memory is mapped and unmapped for DMA. Give the interfaces
default implementations which ignore attributes. Also introduce the
dma_{set|get}_attr() interfaces for setting and retrieving individual
attributes. Define one attribute, DMA_ATTR_WRITE_BARRIER, in anticipation of
its use by ia64/sn. Select whether architectures implement arch-specific
versions of the dma_*map*_attrs() interfaces via HAVE_DMA_ATTRS in Kconfig.
[markn@au1.ibm.com: dma_{set,get}_attr() have to be static inline]
Signed-off-by: Arthur Kepner <akepner@sgi.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Jes Sorensen <jes@sgi.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
iommu_is_span_boundary in lib/iommu-helper.c was exported for PARISC IOMMUs
(commit 3715863aa1). SWIOTLB can use it instead
of the homegrown function.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* EXTRA_CFLAGS do not apply for *.S
* don't bother with symlinks to ../lib/mem*.S, just add ../lib/mem*.o
to object list
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All architectures use an effectively identical definition of online_page(), so
just make it common code. x86-64, ia64, powerpc and sh are actually
identical; x86-32 is slightly different.
x86-32's differences arise because it puts its hotplug pages in the highmem
zone. We can handle this in the generic code by inspecting the page to see if
its in highmem, and update the totalhigh_pages count appropriately. This
leaves init_32.c:free_new_highpage with a single caller, so I folded it into
add_one_highpage_init.
I also removed an incorrect comment referring to the NUMA case; any NUMA
details have already been dealt with by the time online_page() is called.
[akpm@linux-foundation.org: fix indenting]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com>
Tested-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Christoph Lameter <clameter@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
So userspace can save/restore the mpstate during migration.
[avi: export the #define constants describing the value]
[christian: add s390 stubs]
[avi: ditto for ia64]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Timers that fire between guest hlt and vcpu_block's add_wait_queue() are
ignored, possibly resulting in hangs.
Also make sure that atomic_inc and waitqueue_active tests happen in the
specified order, otherwise the following race is open:
CPU0 CPU1
if (waitqueue_active(wq))
add_wait_queue()
if (!atomic_read(pit_timer->pending))
schedule()
atomic_inc(pit_timer->pending)
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Update the related Makefile and KConfig for kvm build
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Some sal/pal calls would be traped to kvm for virtulization
from guest firmware.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
process.c mainly handle interruption injection, and some faults handling.
Signed-off-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
asm-offsets.c will generate offset values used for assembly code
for some fileds of special structures.
Signed-off-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
optvfault.S Add optimization for some performance-critical
virtualization faults.
Signed-off-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
trampoline code targets for guest/host world switch.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
mmio.c includes mmio decoder, and related mmio logics.
Signed-off-by: Anthony Xu <Anthony.xu@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
vmm_ivt.S includes an ivt for vmm use.
Signed-off-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
vmm.c adds the interfaces with kvm/module, and initialize global data area.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>