X86 CPUs need to have some magic happening to enable the virtualization
extensions on them. This magic can result in unpleasant results for
users, like blocking other VMMs from working (vmx) or using invalid TLB
entries (svm).
Currently KVM activates virtualization when the respective kernel module
is loaded. This blocks us from autoloading KVM modules without breaking
other VMMs.
To circumvent this problem at least a bit, this patch introduces on
demand activation of virtualization. This means, that instead
virtualization is enabled on creation of the first virtual machine
and disabled on destruction of the last one.
So using this, KVM can be easily autoloaded, while keeping other
hypervisors usable.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
On s390 there are two ways of specifying the system call number for
the svc instruction. The standard way is to use the immediate field
in the instruction (or to use EXecute for values unknown during
assemble time). This can encode 256 system calls.
The kernel ABI also allows to put the system call number in r1 and
then execute svc 0 to enable system call numbers > 255.
It turns out that single stepping svc 0 is broken, since the PER
program check handler uses r1. We have to use a different register.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
After an IPL from NSS the uptime of the system is incorrect. The reason
is that the startup code in head.S is not executed in case of an IPL
from NSS. Due to that sched_clock_base_cc which is used to initialze
wall_to_monotonic contains the time stamp when the NSS has been created
instead of the time stamp of the system start.
Reinitialize the cputime accounting values in create_kernel_nss after
the SAVESYS CP command that created the NSS segment.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
sigp sense only returns the status of a cpu if it is non zero. If the
status of the sensed cpu is all zeros condition code 0 (accpeted) is
set and no status bits are returned.
The current code however assumes that a status was returned and tests
bits in it. This means uninitalized data is accessed with random
results.
Worst case is that the code that checks if cpu is offline on cpu
hotplug assumes that the target cpu is offline while it is still
running. This leads potentially to memory corruption since resources
that are still needed by the target cpu will be freed and could be
resused while still in use.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
According to the architecture a cpu must not necessarily enter stopped
state after completion of a sigp instruction with "stop" order code.
So remove the BUG() statement after self sending sigp stop to avoid
that it ever gets reached.
Also add a sigp busy check to make sure that the order gets delivered.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Offlined cpus still have valid prefix register contents. Dumpers
will store the register contents of a cpu to the location where its
prefix register points to.
For offlined cpus the area (lowcore) has been freed and the dumper
would write the uninteresting contents of the offline cpu to a memory
location which might be in use by some other component and destroy
valueable information.
To fix this set the prefix register of offline cpus to absolute
address zero again. This prevents the current dumpers to write to
random memory locations.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch makes the hwcap bit for the high gprs feature to be visible
in /proc/cpuinfo.
Signed-off-by: Andreas Krebbel <Andreas.Krebbel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Hypfs never worked on systems that only provide D204 subcode 6.
In these cases we nevertheless used subcode 7. With this fix, we
use subcode 6, if it is available and the system does not provide
subcode 7.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch adds an EX_TABLE entry to mvc{p|s|os} usercopy functions that
may be called with KERNEL_DS. In combination with collaborative memory
management, kernel pages marked as unused may trigger an adressing exception
in the usercopy functions. This fixes an unhandled addressing exception bug
where strncpy_from_user() is used with len > strnlen and KERNEL_DS, crossing
a page boundary to an unused page.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We used address 0x1084 instead of 0x84 to store the suspend CPU address.
With this patch we use the correct address 0x84 as it is defined in
the POP.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The time a system has been suspended should not show up in any
of the cputime accounting fields. The time of inactivity is definitly
not any form of real cputime nor is it idle time.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The function graph tracer used to have a protection against NMI
while entering a function entry tracing. But this is useless now,
the tracer is reentrant and the ring buffer supports NMI tracing.
Same as 07868b086c for x86.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The system call takes a signed length parameter. So perform sign
extension instead of zero extension.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use an own implementation instead of the common code udelay loop.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When udelay() gets called with a delay that would expire before the
next clock event it reprograms the clock comparator.
When the interrupt happens the clock comparator won't be resetted
therefore the interrupt condition doesn't get cleared.
The result is an endless timer interrupt loop until the next clock
event would expire (stored in lowcore).
So udelay() usually would wait much longer for small delays than it
should.
Fix this by disabling the local tick which makes sure that the clock
comparator will be resetted when a timer interrupt happens.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The s390 version of module_frob_arch_sections allocates additional
syminfos for got and plt offsets. These syminfos are freed on
sucessful module load. If the module fails to load (e.g. missing
dependency when using insmod instead of modprobe) this area is not
freed.
This patch lets module_free free this area. Please note, we have to
set the pointer to NULL since module_free is called several times
from the generic code.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Also increase the maximum possible kmemleak early log entries since
2000 are not sufficient on s390.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
next-20090925 randconfig build breaks on s390x, with CONFIG_AIO=n.
arch/s390/mm/pgtable.c: In function 's390_enable_sie':
arch/s390/mm/pgtable.c:282: error: 'struct mm_struct' has no member named 'ioctx_list'
arch/s390/mm/pgtable.c:298: error: 'struct mm_struct' has no member named 'ioctx_list'
make[1]: *** [arch/s390/mm/pgtable.o] Error 1
Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
commit 628eb9b8a8
KVM: s390: streamline memslot handling
introduced kvm_s390_vcpu_get_memsize. This broke guests >=4G, since this
function returned an int.
This patch changes the return value to a long.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
It's unused.
It isn't needed -- read or write flag is already passed and sysctl
shouldn't care about the rest.
It _was_ used in two places at arch/frv for some reason.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (39 commits)
cpumask: Move deprecated functions to end of header.
cpumask: remove unused deprecated functions, avoid accusations of insanity
cpumask: use new-style cpumask ops in mm/quicklist.
cpumask: use mm_cpumask() wrapper: x86
cpumask: use mm_cpumask() wrapper: um
cpumask: use mm_cpumask() wrapper: mips
cpumask: use mm_cpumask() wrapper: mn10300
cpumask: use mm_cpumask() wrapper: m32r
cpumask: use mm_cpumask() wrapper: arm
cpumask: Use accessors for cpu_*_mask: um
cpumask: Use accessors for cpu_*_mask: powerpc
cpumask: Use accessors for cpu_*_mask: mips
cpumask: Use accessors for cpu_*_mask: m32r
cpumask: remove arch_send_call_function_ipi
cpumask: arch_send_call_function_ipi_mask: s390
cpumask: arch_send_call_function_ipi_mask: powerpc
cpumask: arch_send_call_function_ipi_mask: mips
cpumask: arch_send_call_function_ipi_mask: m32r
cpumask: arch_send_call_function_ipi_mask: alpha
cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64
...
* remove asm/atomic.h inclusion from linux/utsname.h --
not needed after kref conversion
* remove linux/utsname.h inclusion from files which do not need it
NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
due to some personality stuff it _is_ needed -- cowardly leave ELF-related
headers and files alone.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We're weaning the core code off handing cpumask's around on-stack.
This introduces arch_send_call_function_ipi_mask().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (30 commits)
Use macros for .data.page_aligned section.
Use macros for .bss.page_aligned section.
Use new __init_task_data macro in arch init_task.c files.
kbuild: Don't define ALIGN and ENTRY when preprocessing linker scripts.
arm, cris, mips, sparc, powerpc, um, xtensa: fix build with bash 4.0
kbuild: add static to prototypes
kbuild: fail build if recordmcount.pl fails
kbuild: set -fconserve-stack option for gcc 4.5
kbuild: echo the record_mcount command
gconfig: disable "typeahead find" search in treeviews
kbuild: fix cc1 options check to ensure we do not use -fPIC when compiling
checkincludes.pl: add option to remove duplicates in place
markup_oops: use modinfo to avoid confusion with underscored module names
checkincludes.pl: provide usage helper
checkincludes.pl: close file as soon as we're done with it
ctags: usability fix
kernel hacking: move STRIP_ASM_SYMS from General
gitignore usr/initramfs_data.cpio.bz2 and usr/initramfs_data.cpio.lzma
kbuild: Check if linker supports the -X option
kbuild: introduce ld-option
...
Fix trivial conflict in scripts/basic/fixdep.c
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (22 commits)
[S390] Update default configuration.
[S390] hibernate: Do real CPU swap at resume time
[S390] dasd: tolerate devices that have no feature codes
[S390] zcrypt: Do not add/remove devices in s/r callbacks
[S390] hibernate: make sure pfn_is_nosave handles lowcore pages
[S390] smp: introduce LC_ORDER and simplify lowcore handling
[S390] ptrace: use common code for simple peek/poke operations
[S390] fix disabled_wait inline assembly clobber list
[S390] Change kernel_page_present coding style.
[S390] hibernation: reset system after resume
[S390] hibernation: fix guest page hinting related crash
[S390] Get rid of init_module/delete_module compat functions.
[S390] Convert sys_execve to function with parameters.
[S390] Convert sys_clone to function with parameters.
[S390] qdio: change state of all primed input buffers
[S390] qdio: reduce per device debug messages
[S390] cio: introduce consistent subchannel scanning
[S390] cio: idset use actual number of ssids
[S390] cio: dont kfree vmalloced memory
[S390] cio: introduce css_settle
...
Currently, when the physical resume CPU is not equal to the physical suspend
CPU, we swap the CPUs logically, by modifying the logical/physical CPU mapping.
This has two major drawbacks: First the change is visible from user space (e.g.
CPU sysfs files) and second it is hard to ensure that nowhere in the kernel
the physical CPU ID is stored before suspend.
To fix this, we now really swap the physical CPUs, if the resume CPU is not
the pysical suspend CPU. We restart the suspend CPU and stop the resume CPU
using SIGP restart and SIGP stop. If the suspend CPU is no longer available,
we write a message and load a disabled wait PSW.
Signed-off-by: Michael Holzheu <michael.holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
pfn_is_nosave doesn't return the correct value for the second lowcore
page if lowcore protection is enabled. Make sure it always returns
the correct value.
While at it simplify the whole thing.
NSS special handling is done by the tprot check like it already works
for DCSS as well. So remove the extra code for NSS.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Removes a couple of simple code duplications. But before I have to do
this again, just simplify it.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
arch_ptrace on s390 implements PTRACE_(PEEK|POKE)(TEXT|DATA) instead of
using using ptrace_request in kernel/ptrace.c.
The only reason is the 31bit addressing mode, where we have to unmask the
highest bit.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The disabled_wait inline assmembly also clobbers register r1, but it
is missing in the clobber list.
Fixes recursive Oops on panic.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Make the inline assembly look like all others.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Force system into defined state after resume.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
On resume the system that loads the to be resumed image might have
unstable pages.
When the resume image is copied back and a write access happen to an
unstable page this causes an exception and the system crashes.
To fix this set all free pages to stable before copying the resumed
image data. Also after everything has been restored set all free
pages of the resumed system to unstable again.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
These functions aren't needed. Might be a leftover of the pre
cond_syscall time.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use function parameters instead of accessing the pt_regs structure
to get the parameters.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use function parameters instead of accessing the pt_regs structure
to get the parameters.
Also merge the 31 and 64 bit versions since they are identical.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
trivial: fix typo in aic7xxx comment
trivial: fix comment typo in drivers/ata/pata_hpt37x.c
trivial: typo in kernel-parameters.txt
trivial: fix typo in tracing documentation
trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c
trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c
trivial: remove unnecessary semicolons
trivial: Fix duplicated word "options" in comment
trivial: kbuild: remove extraneous blank line after declaration of usage()
trivial: improve help text for mm debug config options
trivial: doc: hpfall: accept disk device to unload as argument
trivial: doc: hpfall: reduce risk that hpfall can do harm
trivial: SubmittingPatches: Fix reference to renumbered step
trivial: fix typos "man[ae]g?ment" -> "management"
trivial: media/video/cx88: add __init/__exit macros to cx88 drivers
trivial: fix typo in CONFIG_DEBUG_FS in gcov doc
trivial: fix missing printk space in amd_k7_smp_check
trivial: fix typo s/ketymap/keymap/ in comment
trivial: fix typo "to to" in multiple files
trivial: fix typos in comments s/DGBU/DBGU/
...
A number of architectures have identical asm/mman.h files so they can all
be merged by using the new generic file.
The remaining asm/mman.h files are substantially different from each
other.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a flag for mmap that will be used to request a huge page region that
will look like anonymous memory to user space. This is accomplished by
using a file on the internal vfsmount. MAP_HUGETLB is a modifier of
MAP_ANONYMOUS and so must be specified with it. The region will behave
the same as a MAP_ANONYMOUS region using small pages.
The patch also adds the MAP_STACK flag, which was previously defined only
on some architectures but not on others. Since MAP_STACK is meant to be a
hint only, architectures can define it without assigning a specific
meaning to it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: David Rientjes <rientjes@google.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 9617729941 ("Drop free_pages()")
modified nr_free_pages() to return 'unsigned long' instead of 'unsigned
int'. This made the casts to 'unsigned long' in most callers superfluous,
so remove them.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Chris Zankel <zankel@tensilica.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>