Commit Graph

72740 Commits

Author SHA1 Message Date
Heiko Carstens
c397031f58 s390: update defconfig
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:29 +02:00
Heiko Carstens
0eccc783d8 s390/jump label,nss: let shared kernel support depend on !JUMP_LABEL
We can have either shared kernel or jump label support, but not both.
If a kernel gets IPL'ed from an NSS it's not possible to patch the
text segment, since it's read-only.
Therefore any static branches cannot be updated. So we need to make
sure that shared kernel support is disabled if jump label support
is enabled.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:29 +02:00
Heiko Carstens
f7ed68a47f s390/disassembler: fix decoding of risblg instruction
Fix typo: risblk -> risblg.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:28 +02:00
Heiko Carstens
c59eed111b s390/bpf,jit: add support for BPF_S_ANC_ALU_XOR_X instruction
Add support for new BPF_S_ANC_ALU_XOR_X instruction which got added
with ffe06c17 "filter: add XOR operation".

s390 version of 4bfaddf1 "x86 bpf_jit: support BPF_S_ANC_ALU_XOR_X instruction".

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:28 +02:00
Heiko Carstens
022912575c s390/traps: move call to print_modules() out of show_regs()
Same as 0fa0e2f0 "x86: Move call to print_modules() out of show_regs()".

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:27 +02:00
Heiko Carstens
5e249d6e11 s390/mm: mark free_initrd_mem() as __init
Same as 0d26d1d8 "x86/mm: Mark free_initrd_mem() as __init".
In addition also add the __init annotation to setup_zero_pages().

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:27 +02:00
Heiko Carstens
b1d6b40cbd s390/cmpxchg,percpu: implement cmpxchg_double()
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:25 +02:00
Heiko Carstens
ba6f5c2a8d s390/percpu: implement this_cpu_add_return()
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:25 +02:00
Heiko Carstens
28634a07d3 s390/percpu: implement this_cpu_xchg()
The generic variant has a local_irq_save/restore pair which is quite
expensive. It is sufficient to disable preemption, which is a no-op
with !CONFIG_PREEMPT and then use the regular xchg macro.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:24 +02:00
Heiko Carstens
0ed23b3e49 s390/kexec: remove CONFIG_KEXEC
Since "Kconfig: split the s390 base menu" CONFIG_KEXEC gets always selected.
Therefore there is no point in keeping CONFIG_KEXEC anywhere.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:24 +02:00
Heiko Carstens
708c39db0e s390/irq: use designated initializers for irq class array
Use designated initializers for the irq class array in irq.c so
it's always guaranteed that the order of elements is equal to
their corresponding parts in irq.h.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:23 +02:00
Heiko Carstens
41459d36cf s390: add uninitialized_var() to suppress false positive compiler warnings
Get rid of these:

arch/s390/kernel/smp.c:134:19: warning: ‘status’ may be used uninitialized in this function [-Wuninitialized]
arch/s390/mm/pgtable.c:641:10: warning: ‘table’ may be used uninitialized in this function [-Wuninitialized]
arch/s390/mm/pgtable.c:644:12: warning: ‘page’ may be used uninitialized in this function [-Wuninitialized]
drivers/s390/cio/cio.c:1037:14: warning: ‘schid’ may be used uninitialized in this function [-Wuninitialized]

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:23 +02:00
Heiko Carstens
6b563d8c26 s390/crashdump: move fill_cpu_elf_notes() prototype to header file
Move fill_cpu_elf_notes() prototype to header file.
This way we get compile errors if e.g. the number of function
parameters get changed.
Otherwise it's possible to change just the definition and everything
else still compiles fine, but the result is broken code.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:22 +02:00
Heiko Carstens
f6e38691a2 s390/process: add missing header include
Add appropriate header file:

arch/s390/kernel/process.c:327:15: warning: symbol 'arch_align_stack' was not declared.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:22 +02:00
Heiko Carstens
66389e8583 s390/ptrace: add missing ifdef
if (MACHINE_HAS_TE) translates to if (0) on !CONFIG_64BIT however the
compiler still warns about invalid shifts within non-reachable code.
So add an explicit ifdef to get rid of this warning:

arch/s390/kernel/ptrace.c: In function ‘update_per_regs’:
arch/s390/kernel/ptrace.c:63:4: warning: left shift count >= width of type [enabled by default]
arch/s390/kernel/ptrace.c:65:4: warning: left shift count >= width of type [enabled by default]

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:21 +02:00
Heiko Carstens
1a0f24878e s390/ipl,decrompressor: disable branch profiling
The early kernel decrompressing code can't call into the kernel
in order to profile branches. Fixes this link error:

arch/s390/boot/compressed/misc.o: In function `decompress_kernel':
misc.c:(.text+0x9a6): undefined reference to `ftrace_likely_update'

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:21 +02:00
Heiko Carstens
305e4f108b s390/perf_events: compile only for CONFIG_64BIT
The whole hardware support is only available in zArch mode.
Fixes also this compile warning:

arch/s390/kernel/perf_cpum_cf.c: In function ‘cpumf_pmu_init’:
arch/s390/kernel/perf_cpum_cf.c:670:2: warning: left shift count >= width of type [enabled by default]

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:20 +02:00
Wei Yongjun
0327dab0e8 s390/topology: use for_each_set_bit to simplify the code
Use for_each_set_bit() to simplify the code.

spatch with a semantic match is used to found this.
(http://coccinelle.lip6.fr/)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:18 +02:00
Heiko Carstens
c0162b07b3 s390/syscalls: wire up kcmp system call
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:17 +02:00
Heiko Carstens
a11b2ef7bb s390/appldata: change return value of appldata_asm
Change return value of appldata_asm() to -EOPNOTSUPP in case of an
error. The return value was only used internally and not passed to
user space.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:15 +02:00
Heiko Carstens
d978386e79 s390/kexec: change return value of machine_kexec_prepare
Returning -ENOSYS on kexec_load() is a bad idea since user space cannot
tell if the system call is not implmented or if it failed.
Use -EOPNOTSUPP in case somebody tries a kexec_load on a NSS image based
kernel instead.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:14 +02:00
Heiko Carstens
a8f6db4d29 s390/etr,stp: use -EOPNOTSUPP instead of -ENOSYS
Change -ENOSYS to -EOPNOTSUPP. Return value is used only internally.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:14 +02:00
Heiko Carstens
caf757c609 s390/sysinfo,stsi: change return code handling
Change return code handling of the stsi() function:

In case function code 0 was specified the return value is the
current configuration level (already shifted). That way all
the code that actually copied the stsi_0() function can go
away.

Otherwise the return value is 0 (success) or negative to
indicate an error (currently only -EOPNOTSUPP).

Also stsi() is no longer an inline function. The function is
not performance critical, but every caller would generate an
exception table entry for this function.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:12 +02:00
Heiko Carstens
eb608fb366 s390/exceptions: switch to relative exception table entries
This is the s390 port of 70627654 "x86, extable: Switch to relative
exception table entries".
Reduces the size of our exception tables by 50% on 64 bit builds.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:10 +02:00
Sebastian Ott
c3e6d407c0 s390/scm: remove superfluous lock
Remove the spinlock from struct scm_device. drvdata and attributes
are guarded via device_lock.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:09 +02:00
Heiko Carstens
50ab9a9a60 s390/smp,topology: add polarization member to pcpu struct
The cpu polarization member is the only per cpu state that is not part
of the pcpu structure. So add it there and have everything in one place.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:09 +02:00
Heiko Carstens
fade4dc491 s390/sysinfo,topology: fix cpu topology maximum nesting detection
The maximum nesting of the cpu topology is evaluated when /proc/sysinfo
is the first time read. This happens without a lock and a concurrent
reader on a different cpu can see and use an invalid intermediate value.
Besides the fact that this race is quite unlikely the worst thing that
could happen is that /proc/sysinfo would contain bogus information about
the machine's cpu topology.
Nevertheless this should be fixed. So move the detection code to the
early machine detection code and since now the value is early available
use it in the topology code as well.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:08 +02:00
Gerald Schaefer
34cda99260 s390/mm: implement __get_user_pages_fast()
This patch introduces the __get_user_pages_fast() function on s390,
which will be needed with CONFIG_TRANSPARENT_HUGEPAGE and futex.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:08 +02:00
Heiko Carstens
25502f0015 s390/sysinfo: add additional z196 fields to output
Add a couple of missing fields that were introduced with z196.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:07 +02:00
Heiko Carstens
0facaa170a s390/sysinfo: convert /proc/sysinfo to seqfile
The current proc implementation of the /proc/sysinfo file writes all
informations contained in all system information blocks to a single
page.
This is done by calling sprintf all the time in the expectation that
everything will fit into a single page. This however is not necessarily
true if the configuration of a machine is very large.
So convert /proc/sysinfo to avoid writing into random memory regions.

For readability reasons a couple of lines are longer than 80 characters.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:07 +02:00
Heiko Carstens
7860913279 s390/topology: remove sysinfo header include, add forward declaration instead
Any change to sysinfo.h causes a whole kernel recompile since sysinfo.h is
included by topology.h, which again is used nearly everywhere.
So remove that include and add a forward declaration instead.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:06 +02:00
Heiko Carstens
bad4c9c82d s390/mm: shorten addressing mode initialization
Shorten the code for addressing mode initialization. Also add missing
__init annotations, since this code is only used during kernel initialization.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:06 +02:00
Heiko Carstens
d1b0d842c4 s390/mm: rename addressing_mode to s390_user_mode
Renaming the globally visible variable "user_mode" to "addressing_mode" in
order to fix a name clash was not a good idea. (Commit 37fe1d73 "s390/mm:
rename user_mode variable to addressing_mode")
Looking at the code after a couple of weeks one thinks: addressing mode of
what?
So rename the variable again. This time to s390_user_mode. Which hopefully
makes more sense.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:05 +02:00
Heiko Carstens
3d5787c997 s390/mm: change default addressing mode
Change the default addressing mode so that user space runs in primary space
and the kernel runs in home space.
In addition remove the "switch_amode" kernel parameter so all users who
already specified they want the new default behaviour will stay in the
"switched" mode instead of in the opposite they intended.
If there is a need to switch addressing modes, this can be done with the
"user_mode" kernel parameter: user_mode=home

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:04 +02:00
Heiko Carstens
c5e3acd666 s390/smp: avoid concurrent write to sigp status field
When a sigp instruction is issued it may store a status. This status is
currently stored in a per cpu field of the target cpu.
If multiple cpus issue a sigp instruction with the same target cpu
and a status is stored the result is not necessarily as expected.

Currently this is not an issue:
- on cpu hotplug no sigps, except "restart" and "sense" are sent to the
  target cpu.
- on external call we don't look at the status if it is stored
- on sense running the condition code "status stored" is sufficient to
  tell if a cpu is running or not

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:04 +02:00
Heiko Carstens
fbf3c54239 s390/processor: use ARRAY_SIZE instead of hard coded value
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:03 +02:00
Heiko Carstens
6668022c7b s390/cache: add cpu cache information to /proc/cpuinfo
Add a line for each cpu cache to /proc/cpuinfo.
Since we only have information of private cpu caches in sysfs we
add a line for each cpu cache in /proc/cpuinfo which will also
contain information about shared caches.

For a z196 machine /proc/cpuinfo now looks like:

vendor_id       : IBM/S390
bogomips per cpu: 14367.00
features        : esan3 zarch stfle msa ldisp eimm dfp etf3eh highgprs
cache0          : level=1 type=Data scope=Private size=64K line_size=256 associativity=4
cache1          : level=1 type=Instruction scope=Private size=128K line_size=256 associativity=8
cache2          : level=2 type=Unified scope=Private size=1536K line_size=256 associativity=12
cache3          : level=3 type=Unified scope=Shared size=24576K line_size=256 associativity=12
cache4          : level=4 type=Unified scope=Shared size=196608K line_size=256 associativity=24
processor 0: version = FF,  identification = 000123,  machine = 2817
processor 1: version = FF,  identification = 100123,  machine = 2817
processor 2: version = FF,  identification = 200123,  machine = 2817
processor 3: version = FF,  identification = 200123,  machine = 2817

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:03 +02:00
Martin Schwidefsky
d35339a42d s390: add support for transactional memory
Allow user-space processes to use transactional execution (TX).
If the TX facility is available user space programs can use
transactions for fine-grained serialization based on the data
objects that are referenced during a transaction. This is
useful for lockless data structures and speculative compiler
optimizations.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:02 +02:00
Jan Glauber
e4b8b3f33f s390: add support for runtime instrumentation
Allow user-space threads to use runtime instrumentation (RI). To enable RI
for a thread there is a new s390 specific system call, sys_s390_runtime_instr,
that takes as parameter a realtime signal number. If the RI facility is
available the system call sets up a control block for the calling thread with
the appropriate permissions for the thread to modify the control block.

The user-space thread can then use the store and modify RI instructions to
alter the control block and start/stop the instrumentation via RION/RIOFF.

If the user specified program buffer runs full RI triggers an external
interrupt. The external interrupt is translated to a real-time signal that
is delivered to the thread that enabled RI on that CPU. The number of
the real-time signal is the number specified in the RI system call. So,
user-space can select any available real-time signal number in case the
application itself uses real-time signals for other purposes.

The kernel saves the RI control blocks on task switch only if the running
thread was enabled for RI. Therefore, the performance impact on task switch
should be negligible if RI is not used.

RI is only enabled for user-space mode and is disabled for the supervisor
state.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:02 +02:00
Sebastian Ott
2e73c2cf78 s390/eadm_sch: add support for irq statistics
Add support for EADM interrupt statistics in /proc/interrupts.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:00 +02:00
Sebastian Ott
eadb86ab80 s390/cio: add eadm subchannel driver
This driver allows usage of EADM subchannels. EADM subchannels
act as a communication vehicle for SCM increments.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:00 +02:00
Sebastian Ott
40ff4cc066 s390: add scm notification
Detect an scm change notification in store event information.
Update affected scm devices and notify their drivers.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:59 +02:00
Sebastian Ott
1d1c8f78be s390: add scm bus driver
Bus driver to manage Storage Class Memory.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:59 +02:00
Sebastian Ott
d2fc439b99 s390: add eadm related structures
Add structures to be used by the eadm subchannel driver.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:57 +02:00
Sebastian Ott
382b736635 s390: add eadm facility bits
Add the eadm facility bits to the css characteristics and move
them to a new header.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:57 +02:00
Heiko Carstens
48a8ca03f8 s390/kexec: move machine_crash_shutdown() to machine_kexec.c
machine_crash_shutdown() was the only function in crash.c.
So move the function and delete one file.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:54 +02:00
Heiko Carstens
1c725922dd s390/cpu hotplug: mask out CPU_TASKS_FROZEN in cu hotplug notifiers
Unify all our cpu hotplug notifiers to mask out the CPU_TASKS_FROZEN
bit, so we don't have to add all the *_FROZEN variant cases to the
notifiers.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:53 +02:00
Heiko Carstens
0d0e471b46 s390/smp: fix smp_find_processor_id() argument mismatch
For SMP and !SMP smp_find_processor_id() either takes a u16 or
an unsigned int argument. Fix this so both versions take a u16.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:53 +02:00
Heiko Carstens
7755d6b2c0 s390/cpu hotplug: use hotcpu_notifier() instead of register_cpu_notifier()
Saves a couple of lines of code.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:52 +02:00
Cornelia Huck
bdd1fc27cd s390/kvm: Improve Kconfig help text.
CONFIG_S390_GUEST determines whether virtio guest drivers are built, and
the virtio console is not enforced as default.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:52 +02:00
Jan Glauber
843c48fd0f s390/kconfig: split the s390 base menu
Split the base menu into more descriptive categories like processor,
memory, etc. and move virtualization related entries to the
Virtualization menu.
Also select KEXEC unconditionally.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:51 +02:00
Heiko Carstens
881730ad36 s390/cache: expose cpu cache topology via sysfs
Expose cpu cache topology via sysfs.
The created sysfs directory structure is compatible to what x86, ia64
and powerpc have.
On s390 we expose only information about cpu caches which are private
to a cpu via sysfs . Caches which are shared between cpus do not have
a sysfs representation.
The reason for that is that the file "shared_cpu_map" is mandatory
and only if running under LPAR it is possible to tell which cpus
share which cache. Second level hypervisors however do not and cannot
expose that information to guests.
In order to have a consistent view we made the choice to always only
expose information about private cpu caches via sysfs.

Example for a z196 cpu (cpu1 in /sys/devices/cpu):

cpu1/cache/index0/size -- 64K
cpu1/cache/index0/type -- Data
cpu1/cache/index0/level -- 1
cpu1/cache/index0/number_of_sets -- 64
cpu1/cache/index0/shared_cpu_map -- 00000000,00000002
cpu1/cache/index0/shared_cpu_list -- 1
cpu1/cache/index0/coherency_line_size -- 256
cpu1/cache/index0/ways_of_associativity -- 4
cpu1/cache/index1/size -- 128K
cpu1/cache/index1/type -- Instruction
cpu1/cache/index1/level -- 1
cpu1/cache/index1/number_of_sets -- 64
cpu1/cache/index1/shared_cpu_map -- 00000000,00000002
cpu1/cache/index1/shared_cpu_list -- 1
cpu1/cache/index1/coherency_line_size -- 256
cpu1/cache/index1/ways_of_associativity -- 8
cpu1/cache/index2/size -- 1536K
cpu1/cache/index2/type -- Unified
cpu1/cache/index2/level -- 2
cpu1/cache/index2/number_of_sets -- 512
cpu1/cache/index2/shared_cpu_map -- 00000000,00000002
cpu1/cache/index2/shared_cpu_list -- 1
cpu1/cache/index2/coherency_line_size -- 256
cpu1/cache/index2/ways_of_associativity -- 12

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:51 +02:00
Gerald Schaefer
648609e3f2 s390: enable large page support with CONFIG_DEBUG_PAGEALLOC
So far, large page support was completely disabled with
CONFIG_DEBUG_PAGEALLOC, although it would be sufficient if only the
large page kernel mapping was disabled. This patch enables large page
support with CONFIG_DEBUG_PAGEALLOC, while it prevents the large kernel
mapping in that case.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:50 +02:00
Heiko Carstens
535c611ddd s390/string: provide asm lib functions for memcpy and memcmp
Our memcpy and memcmp variants were implemented by calling the corresponding
gcc builtin variants.
However gcc is free to replace a call to __builtin_memcmp with a call to memcmp
which, when called, will result in an endless recursion within memcmp.
So let's provide asm variants and also fix the variants that are used for
uncompressing the kernel image.
In addition remove all other occurences of builtin function calls.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:50 +02:00
Heiko Carstens
68d9884dbc s390/bpf,jit: improve code generation
Make use of new immediate instructions that came with the
extended immediate and general instruction extension facilities.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:49 +02:00
Martin Schwidefsky
c10302efe5 s390/bpf,jit: BPF Just In Time compiler for s390
The s390 implementation of the JIT compiler for packet filter speedup.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:49 +02:00
Frederic Weisbecker
2b1d5024e1 rcu: Settle config for userspace extended quiescent state
Create a new config option under the RCU menu that put
CPUs under RCU extended quiescent state (as in dynticks
idle mode) when they run in userspace. This require
some contribution from architectures to hook into kernel
and userspace boundaries.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-26 15:44:04 +02:00
Frederic Weisbecker
c416ddf5b9 x86: Unspaghettize do_trap()
Cleanup the label maze in this function. Having a
seperate function to first handle the traps that don't
generate a signal makes it easier to convert into
more readable conditional paths.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1348577479-2564-1-git-send-email-fweisbec@gmail.com
[ Fixed 32-bit build failure. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-26 13:36:50 +02:00
Tao Guo
1b2b23d857 x86_64: Work around old GAS bug
GAS in binutils(2.16.91) could not parse parentheses within
macro parameters unless fully parenthesized, and this is a
workaround to make old gas work without generating below errors:

 arch/x86/kernel/entry_64.S: Assembler messages:
 arch/x86/kernel/entry_64.S:387: Error: too many positional arguments
 arch/x86/kernel/entry_64.S:389: Error: too many positional arguments
 [...]

Signed-off-by: Tao Guo <glorioustao@gmail.com>
Reluctantly-Acked-by: Jan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1348648102-12653-1-git-send-email-glorioustao@gmail.com
[ Jan argues that these old GAS versions are fragile - which is so, but lets give them a chance. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-26 13:35:32 +02:00
Yasuaki Ishimatsu
57c078ce13 x86/api: Rename mp_register_lapic in a comment
Commit 31d2092eb0 ("x86: move
mp_register_lapic_address to boot.c") renamed mp_register_lapic
to acpi_register_lapic. But mp_register_lapic remains in a
comment. So the patch rename it.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/50625239.3050403@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-26 13:29:36 +02:00
Michael Wang
dec08a837f x86: Remove the useless branch in c_start()
Since 'cpu == -1' in cpumask_next() is legal, no need to handle
'*pos == 0' specially.

About the comments:

	/* just in case, cpu 0 is not the first */

A test with a cpumask in which cpu 0 is not the first has been
done, and it works well.

This patch will remove that useless branch to clean the code.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Cc: kjwinchester@gmail.com
Cc: borislav.petkov@amd.com
Cc: ak@linux.intel.com
Link: http://lkml.kernel.org/r/1348033343-23658-1-git-send-email-wangyun@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-26 13:27:56 +02:00
Linus Torvalds
6f0f9b6b3f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull more networking fixes from David Miller:

 1) Eric Dumazet discovered and fixed what turned out to be a family of
    bugs.  These functions were using pskb_may_pull() which might need
    to reallocate the linear SKB data buffer, but the callers were not
    expecting this possibility.  The callers have cached pointers to the
    packet header areas, and would need to reload them if we were to
    continue using pskb_may_pull().

    So they could end up reading garbage.

    It's easier to just change these RAW4/RAW6/MIP6 routines to use
    skb_header_pointer() instead of pskb_may_pull(), which won't modify
    the linear SKB data area.

 2) Dave Jone's syscall spammer caught a case where a non-TCP socket can
    call down into the TCP keepalive code.  The case basically involves
    creating a raw socket with sk_protocol == IPPROTO_TCP, then calling
    setsockopt(sock_fd, SO_KEEPALIVE, ...)

    Fixed by Eric Dumazet.

 3) Bluetooth devices do not get configured properly while being powered
    on, resulting in always using legacy pairing instead of SSP.  Fix
    from Andrzej Kaczmarek.

 4) Bluetooth cancels delayed work erroneously, put stricter checks in
    place.  From Andrei Emeltchenko.

 5) Fix deadlock between cfg80211_mutex and reg_regdb_search_mutex in
    cfg80211, from Luis R.  Rodriguez.

 6) Fix interrupt double release in iwlwifi, from Emmanuel Grumbach.

 7) Missing module license in bcm87xx driver, from Peter Huewe.

 8) Team driver can lose port changed events when adding devices to a
    team, fix from Jiri Pirko.

 9) Fix endless loop when trying ot unregister PPPOE device in zombie
    state, from Xiaodong Xu.

10) batman-adv layer needs to set MAC address of software device
    earlier, otherwise we call tt_local_add with it uninitialized.

11) Fix handling of KSZ8021 PHYs, it's matched currently by KS8051 but
    that doesn't program the device properly.  From Marek Vasut.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  ipv6: mip6: fix mip6_mh_filter()
  ipv6: raw: fix icmpv6_filter()
  net: guard tcp_set_keepalive() to tcp sockets
  phy/micrel: Add missing header to micrel_phy.h
  phy/micrel: Rename KS80xx to KSZ80xx
  phy/micrel: Implement support for KSZ8021
  batman-adv: Fix symmetry check / route flapping in multi interface setups
  batman-adv: Fix change mac address of soft iface.
  pppoe: drop PPPOX_ZOMBIEs in pppoe_release
  team: send port changed when added
  ipv4: raw: fix icmp_filter()
  net/phy/bcm87xx: Add MODULE_LICENSE("GPL") to GPL driver
  iwlwifi: don't double free the interrupt in failure path
  cfg80211: fix possible circular lock on reg_regdb_search()
  Bluetooth: Fix not removing power_off delayed work
  Bluetooth: Fix freeing uninitialized delayed works
  Bluetooth: mgmt: Fix enabling LE while powered off
  Bluetooth: mgmt: Fix enabling SSP while powered off
2012-09-25 14:20:29 -07:00
Paul E. McKenney
593d1006cd Merge remote-tracking branch 'tip/core/rcu' into next.2012.09.25b
Resolved conflict in kernel/sched/core.c using Peter Zijlstra's
approach from https://lkml.org/lkml/2012/9/5/585.
2012-09-25 10:03:56 -07:00
Frederic Weisbecker
fdf9c35650 cputime: Make finegrained irqtime accounting generally available
There is no known reason for this option to be unavailable on other
archs than x86. They just need to call enable_sched_clock_irqtime()
if they have a sufficiently finegrained clock to make it working.

Move it to the general option and let the user choose between
it and pure tick based or virtual cputime accounting.

Note that virtual cputime accounting already performs a finegrained
irqtime accounting. CONFIG_IRQ_TIME_ACCOUNTING is a kind of middle ground
between tick and virtual based accounting. So CONFIG_IRQ_TIME_ACCOUNTING
and CONFIG_VIRT_CPU_ACCOUNTING are mutually exclusive choices.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2012-09-25 16:01:36 +02:00
Frederic Weisbecker
9dc16f64e8 ia64: Reuse system and user vtime accounting functions on task switch
To avoid code duplication.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2012-09-25 15:42:59 +02:00
Frederic Weisbecker
5bf412cd76 ia64: Consolidate user vtime accounting
Factorize the code that accounts user time into a
single function to avoid code duplication.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2012-09-25 15:42:57 +02:00
Frederic Weisbecker
a7e1a9e3af vtime: Consolidate system/idle context detection
Move the code that finds out to which context we account the
cputime into generic layer.

Archs that consider the whole time spent in the idle task as idle
time (ia64, powerpc) can rely on the generic vtime_account()
and implement vtime_account_system() and vtime_account_idle(),
letting the generic code to decide when to call which API.

Archs that have their own meaning of idle time, such as s390
that only considers the time spent in CPU low power mode as idle
time, can just override vtime_account().

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2012-09-25 15:42:37 +02:00
Frederic Weisbecker
bf9fae9f5e cputime: Use a proper subsystem naming for vtime related APIs
Use a naming based on vtime as a prefix for virtual based
cputime accounting APIs:

- account_system_vtime() -> vtime_account()
- account_switch_vtime() -> vtime_task_switch()

It makes it easier to allow for further declension such
as vtime_account_system(), vtime_account_idle(), ... if we
want to find out the context we account to from generic code.

This also make it better to know on which subsystem these APIs
refer to.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2012-09-25 15:31:31 +02:00
Paul E. McKenney
bda4ec9f6a Merge branches 'bigrt.2012.09.23a', 'doctorture.2012.09.23a', 'fixes.2012.09.23a', 'hotplug.2012.09.23a' and 'idlechop.2012.09.23a' into HEAD
bigrt.2012.09.23a contains additional commits to reduce scheduling latency
	from RCU on huge systems (many hundrends or thousands of CPUs).

doctorture.2012.09.23a contains documentation changes and rcutorture fixes.

fixes.2012.09.23a contains miscellaneous fixes.

hotplug.2012.09.23a contains CPU-hotplug-related changes.

idle.2012.09.23a fixes architectures for which RCU no longer considered
	the idle loop to be a quiescent state due to earlier
	adaptive-dynticks changes.  Affected architectures are alpha,
	cris, frv, h8300, m32r, m68k, mn10300, parisc, score, xtensa,
	and ia64.
2012-09-24 20:02:22 -07:00
Bjorn Helgaas
78c8f84302 Merge branch 'pci/yinghai-misc' into next 2012-09-24 17:24:11 -06:00
Linus Torvalds
56d27adcb5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull tile gxio ABI fix from Chris Metcalf:
 "This fixes a last-minute change in the Tilera hypervisor ABI for TRIO
  (PCI root complex) support.  We've locked in this ABI going forward
  and will make sure no further ABI changes like this occur."

* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: gxio iorpc numbering change for TRIO interface
2012-09-24 16:17:17 -07:00
Linus Torvalds
0c59f23613 * Disable PV NUMA support as we do not do anything with it (yet)
and it can cause bootup crashes on certain AMD machines.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJQYGDkAAoJEFjIrFwIi8fJrR0H/2iRt3nH3KeG6S2l2UaSvJuH
 BqtUSFFtMxKwAc9WC8gx9lc6y9HFig1SThzWuWJulNpF50QnBp38+OuMzEespoUN
 JLJtIp/jlYFZL5w2K7soXq7e0elbWTainPWvz5qJE7RifcnclAOGGrf/0LEVf/FQ
 xCjn9MLDq5WzbkwuA7TPMDb9RSD3ZJThMShj82kziwpTaniJCpl4YY0lVYUfknXo
 t2NTV6Ze6mkzU5QvxSA2ZJt89vsFXQNpsvUK0WzZfLmzugXqnqQHstaVti64ovZc
 e64PW61PIRhaZMCDuieclMXWbXFjkp4AsEQJLv3K/kojG20xJ/X08nmolnl64vw=
 =t4AN
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.6-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull a Xen fix from Konrad Rzeszutek Wilk:
 "It is a bug-fix when we run the initial PV guest on a AMD K8 machine
  and have CONFIG_AMD_NUMA enabled and detect the NUMA topology from the
  Northbridge.

  We end up in the situation where the initial domain gets too much
  information and gets confused and crashes - the fix is to restrict the
  domain to get the information - and we do it by just disabling NUMA on
  the PV guest (the hypervisor is still able to do its proper NUMA
  allocations of guests).

  It is OK to disable the PV guest from accessing NUMA data as right now
  we do not inject any NUMA node information to the PV guests.  When we
  do get to that point, then this patch will have to be reverted."

 * Disable PV NUMA support as we do not do anything with it (yet) and it
   can cause bootup crashes on certain AMD machines.

* tag 'stable/for-linus-3.6-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/boot: Disable NUMA for PV guests.
2012-09-24 16:14:34 -07:00
Marek Vasut
510d573fef phy/micrel: Rename KS80xx to KSZ80xx
There is no such part as KS8001, KS8041 or KS8051. There are only
KSZ8001, KSZ8041 and KSZ8051. Rename these parts as such to match
the Micrel naming.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David J. Choi <david.choi@micrel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Cc: Linux ARM kernel <linux-arm-kernel@lists.infradead.org>
Cc: Fabio Estevam <fabio.estevam@freescale.com>
Cc: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-24 15:54:33 -04:00
Chris Metcalf
e70cf54073 tile: gxio iorpc numbering change for TRIO interface
An ABI numbering change was made in the hypervisor for Tilera's 4.1
MDE release (just shipped).  It's incompatible with the previous 4.0
release ABI numbering, so we track the new numbering going forward.
We plan to avoid modifying ABI numbering for these interfaces again.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-09-24 15:11:53 -04:00
Ingo Molnar
f74eb72868 perf/core improvements and fixes:
. Convert the trace builtins to use the growing evsel/evlist
   tracepoint infrastructure, removing several open coded constructs
   like switch like series of strcmp to dispatch events, etc.
   Basically what had already been showcased in 'perf sched'.
 
 . Add evsel constructor for tracepoints, that uses libtraceevent
   just to parse the /format events file, use it in a new 'perf test'
   to make sure the libtraceevent format parsing regressions can
   be more readily caught.
 
 . Some strange errors were happening in some builds, but not on the
   next, reported by several people, problem was some parser related
   files, generated during the build, didn't had proper make deps,
   fix from Eric Sandeen.
 
 . Fix some compiling errors on 32-bit, from Feng Tang.
 
 . Don't use sscanf extension %as, not available on bionic, reimplementation
   by Irina Tirdea.
 
 . Fix bfd.h/libbfd detection with recent binutils, from Markus Trippelsdorf.
 
 . Introduce struct and cache information about the environment where a
   perf.data file was captured, from Namhyung Kim.
 
 . Fix several error paths in libtraceevent, from Namhyung Kim.
 
   Print event causing perf_event_open() to fail in 'perf record',
   from Stephane Eranian.
 
 . New 'kvm' analysis tool, from Xiao Guangrong.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJQYIGnAAoJENZQFvNTUqpApNIP/2yUe3kfZ6Sz5rVwVdxyDBQN
 Y0sv6IwtT31G6WA8o8fvIKylMiTn0XMt5TG3e57ebw01lHH/xQAkQs7EmpmGclIT
 qOE5a1JwD8eyTMhBL92dt9LUHEq103pW/oBS1d6TeQ9XyEh+HThv+wVT/yRKneu6
 IHV7b/8fC++v80bpsWLr8S3+p9OXMclsjR7IfDCeJnicxyHb/yWfRVbQiBwGk5/v
 IfKc0uat60+1qAy7GwBM/nVDJqzekrPI76krNP8tVdftxBjatpXS+VBKiYfo352g
 nr3Gpg2MWEY2oRzkN6jCj5yQAVTxa2OrAYY/I8dSJWkTHXL7UbzNzatNpVpvS0we
 0wFP5gEumyuYE1YwjsR14ICSOJdMaO0pBYO4YMnqXsXGiS0JEM+2o+2bL2Nl9EDz
 LEGWQGWVC23Tu3P1SaHgdb2YnQLZ18AVBwBljESrLvCf4leC/RWoA3HvJ4NOhook
 08ee24r57SLLe7z8W3fBPRsslpmoZnhnLHES/N/qf2y5u4Ig9IkafMS2ARVSwgPX
 6u0uVIf48YQGAlv8p3cNsYa2PITB3R04Nm5GB2KvTxplwuRD1jYqPpR4BaKrXe+s
 SBVIxgDygglY+husTYLBWMBYswWhu5ECVobdfxtrQbQPt+v9hxf5S8UhAbu3AVcV
 IocrCPg3REvuxPwwPmY9
 =FgCY
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

 * Convert the trace builtins to use the growing evsel/evlist
   tracepoint infrastructure, removing several open coded constructs
   like switch like series of strcmp to dispatch events, etc.
   Basically what had already been showcased in 'perf sched'.

 * Add evsel constructor for tracepoints, that uses libtraceevent
   just to parse the /format events file, use it in a new 'perf test'
   to make sure the libtraceevent format parsing regressions can
   be more readily caught.

 * Some strange errors were happening in some builds, but not on the
   next, reported by several people, problem was some parser related
   files, generated during the build, didn't had proper make deps,
   fix from Eric Sandeen.

 * Fix some compiling errors on 32-bit, from Feng Tang.

 * Don't use sscanf extension %as, not available on bionic, reimplementation
   by Irina Tirdea.

 * Fix bfd.h/libbfd detection with recent binutils, from Markus Trippelsdorf.

 * Introduce struct and cache information about the environment where a
   perf.data file was captured, from Namhyung Kim.

 * Fix several error paths in libtraceevent, from Namhyung Kim.

   Print event causing perf_event_open() to fail in 'perf record',
   from Stephane Eranian.

 * New 'kvm' analysis tool, from Xiao Guangrong.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-24 20:55:26 +02:00
Mark Salter
b02d617585 c6x: use asm-generic/barrier.h
A recent patch in the linux-next tree caused a build failure on
C6X because C6X didn't define a read_barrier_depends() macro. C6X
does not support SMP and the architecture doesn't provide any
special memory ordering instructions, so it makes sense to just
use the generic barrier.h rather than patching the existing c6x
specific header.

Signed-off-by: Mark Salter <msalter@redhat.com>
2012-09-24 14:39:36 -04:00
Catalin Marinas
0d0109a440 arm64: Do not set the SMP/nAMP processor bit
If such bit exists on a given CPU, it must be set by the firmware or
boot-loader prior to starting the kernel (see
Documentation/arm64/booting.txt).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2012-09-24 18:10:29 +01:00
Konrad Rzeszutek Wilk
8d54db795d xen/boot: Disable NUMA for PV guests.
The hypervisor is in charge of allocating the proper "NUMA" memory
and dealing with the CPU scheduler to keep them bound to the proper
NUMA node. The PV guests (and PVHVM) have no inkling of where they
run and do not need to know that right now. In the future we will
need to inject NUMA configuration data (if a guest spans two or more
NUMA nodes) so that the kernel can make the right choices. But those
patches are not yet present.

In the meantime, disable the NUMA capability in the PV guest, which
also fixes a bootup issue. Andre says:

"we see Dom0 crashes due to the kernel detecting the NUMA topology not
by ACPI, but directly from the northbridge (CONFIG_AMD_NUMA).

This will detect the actual NUMA config of the physical machine, but
will crash about the mismatch with Dom0's virtual memory. Variation of
the theme: Dom0 sees what it's not supposed to see.

This happens with the said config option enabled and on a machine where
this scanning is still enabled (K8 and Fam10h, not Bulldozer class)

We have this dump then:
NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10
Scanning NUMA topology in Northbridge 24
Number of physical nodes 4
Node 0 MemBase 0000000000000000 Limit 0000000040000000
Node 1 MemBase 0000000040000000 Limit 0000000138000000
Node 2 MemBase 0000000138000000 Limit 00000001f8000000
Node 3 MemBase 00000001f8000000 Limit 0000000238000000
Initmem setup node 0 0000000000000000-0000000040000000
  NODE_DATA [000000003ffd9000 - 000000003fffffff]
Initmem setup node 1 0000000040000000-0000000138000000
  NODE_DATA [0000000137fd9000 - 0000000137ffffff]
Initmem setup node 2 0000000138000000-00000001f8000000
  NODE_DATA [00000001f095e000 - 00000001f0984fff]
Initmem setup node 3 00000001f8000000-0000000238000000
Cannot find 159744 bytes in node 3
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96
Pid: 0, comm: swapper Not tainted 3.3.6 #1 AMD Dinar/Dinar
RIP: e030:[<ffffffff81d220e6>]  [<ffffffff81d220e6>] __alloc_bootmem_node+0x43/0x96
.. snip..
  [<ffffffff81d23024>] sparse_early_usemaps_alloc_node+0x64/0x178
  [<ffffffff81d23348>] sparse_init+0xe4/0x25a
  [<ffffffff81d16840>] paging_init+0x13/0x22
  [<ffffffff81d07fbb>] setup_arch+0x9c6/0xa9b
  [<ffffffff81683954>] ? printk+0x3c/0x3e
  [<ffffffff81d01a38>] start_kernel+0xe5/0x468
  [<ffffffff81d012cf>] x86_64_start_reservations+0xba/0xc1
  [<ffffffff81007153>] ? xen_setup_runstate_info+0x2c/0x36
  [<ffffffff81d050ee>] xen_start_kernel+0x565/0x56c
"

so we just disable NUMA scanning by setting numa_off=1.

CC: stable@vger.kernel.org
Reported-and-Tested-by: Andre Przywara <andre.przywara@amd.com>
Acked-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-09-24 08:47:20 -04:00
Sachin Kamat
ec10665cbf ARM: dma-mapping: Fix potential memory leak in atomic_pool_init()
When either of __alloc_from_contiguous or __alloc_remap_buffer fails
to provide a valid pointer, allocated memory is freed up and an error
is returned. 'pages' was however not freed before returning error.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-09-24 08:35:03 +02:00
Linus Torvalds
56bae80268 Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild fixes from Michal Marek:
 "There are two more kbuild fixes for 3.6.

  One fixes a race between x86's archscripts target and the rule
  (re)building scripts/basic/fixdep.  The second is a fix for the
  previous attempt at fixing make firmware_install with make 3.82.
  This new solution should work with any version of GNU make"

* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  x86/kbuild: archscripts depends on scripts_basic
  firmware: fix directory creation rule matching with make 3.80
2012-09-23 15:40:58 -07:00
Paul E. McKenney
93482f4ef1 ia64: Add missing RCU idle APIs on idle loop
Traditionally, the entire idle task served as an RCU quiescent state.
But when RCU read side critical sections started appearing within the
idle loop, this traditional strategy became untenable.  The fix was to
create new RCU APIs named rcu_idle_enter() and rcu_idle_exit(), which
must be called by each architecture's idle loop so that RCU can tell
when it is safe to ignore a given idle CPU.

Unfortunately, this fix was never applied to ia64, a shortcoming remedied
by this commit.

Reported by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:44:52 -07:00
Frederic Weisbecker
11ad47a0ed xtensa: Add missing RCU idle APIs on idle loop
In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the xtensa's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:44:51 -07:00
Frederic Weisbecker
0ee23fda59 score: Add missing RCU idle APIs on idle loop
In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in scores's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:44:50 -07:00
Frederic Weisbecker
fbe752188d parisc: Add missing RCU idle APIs on idle loop
In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the parisc's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Parisc <linux-parisc@vger.kernel.org>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:44:49 -07:00
Frederic Weisbecker
5b0753a90b mn10300: Add missing RCU idle APIs on idle loop
In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the mn10300's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:44:48 -07:00
Frederic Weisbecker
5b57ba37e8 m68k: Add missing RCU idle APIs on idle loop
In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the m68k's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: m68k <linux-m68k@lists.linux-m68k.org>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:44:47 -07:00
Frederic Weisbecker
48ae077cfc m32r: Add missing RCU idle APIs on idle loop
In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the m32r's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:44:47 -07:00
Frederic Weisbecker
b2fe1430d4 h8300: Add missing RCU idle APIs on idle loop
In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the h8300's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:44:46 -07:00
Frederic Weisbecker
41d8fe5bb3 frv: Add missing RCU idle APIs on idle loop
In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the Frv's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org> # 3.3+
Acked-by: David Howells <dhowells@redhat.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:44:45 -07:00
Frederic Weisbecker
c633f9e788 cris: Add missing RCU idle APIs on idle loop
In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the Cris's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Cris <linux-cris-kernel@axis.com>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:44:44 -07:00
Frederic Weisbecker
4c94cada48 alpha: Add missing RCU idle APIs on idle loop
In the old times, the whole idle task was considered
as an RCU quiescent state. But as RCU became more and
more successful overtime, some RCU read side critical
section have been added even in the code of some
architectures idle tasks, for tracing for example.

So nowadays, rcu_idle_enter() and rcu_idle_exit() must
be called by the architecture to tell RCU about the part
in the idle loop that doesn't make use of rcu read side
critical sections, typically the part that puts the CPU
in low power mode.

This is necessary for RCU to find the quiescent states in
idle in order to complete grace periods.

Add this missing pair of calls in the Alpha's idle loop.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Michael Cree <mcree@orcon.net.nz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: alpha <linux-alpha@vger.kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # 3.3+
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:44:43 -07:00
Frederic Weisbecker
6a6c0272f1 alpha: Fix preemption handling in idle loop
cpu_idle() is called on the boot CPU by the init code with
preemption disabled. But the cpu_idle() function in alpha
doesn't handle this when it calls schedule() directly.

Fix it by converting it into schedule_preempt_disabled().

Also disable preemption before calling cpu_idle() from
secondary CPU entry code to stay consistent with this
state.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Michael Cree <mcree@orcon.net.nz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: alpha <linux-alpha@vger.kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:44:42 -07:00
Silas Boyd-Wickizer
429227bbe5 Use get_online_cpus to avoid races involving CPU hotplug
If arch/x86/kernel/cpuid.c is a module, a CPU might offline or online
between the for_each_online_cpu() loop and the call to
register_hotcpu_notifier in cpuid_init or the call to
unregister_hotcpu_notifier in cpuid_exit.  The potential races can
lead to leaks/duplicates, attempts to destroy non-existant devices, or
random pointer dereferences.

For example, in cpuid_exit if:

        for_each_online_cpu(cpu)
                cpuid_device_destroy(cpu);
        class_destroy(cpuid_class);
        __unregister_chrdev(CPUID_MAJOR, 0, NR_CPUS, "cpu/cpuid");
        <----- CPU onlines
        unregister_hotcpu_notifier(&cpuid_class_cpu_notifier);

the hotcpu notifier will attempt to create a device for the
cpuid_class, which the module already destroyed.

This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.

Tested on a VM.

Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-09-23 07:43:56 -07:00
Silas Boyd-Wickizer
a2db672aa3 Use get_online_cpus to avoid races involving CPU hotplug
If arch/x86/kernel/msr.c is a module, a CPU might offline or online
between the for_each_online_cpu(i) loop and the call to
register_hotcpu_notifier in msr_init or the call to
unregister_hotcpu_notifier in msr_exit. The potential races can lead
to leaks/duplicates, attempts to destroy non-existant devices, or
random pointer dereferences.

For example, in msr_init if:

        for_each_online_cpu(i) {
                err = msr_device_create(i);
                if (err != 0)
                        goto out_class;
        }
        <----- CPU offlines
        register_hotcpu_notifier(&msr_class_cpu_notifier);

and the CPU never onlines before msr_exit, then the module will never
call msr_device_destroy for the associated CPU.

This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.

Tested on a VM.

Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-09-23 07:43:56 -07:00
Linus Torvalds
e5e77cf9f9 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "Random fixes across arch/mips, essentially.

  One fix for an issue in get_user_pages_fast() which previously was
  discovered on x86, a miscalculation in the support for the MIPS MT
  hardware multithreading support, the RTC support for the Malta and a
  fix for a spurious interrupt issue that seems to bite only very
  special Malta configurations."

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Malta: Don't crash on spurious interrupt.
  MIPS: Malta: Remove RTC Data Mode bootstrap breakage
  MIPS: mm: Add compound tail page _mapcount when mapped
  MIPS: CMP/SMTC: Fix tc_id calculation
2012-09-22 12:47:53 -07:00
Linus Torvalds
b3a297d15b Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM and clkdev fixes from Russell King:
 "Two patches for clkdev which resolve the long standing issue that the
  devm_* versions were dependent on clkdev, which they shouldn't have
  been.  Instead, they're dependent on HAVE_CLK instead, which implies
  that you're providing clk_get() and clk_put().

  A small fix to the ARM decompressor to ensure that the page tables are
  properly interpreted by the CPU, and reserve syscall 378 for kcmp (the
  checksyscalls.sh script is unfortunately currently broken so arch
  maintainers aren't getting notified of new syscalls...)

  Lastly, a larger fix for an issue between the common clk subsystem and
  smp_twd which causes warnings to be spat out."

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: reserve syscall 378 for kcmp
  ARM: 7535/1: Reprogram smp_twd based on new common clk framework notifiers
  ARM: 7537/1: clk: Fix release in devm_clk_put()
  ARM: 7532/1: decompressor: reset SCTLR.TRE for VMSA ARMv7 cores
  ARM: 7534/1: clk: Make the managed clk functions generically available
2012-09-22 12:40:16 -07:00
Suresh Siddha
b1a74bf821 x86, kvm: fix kvm's usage of kernel_fpu_begin/end()
Preemption is disabled between kernel_fpu_begin/end() and as such
it is not a good idea to use these routines in kvm_load/put_guest_fpu()
which can be very far apart.

kvm_load/put_guest_fpu() routines are already called with
preemption disabled and KVM already uses the preempt notifier to save
the guest fpu state using kvm_put_guest_fpu().

So introduce __kernel_fpu_begin/end() routines which don't touch
preemption and use them instead of kernel_fpu_begin/end()
for KVM's use model of saving/restoring guest FPU state.

Also with this change (and with eagerFPU model), fix the host cr0.TS vm-exit
state in the case of VMX. For eagerFPU case, host cr0.TS is always clear.
So no need to worry about it. For the traditional lazyFPU restore case,
change the cr0.TS bit for the host state during vm-exit to be always clear
and cr0.TS bit is set in the __vmx_load_host_state() when the FPU
(guest FPU or the host task's FPU) state is not active. This ensures
that the host/guest FPU state is properly saved, restored
during context-switch and with interrupts (using irq_fpu_usable()) not
stomping on the active FPU state.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1348164109.26695.338.camel@sbsiddha-desk.sc.intel.com
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-09-21 16:59:04 -07:00
Linus Torvalds
6219844e72 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc updates from David Miller:

1) Debugging builds on 32-bit sparc need to handle the R_SPARC_DISP32
   relocation, not just 64-bit sparc.  From Andreas Larsson.

2) Wei Yongjun noticed that module_alloc() on sparc can return an
   error pointer, but that's not allowed.  module_alloc() should
   return only a valid pointer, or NULL.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: fix the return value of module_alloc()
  sparc32: Enable the relocation target R_SPARC_DISP32 for sparc32
2012-09-21 14:31:50 -07:00
Linus Torvalds
9d10890792 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Small fixlets"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/init.c: Fix devmem_is_allowed() off by one
  x86/kconfig: Remove outdated reference to Intel CPUs in CONFIG_SWIOTLB
2012-09-21 14:26:23 -07:00
Linus Torvalds
18f5600ba2 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Small perf fixlets"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tracing: Don't call page_to_pfn() if page is NULL
  perf/x86: Fix Intel Ivy Bridge support
  perf/x86/ibs: Check syscall attribute flags
  perf/x86: Export Sandy Bridge uncore clockticks event in sysfs
2012-09-21 14:24:48 -07:00