Commit Graph

13605 Commits

Author SHA1 Message Date
Frederic Weisbecker
9e46294dad x86: Save stack pointer in perf live regs savings
In order to prepare for fetching the stack pointer from the
regs when possible in dump_trace() instead of taking the
local one, save the current stack pointer in perf live regs saving.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-07-02 18:04:03 +02:00
Sergei Shtylyov
50c31e4a24 x86, mtrr: Use pci_dev->revision
This code uses PCI_CLASS_REVISION instead of PCI_REVISION_ID, so
it wasn't converted by commit 44c10138fd ("PCI: Change all
drivers to use pci_device->revision") before being moved to
arch/x86/...

Do it now at last -- and save one level of indentation...

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/201107012242.08347.sshtylyov@ru.mvista.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-02 11:10:07 +02:00
Linus Torvalds
c9e0b84545 Merge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pci: Use the INT_SRC_OVR IRQ (instead of GSI) to preset the ACPI SCI IRQ.
  xen/mmu: Fix for linker errors when CONFIG_SMP is not defined.
2011-07-01 13:25:56 -07:00
Tejun Heo
a26474e864 x86-32, NUMA: Fix boot regression caused by NUMA init unification on highmem machines
During 32/64 NUMA init unification, commit 797390d855 ("x86-32,
NUMA: use sparse_memory_present_with_active_regions()") made
32bit mm init call memory_present() automatically from
active_regions instead of leaving it to each NUMA init path.

This commit description is inaccurate - memory_present() calls
aren't the same for flat and numaq.  After the commit,
memory_present() is only called for the intersection of e820 and
NUMA layout.  Before, on flatmem, memory_present() would be
called from 0 to max_pfn.  After, it would be called only on the
areas that e820 indicates to be populated.

This is how x86_64 works and should be okay as memmap is allowed
to contain holes; however, x86_32 DISCONTIGMEM is missing
early_pfn_valid(), which makes memmap_init_zone() assume that
memmap doesn't contain any hole.  This leads to the following
oops if e820 map contains holes as it often does on machine with
near or more 4GiB of memory by calling pfn_to_page() on a pfn
which isn't mapped to a NUMA node, a reported by Conny Seidel:

  BUG: unable to handle kernel paging request at 000012b0
  IP: [<c1aa13ce>] memmap_init_zone+0x6c/0xf2
  *pdpt =3D 0000000000000000 *pde =3D f000eef3f000ee00
  Oops: 0000 [#1] SMP
  last sysfs file:
  Modules linked in:

  Pid: 0, comm: swapper Not tainted 2.6.39-rc5-00164-g797390d #1 To Be Filled By O.E.M. To Be Filled By O.E.M./E350M1
  EIP: 0060:[<c1aa13ce>] EFLAGS: 00010012 CPU: 0
  EIP is at memmap_init_zone+0x6c/0xf2
  EAX: 00000000 EBX: 000a8000 ECX: 000a7fff EDX: f2c00b80
  ESI: 000a8000 EDI: f2c00800 EBP: c19ffe54 ESP: c19ffe34
   DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
  Process swapper (pid: 0, ti=3Dc19fe000 task=3Dc1a07f60 task.ti=3Dc19fe000)
  Stack:
   00000002 00000000 0023f000 00000000 10000000 00000a00 f2c00000 f2c00b58
   c19ffeb0 c1a80f24 000375fe 00000000 f2c00800 00000800 00000100 00000030
   c1abb768 0000003c 00000000 00000000 00000004 00207a02 f2c00800 000375fe
  Call Trace:
   [<c1a80f24>] free_area_init_node+0x358/0x385
   [<c1a81384>] free_area_init_nodes+0x420/0x487
   [<c1a79326>] paging_init+0x114/0x11b
   [<c1a6cb13>] setup_arch+0xb37/0xc0a
   [<c1a69554>] start_kernel+0x76/0x316
   [<c1a690a8>] i386_start_kernel+0xa8/0xb0

This patch fixes the bug by defining early_pfn_valid() to be the
same as pfn_valid() when DISCONTIGMEM.

Reported-bisected-and-tested-by: Conny Seidel <conny.seidel@amd.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: hans.rosenfeld@amd.com
Cc: Christoph Lameter <cl@linux.com>
Cc: Conny Seidel <conny.seidel@amd.com>
Link: http://lkml.kernel.org/r/20110628094107.GB3386@htj.dyndns.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 13:38:51 +02:00
Avi Kivity
0af3ac1fdb x86, perf: Add constraints for architectural PMU
The v1 PMU does not have any fixed counters.  Using the v2 constraints,
which do have fixed counters, causes an additional choice to be present
in the weight calculation, but not when actually scheduling the event,
leading to an event being not scheduled at all.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1309362157-6596-3-git-send-email-avi@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:39 +02:00
Avi Kivity
4dc0da8696 perf: Add context field to perf_event
The perf_event overflow handler does not receive any caller-derived
argument, so many callers need to resort to looking up the perf_event
in their local data structure.  This is ugly and doesn't scale if a
single callback services many perf_events.

Fix by adding a context parameter to perf_event_create_kernel_counter()
(and derived hardware breakpoints APIs) and storing it in the perf_event.
The field can be accessed from the callback as event->overflow_handler_context.
All callers are updated.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1309362157-6596-2-git-send-email-avi@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:38 +02:00
Peter Zijlstra
89d6c0b5bd perf, arch: Add generic NODE cache events
Add a NODE level to the generic cache events which is used to measure
local vs remote memory accesses. Like all other cache events, an
ACCESS is HIT+MISS, if there is no way to distinguish between reads
and writes do reads only etc..

The below needs filling out for !x86 (which I filled out with
unsupported events).

I'm fairly sure ARM can leave it like that since it doesn't strike me as
an architecture that even has NUMA support. SH might have something since
it does appear to have some NUMA bits.

Sparc64, PowerPC and MIPS certainly want a good look there since they
clearly are NUMA capable.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Miller <davem@davemloft.net>
Cc: Anton Blanchard <anton@samba.org>
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1303508226.4865.8.camel@laptop
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:38 +02:00
Peter Zijlstra
b79e8941fb perf, intel: Try alternative OFFCORE encodings
Since the OFFCORE registers are fully symmetric, try the other one
when the specified one is already in use.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1306141897.18455.8.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:37 +02:00
Stephane Eranian
ee89cbc2d4 perf_events: Add Intel Sandy Bridge offcore_response low-level support
This patch adds Intel Sandy Bridge offcore_response support by
providing the low-level constraint table for those events.

On Sandy Bridge, there are two offcore_response events. Each uses
its own dedictated extra register. But those registers are NOT shared
between sibling CPUs when HT is on unlike Nehalem/Westmere. They are
always private to each CPU. But they still need to be controlled within
an event group. All events within an event group must use the same
value for the extra MSR. That's not controlled by the second patch in
this series.

Furthermore on Sandy Bridge, the offcore_response events have NO
counter constraints contrary to what the official documentation
indicates, so drop the events from the contraint table.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110606145712.GA7304@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:37 +02:00
Stephane Eranian
cd8a38d33e perf_events: Fix validation of events using an extra reg
The validate_group() function needs to validate events with
extra shared regs. Within an event group, only events with
the same value for the extra reg can co-exist. This was not
checked by validate_group() because it was missing the
shared_regs logic.

This patch changes the allocation of the fake cpuc used for
validation to also point to a fake shared_regs structure such
that group events be properly testing.

It modifies __intel_shared_reg_get_constraints() to use
spin_lock_irqsave() to avoid lockdep issues.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110606145708.GA7279@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:36 +02:00
Stephane Eranian
efc9f05df2 perf_events: Update Intel extra regs shared constraints management
This patch improves the code managing the extra shared registers
used for offcore_response events on Intel Nehalem/Westmere. The
idea is to use static allocation instead of dynamic allocation.
This simplifies greatly the get and put constraint routines for
those events.

The patch also renames per_core to shared_regs because the same
data structure gets used whether or not HT is on. When HT is
off, those events still need to coordination because they use
a extra MSR that has to be shared within an event group.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110606145703.GA7258@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:36 +02:00
Peter Zijlstra
a7ac67ea02 perf: Remove the perf_output_begin(.sample) argument
Since only samples call perf_output_sample() its much saner (and more
correct) to put the sample logic in there than in the
perf_output_begin()/perf_output_end() pair.

Saves a useless argument, reduces conditionals and shrinks
struct perf_output_handle, win!

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-2crpvsx3cqu67q3zqjbnlpsc@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:35 +02:00
Peter Zijlstra
a8b0ca17b8 perf: Remove the nmi parameter from the swevent and overflow interface
The nmi parameter indicated if we could do wakeups from the current
context, if not, we would set some state and self-IPI and let the
resulting interrupt do the wakeup.

For the various event classes:

  - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
    the PMI-tail (ARM etc.)
  - tracepoint: nmi=0; since tracepoint could be from NMI context.
  - software: nmi=[0,1]; some, like the schedule thing cannot
    perform wakeups, and hence need 0.

As one can see, there is very little nmi=1 usage, and the down-side of
not using it is that on some platforms some software events can have a
jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

The up-side however is that we can remove the nmi parameter and save a
bunch of conditionals in fast paths.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:35 +02:00
Cyrill Gorcunov
1880c4ae18 perf, x86: Add hw_watchdog_set_attr() in a sake of nmi-watchdog on P4
Due to restriction and specifics of Netburst PMU we need a separated
event for NMI watchdog. In particular every Netburst event
consumes not just a counter and a config register, but also an
additional ESCR register.

Since ESCR registers are grouped upon counters (i.e. if ESCR is occupied
for some event there is no room for another event to enter until its
released) we need to pick up the "least" used ESCR (or the most available
one) for nmi-watchdog purposes -- so MSR_P4_CRU_ESCR2/3 was chosen.

With this patch nmi-watchdog and perf top should be able to run simultaneously.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Lin Ming <ming.m.lin@intel.com>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
Tested-and-reviewed-by: Don Zickus <dzickus@redhat.com>
Tested-and-reviewed-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110623124918.GC13050@sun
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:34 +02:00
Thomas Gleixner
01898e3e29 i8253: Cleanup outb/inb magic
Remove the hysterical outb/inb_pit defines and use outb_p/inb_p in the
code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20110609130622.348437125@linutronix.de
2011-07-01 10:37:15 +02:00
Thomas Gleixner
0a779c5713 x86: Use common i8253 clockevent
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20110609130622.026152527@linutronix.de
2011-07-01 10:37:14 +02:00
Linus Torvalds
3b775e2246 Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  watchdog: update author email for at32ap700x_wdt
  watchdog: gef_wdt: fix MODULE_ALIAS
  watchdog: Intel SCU Watchdog: Fix build and remove duplicate code
  watchdog: mtx1-wdt: fix section mismatch
  watchdog: mtx1-wdt: fix GPIO toggling
  watchdog: mtx1-wdt: request gpio before using it
  watchdog: Handle multiple wm831x watchdogs being registered
2011-06-30 10:43:57 -07:00
Konrad Rzeszutek Wilk
155a16f219 xen/pci: Use the INT_SRC_OVR IRQ (instead of GSI) to preset the ACPI SCI IRQ.
In the past we would use the GSI value to preset the ACPI SCI
IRQ which worked great as GSI == IRQ:

ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)

While that is most often seen, there are some oddities:

ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 20 low level)

which means that GSI 20 (or pin 20) is to be overriden for IRQ 9.
Our code that presets the interrupt for ACPI SCI however would
use the GSI 20 instead of IRQ 9 ending up with:

xen: sci override: global_irq=20 trigger=0 polarity=1
xen: registering gsi 20 triggering 0 polarity 1
xen: --> pirq=20 -> irq=20
xen: acpi sci 20
.. snip..
calling  acpi_init+0x0/0xbc @ 1
ACPI: SCI (IRQ9) allocation failed
ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20110413/evevent-119)
ACPI: Unable to start the ACPI Interpreter

as the ACPI interpreter made a call to 'acpi_gsi_to_irq' which got nine.
It used that value to request an IRQ (request_irq) and since that was not
present it failed.

The fix is to recognize that for interrupts that are overriden (in our
case we only care about the ACPI SCI) we should use the IRQ number
to present the IRQ instead of the using GSI. End result is that we get:

xen: sci override: global_irq=20 trigger=0 polarity=1
xen: registering gsi 20 triggering 0 polarity 1
xen: --> pirq=20 -> irq=9 (gsi=9)
xen: acpi sci 9

which fixes the ACPI interpreter failing on startup.

CC: stable@kernel.org
Reported-by: Liwei <xieliwei@gmail.com>
Tested-by: Liwei <xieliwei@gmail.com>
[http://lists.xensource.com/archives/html/xen-devel/2011-06/msg01727.html]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-30 11:23:39 -04:00
Konrad Rzeszutek Wilk
32dd11942a xen/mmu: Fix for linker errors when CONFIG_SMP is not defined.
Simple enough - we use an extern defined symbol which is not
defined when CONFIG_SMP is not defined. This fixes the linker
dying.

CC: stable@kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-30 09:21:10 -04:00
Avi Kivity
cb16c34876 KVM: x86 emulator: fix %rip-relative addressing with immediate source operand
%rip-relative addressing is relative to the first byte of the next instruction,
so we need to add %rip only after we've fetched any immediate bytes.

Based on original patch by Li Xin <xin.li@intel.com>.

Signed-off-by: Avi Kivity <avi@redhat.com>
Acked-by: Li Xin <xin.li@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-29 10:09:25 +03:00
Jesper Juhl
e376fd664b watchdog: Intel SCU Watchdog: Fix build and remove duplicate code
Trying to build the Intel SCU Watchdog fails for me with gcc 4.6.0 -
$ gcc --version | head -n 1
gcc (GCC) 4.6.0 20110513 (prerelease)

like this :
  CC      drivers/watchdog/intel_scu_watchdog.o
In file included from drivers/watchdog/intel_scu_watchdog.c:49:0:
/home/jj/src/linux-2.6/arch/x86/include/asm/apb_timer.h: In function ‘apbt_time_init’:
/home/jj/src/linux-2.6/arch/x86/include/asm/apb_timer.h:65:42: warning: ‘return’ with a value, in function returning void [enabled by default]
drivers/watchdog/intel_scu_watchdog.c: In function ‘intel_scu_watchdog_init’:
drivers/watchdog/intel_scu_watchdog.c:468:2: error: implicit declaration of function ‘sfi_get_mtmr’ [-Werror=implicit-function-declaration]
drivers/watchdog/intel_scu_watchdog.c:468:32: warning: assignment makes pointer from integer without a cast [enabled by default]
cc1: some warnings being treated as errors

make[1]: *** [drivers/watchdog/intel_scu_watchdog.o] Error 1
make: *** [drivers/watchdog/intel_scu_watchdog.o] Error 2

Additionally, linux/types.h is needlessly being included twice in 
drivers/watchdog/intel_scu_watchdog.c

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2011-06-28 07:42:50 +00:00
Suresh Siddha
192d885742 x86, mtrr: use stop_machine APIs for doing MTRR rendezvous
MTRR rendezvous sequence is not implemened using stop_machine() before, as this
gets called both from the process context aswell as the cpu online paths
(where the cpu has not come online and the interrupts are disabled etc).

Now that we have a new stop_machine_from_inactive_cpu() API, use it for
rendezvous during mtrr init of a logical processor that is coming online.

For the rest (runtime MTRR modification, system boot, resume paths), use
stop_machine() to implement the rendezvous sequence. This will consolidate and
cleanup the code.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/20110623182057.076997177@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-27 15:17:13 -07:00
KAMEZAWA Hiroyuki
c6830c2260 Fix node_start/end_pfn() definition for mm/page_cgroup.c
commit 21a3c96 uses node_start/end_pfn(nid) for detection start/end
of nodes. But, it's not defined in linux/mmzone.h but defined in
/arch/???/include/mmzone.h which is included only under
CONFIG_NEED_MULTIPLE_NODES=y.

Then, we see
  mm/page_cgroup.c: In function 'page_cgroup_init':
  mm/page_cgroup.c:308: error: implicit declaration of function 'node_start_pfn'
  mm/page_cgroup.c:309: error: implicit declaration of function 'node_end_pfn'

So, fixiing page_cgroup.c is an idea...

But node_start_pfn()/node_end_pfn() is a very generic macro and
should be implemented in the same manner for all archs.
(m32r has different implementation...)

This patch removes definitions of node_start/end_pfn() in each archs
and defines a unified one in linux/mmzone.h. It's not under
CONFIG_NEED_MULTIPLE_NODES, now.

A result of macro expansion is here (mm/page_cgroup.c)

for !NUMA
 start_pfn = ((&contig_page_data)->node_start_pfn);
  end_pfn = ({ pg_data_t *__pgdat = (&contig_page_data); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});

for NUMA (x86-64)
  start_pfn = ((node_data[nid])->node_start_pfn);
  end_pfn = ({ pg_data_t *__pgdat = (node_data[nid]); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});

Changelog:
 - fixed to avoid using "nid" twice in node_end_pfn() macro.

Reported-and-acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-and-tested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 14:13:09 -07:00
Suresh Siddha
6d3321e8e2 x86, mtrr: lock stop machine during MTRR rendezvous sequence
MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially
happen in parallel with another system wide rendezvous using
stop_machine(). This can lead to deadlock (The order in which
works are queued can be different on different cpu's. Some cpu's
will be running the first rendezvous handler and others will be running
the second rendezvous handler. Each set waiting for the other set to join
for the system wide rendezvous, leading to a deadlock).

MTRR rendezvous sequence is not implemented using stop_machine() as this
gets called both from the process context aswell as the cpu online paths
(where the cpu has not come online and the interrupts are disabled etc).
stop_machine() works with only online cpus.

For now, take the stop_machine mutex in the MTRR rendezvous sequence that
gets called from an online cpu (here we are in the process context
and can potentially sleep while taking the mutex). And the MTRR rendezvous
that gets triggered during cpu online doesn't need to take this stop_machine
lock (as the stop_machine() already ensures that there is no cpu hotplug
going on in parallel by doing get_online_cpus())

    TBD: Pursue a cleaner solution of extending the stop_machine()
         infrastructure to handle the case where the calling cpu is
         still not online and use this for MTRR rendezvous sequence.

fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008

Reported-by: Vadim Kotelnikov <vadimuzzz@inbox.ru>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/20110623182056.807230326@sbsiddha-MOBL3.sc.intel.com
Cc: stable@kernel.org # 2.6.35+, backport a week or two after this gets more testing in mainline
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-27 14:00:46 -07:00
Linus Torvalds
12f1ba5a7d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86/PCI/ACPI: fix type mismatch
  PCI: fix new kernel-doc warning
  PCI: Fix warning in drivers/pci/probe.c on sparc64
2011-06-24 08:36:16 -07:00
Ingo Molnar
debf1d4948 Merge branch 'for-tip' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/urgent 2011-06-22 16:25:34 +02:00
Alexey Dobriyan
b7f080cfe2 net: remove mm.h inclusion from netdevice.h
Remove linux/mm.h inclusion from netdevice.h -- it's unused (I've checked manually).

To prevent mm.h inclusion via other channels also extract "enum dma_data_direction"
definition into separate header. This tiny piece is what gluing netdevice.h with mm.h
via "netdevice.h => dmaengine.h => dma-mapping.h => scatterlist.h => mm.h".
Removal of mm.h from scatterlist.h was tried and was found not feasible
on most archs, so the link was cutoff earlier.

Hope people are OK with tiny include file.

Note, that mm_types.h is still dragged in, but it is a separate story.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-21 19:17:20 -07:00
Konrad Rzeszutek Wilk
f7fdd84e04 Merge branch 'stable/vga.support' into stable/drivers
* stable/vga.support:
  xen: allow enable use of VGA console on dom0
2011-06-21 09:25:41 -04:00
cpw@sgi.com
ae90c232be x86, UV: Correct UV2 BAU destination timeout
Correct the UV2 broacast assist unit's destination timeout
period. And the activation status register in UV2 should be
tested for a destination timeout with a 4, not a 2.  The values
for Active versus Timeout were reversed.

This patch is critical for TLB shootdown on an Altix UV2 system
(i.e. the follow-on to the current Altix UV).

 Destination timeout period:
  The period is set in 4 bits of memory-mapped register MISC_CONTROL.
  The left bit toggles base period between 10us and 80us.
  The other 3 bits are the multiplier.
 Decimal 15, hex f, gives the maximum: 7 * 80us

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20110621122243.117324443@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-21 14:50:34 +02:00
cpw@sgi.com
bbd270e6f4 x86, UV: Correct failed topology memory leak
Fix a memory leak in init_per_cpu() when the topology check
fails.

The leak should never occur on deployed systems. It would only occur
in an unexpected topology that would make the BAU unuseable as a result.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20110621122242.981533045@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-21 14:50:33 +02:00
cpw@sgi.com
442d392492 x86, UV: Remove cpumask_t from the stack
Remove the large stack-resident cpumask_t from
reset_with_ipi()'s stack by allocating one per uvhub.

Due to the limited size of the stack the potentially huge cpumask_t may
cause stack overrun.  We haven't seen it happen yet, but we need to make it
a practice not to push such structures onto the stack.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20110621122242.832589130@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-21 14:50:33 +02:00
cpw@sgi.com
a456eaab87 x86, UV: Rename hubmask to pnmask
Rename 'bau_targ_hubmask' to 'pnmask' for clarity.

The BAU distribution bit mask is indexed by pnode number, not hub or
blade number.  This important fact is not clear while the mask is
called a 'hubmask'.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20110621122242.630995969@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-21 14:50:32 +02:00
cpw@sgi.com
485f07d349 x86, UV: Correct reset_with_ipi()
Fix reset_with_ipi() to look up a cpu for a blade based on the
distribution map being indexed by the potentially sparsely
numbered pnode.

This patch is critical to tlb shootdown on a partitioned UV
system, or one with nonconsecutive blade numbers.

The distribution map bits represent pnodes relative to the partition base
pnode. Previous to this patch it had been assuming bits based on 0-based,
consecutive blade ids.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20110621122242.497700003@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-21 14:50:32 +02:00
cpw@sgi.com
9c9153db22 x86, UV: Allow for non-consecutive sockets
Fix for the topology in which there is a socket 1 on a blade
with no socket 0.

Only call make_per_cpu_thp() for present sockets.
We have only seen this fail for internal configurations.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20110621122242.363757364@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-21 14:50:32 +02:00
cpw@sgi.com
b18fb2c04a x86, UV: Inline header file functions
Make all the functions in uv_bau.h inline so that it can
be included in the fake prom (used in simulations).

If not inlined the unused functions will generate compiler warnings.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20110621122242.230529678@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-21 14:50:31 +02:00
cpw@sgi.com
00b30cf04a x86, UV: Fix smp_processor_id() use in a preemptable region
Fix a call by tunables_write() to smp_processor_id() within a
preemptable region.

Call get_cpu()/put_cpu() around the region where the returned
cpu number is actually used, which makes it non-preemptable.

A DEBUG_PREEMPT warning is prevented.

UV does not support cpu hotplug yet, but this is a step toward
that ability as well.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/20110621122242.086384966@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-21 14:50:31 +02:00
Joerg Roedel
801019d59d Merge branches 'amd/transparent-bridge' and 'core'
Conflicts:
	arch/x86/include/asm/amd_iommu_types.h
	arch/x86/kernel/amd_iommu.c

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 11:14:10 +02:00
Joerg Roedel
403f81d8ee iommu/amd: Move missing parts to drivers/iommu
A few parts of the driver were missing in drivers/iommu.
Move them there to have the complete driver in that
directory.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 10:49:31 +02:00
Ohad Ben-Cohen
166e9278a3 x86/ia64: intel-iommu: move to drivers/iommu/
This should ease finding similarities with different platforms,
with the intention of solving problems once in a generic framework
which everyone can use.

Note: to move intel-iommu.c, the declaration of pci_find_upstream_pcie_bridge()
has to move from drivers/pci/pci.h to include/linux/pci.h. This is handled
in this patch, too.

As suggested, also drop DMAR's EXPERIMENTAL tag while we're at it.

Compile-tested on x86_64.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 10:49:30 +02:00
Ohad Ben-Cohen
29b68415e3 x86: amd_iommu: move to drivers/iommu/
This should ease finding similarities with different platforms,
with the intention of solving problems once in a generic framework
which everyone can use.

Compile-tested on x86_64.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 10:49:29 +02:00
Linus Torvalds
ef46222e7b Merge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/setup: Fix for incorrect xen_extra_mem_start.
  xen: When calling power_off, don't call the halt function.
  xen: Fix compile warning when CONFIG_SMP is not defined.
  xen: support CONFIG_MAXSMP
  xen: partially revert "xen: set max_pfn_mapped to the last pfn mapped"
2011-06-20 09:01:33 -07:00
Linus Torvalds
10e18e6230 Merge branch 'kvm-updates/3.0' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/3.0' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Fix register corruption in pvclock_scale_delta
  KVM: MMU: fix opposite condition in mapping_level_dirty_bitmap
  KVM: VMX: do not overwrite uptodate vcpu->arch.cr3 on KVM_SET_SREGS
  KVM: MMU: Fix build warnings in walk_addr_generic()
2011-06-20 08:58:07 -07:00
Zachary Amsden
de2d1a524e KVM: Fix register corruption in pvclock_scale_delta
The 128-bit multiply in pvclock.h was missing an output constraint for
EDX which caused a register corruption to appear.  Thanks to Ulrich for
diagnosing the EDX corruption and Avi for providing this fix.

Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-06-19 19:23:14 +03:00
Steve
a0a8eaba16 KVM: MMU: fix opposite condition in mapping_level_dirty_bitmap
The condition is opposite, it always maps huge page for the dirty tracked page

Reported-by: Steve <stefan.bosak@gmail.com>
Signed-off-by: Steve <stefan.bosak@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-06-19 19:23:13 +03:00
Marcelo Tosatti
5233dd51ec KVM: VMX: do not overwrite uptodate vcpu->arch.cr3 on KVM_SET_SREGS
Only decache guest CR3 value if vcpu->arch.cr3 is stale.
Fixes loadvm with live guest.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Tested-by: Markus Schade <markus.schade@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-06-19 19:23:13 +03:00
Borislav Petkov
b72336355b KVM: MMU: Fix build warnings in walk_addr_generic()
On 3.0-rc1 I get

In file included from arch/x86/kvm/mmu.c:2856:
arch/x86/kvm/paging_tmpl.h: In function ‘paging32_walk_addr_generic’:
arch/x86/kvm/paging_tmpl.h:124: warning: ‘ptep_user’ may be used uninitialized in this function
In file included from arch/x86/kvm/mmu.c:2852:
arch/x86/kvm/paging_tmpl.h: In function ‘paging64_walk_addr_generic’:
arch/x86/kvm/paging_tmpl.h:124: warning: ‘ptep_user’ may be used uninitialized in this function

caused by 6e2ca7d180. According to Takuya
Yoshikawa, ptep_user won't be used uninitialized so shut up gcc.

Cc: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Link: http://lkml.kernel.org/r/20110530094604.GC21833@liondog.tnic
Signed-off-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-06-19 19:23:13 +03:00
Maarten Lankhorst
7d68dc3f10 x86, efi: Do not reserve boot services regions within reserved areas
Commit 916f676f8d started reserving boot service code since some systems
require you to keep that code around until SetVirtualAddressMap is called.

However, in some cases those areas will overlap with reserved regions.
The proper medium-term fix is to fix the bootloader to prevent the
conflicts from occurring by moving the kernel to a better position,
but the kernel should check for this possibility, and only reserve regions
which can be reserved.

Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
Link: http://lkml.kernel.org/r/4DF7A005.1050407@gmail.com
Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-18 22:48:49 +02:00
Konrad Rzeszutek Wilk
acd049c6e9 xen/setup: Fix for incorrect xen_extra_mem_start.
The earlier attempts (24bdb0b62c)
at fixing this problem caused other problems to surface (PV guests
with no PCI passthrough would have SWIOTLB turned on - which meant
64MB of precious contingous DMA32 memory being eaten up per guest).
The problem was: "on xen we add an extra memory region at the end of
the e820, and on this particular machine this extra memory region
would start below 4g and cross over the 4g boundary:

[0xfee01000-0x192655000)

Unfortunately e820_end_of_low_ram_pfn does not expect an
e820 layout like that so it returns 4g, therefore initial_memory_mapping
will map [0 - 0x100000000), that is a memory range that includes some
reserved memory regions."

The memory range was the IOAPIC regions, and with the 1-1 mapping
turned on, it would map them as RAM, not as MMIO regions. This caused
the hypervisor to complain. Fortunately this is experienced only under
the initial domain so we guard for it.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-16 13:51:32 -04:00
Borislav Petkov
40b7f3dfcc x86, microcode, AMD: Fix section header size check
The ucode size check has to take the section header size into account
too when sanity checking the section length. Shorten and clarify define
names, while at it.

Caught-by: Ben Hutchings <ben@decadent.org.uk>
Link: http://lkml.kernel.org/r/1302752223.5282.674.camel@localhost
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 17:23:54 +02:00
Hidetoshi Seto
c7cece89f1 x86, mce: Use mce_sysdev_ prefix to group functions
There are many functions named mce_* so use a new prefix for the subset
of functions related to sysfs support.

And since f3c6ea1b06 introduces
syscore_ops, use the prefix mce_syscore for some functions related to
power management which were in sysdev_class before.

  Before:			After:
   mce_device   		 mce_sysdev
   mce_sysclass 		 mce_sysdev_class
   mce_attrs    		 mce_sysdev_attrs
   mce_dev_initialized  	 mce_sysdev_initialized
   mce_create_device    	 mce_sysdev_create
   mce_remove_device    	 mce_sysdev_remove

   mce_suspend  		 mce_syscore_suspend
   mce_shutdown 		 mce_syscore_shutdown
   mce_resume   		 mce_syscore_resume

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4DEED81B.8020506@jp.fujitsu.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 12:10:16 +02:00
Hidetoshi Seto
93b62c3cf5 x86, mce: Use mce_chrdev_ prefix to group functions
There are many functions named mce_* so use a new prefix for the subset
of functions dealing with the character device /dev/mcelog.

This change doesn't impact the mce-inject module because the exported
symbol mce_chrdev_ops already has the prefix, therefore it is left
unchanged.

  Before:			After:
   mce_wait			 mce_chrdev_wait
   mce_state_lock		 mce_chrdev_state_lock
   open_count   		 mce_chrdev_open_count
   open_exclu   		 mce_chrdev_open_exclu
   mce_open			 mce_chrdev_open
   mce_release  		 mce_chrdev_release
   mce_read_mutex		 mce_chrdev_read_mutex
   mce_read			 mce_chrdev_read
   mce_poll			 mce_chrdev_poll
   mce_ioctl    		 mce_chrdev_ioctl
   mce_log_device		 mce_chrdev_device

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4DEED7CD.3040500@jp.fujitsu.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 12:10:15 +02:00
Hidetoshi Seto
559faa6be1 x86, mce: Cleanup mce_read()
Use a temporary local variable m to simplify the code. No change in
logic.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4DEED7A8.8020307@jp.fujitsu.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 12:10:13 +02:00
Hidetoshi Seto
f6783c4234 x86, mce: Cleanup mce_create()/remove_device()
Use temporary local variable sysdev to simplify the code. No change in
logic.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4DEED777.7080205@jp.fujitsu.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 12:10:12 +02:00
Hidetoshi Seto
3a97fc3413 x86, mce: Check the result of ancient_init()
Because "ancient CPUs" like p5 and winchip don't have X86_FEATURE_MCA
(I suppose so), mcheck_cpu_init() on such CPUs will return at check of
mce_available() after __mcheck_cpu_ancient_init().

It is hard to know this implicit behavior without knowing the CPUs
well. So make it clear that we leave mcheck_cpu_init() when the CPU is
initialized in __mcheck_cpu_ancient_init().

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4DEED74B.20502@jp.fujitsu.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 12:10:12 +02:00
Hidetoshi Seto
b8325c5b11 x86, mce: Introduce mce_gather_info()
This patch introduces mce_gather_info() which is to be called at the
beginning of error handling and gathers minimum error information from
proper error registers (and saved registers).

As the result of mce_get_rip() is integrated, unnecessary zeroing
is removed. This also takes care of saving RIP which is required to
make some decision about error severity for SRAR errors, instead of
retrieving it later in the handler.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4DEED71A.1060906@jp.fujitsu.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 12:10:10 +02:00
Hidetoshi Seto
2b90e77eae x86, mce: Replace MCM_ with MCI_MISC_
Follow other MCi register defines. Plus define MCI_MISC_ADDR_LSB() and
MCI_MISC_ADDR_MODE().

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4DEED6E8.9090509@jp.fujitsu.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 12:10:10 +02:00
Hidetoshi Seto
b77e70bf35 x86, mce: Replace MCE_SELF_VECTOR by irq_work
The MCE handler uses a special vector for self IPI to invoke
post-emergency processing in an interrupt context, e.g. call an
NMI-unsafe function, wakeup loggers, schedule time-consuming work for
recovery, etc.

This mechanism is now generalized by the following commit:

 > e360adbe29
 > Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
 > Date:   Thu Oct 14 14:01:34 2010 +0800
 >
 >  irq_work: Add generic hardirq context callbacks
 >
 >  Provide a mechanism that allows running code in IRQ context. It is
 >  most useful for NMI code that needs to interact with the rest of the
 >  system -- like wakeup a task to drain buffers.
 :

So change to use provided generic mechanism.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4DEED6B2.6080005@jp.fujitsu.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 12:10:08 +02:00
Hidetoshi Seto
7639bfc753 x86, mce, severity: Clean up trivial coding style problems
More specifically:

- sort bits in the macros
- use BITCLR/BITSET
- coordinate message pattern
- use m for struct mce
- cleanup for severities_debugfs_init()

No functional change.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4DEED679.9090503@jp.fujitsu.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 12:10:07 +02:00
Hidetoshi Seto
a17957cdec x86, mce, severity: Cleanup severity table
The current format of an item in this table is:
  condition(param, ..., level, message [, condition2 ...])

So we have to check both an item's head and tail to find the conditions
which match the item.

Format them in a more straight forward manner:
  item(level, message, condition [, condition2 ...])

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4DEED61F.5010502@jp.fujitsu.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 12:09:42 +02:00
Hidetoshi Seto
901d7691d3 x86, mce, severity: Make formatting a bit more readable
The table looks very complicated and hard to read for people other than
skilled developers. So let's clean it up a bit. At first, change format
to ease reading elements in the table.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4DEED5EB.6050400@jp.fujitsu.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 11:40:21 +02:00
Tony Luck
880a317abc x86, mce, severity: Fix two severities table signatures
The "Spurious not enabled" entry is redundant: the "Not enabled" entry
earlier in the table will cover this case.

The "Action required; unknown MCACOD" entry shouldn't specify MCACOD in
the .mask field. Current code will only match for mcacod==0 rather than
all AR=1 entries.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Link: http://lkml.kernel.org/r/4DEED5BC.8030703@jp.fujitsu.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-16 11:37:57 +02:00
Tom Goetz
b2abe50688 xen: When calling power_off, don't call the halt function.
.. As it won't actually power off the machine.

Reported-by: Sven Köhler <sven.koehler@gmail.com>
Tested-by: Sven Köhler <sven.koehler@gmail.com>
Signed-off-by: Tom Goetz <tom.goetz@virtualcomputer.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-15 16:48:29 -04:00
Andrew Jones
900cba8881 xen: support CONFIG_MAXSMP
The MAXSMP config option requires CPUMASK_OFFSTACK, which in turn
requires we init the memory for the maps while we bring up the cpus.
MAXSMP also increases NR_CPUS to 4096. This increase in size exposed an
issue in the argument construction for multicalls from
xen_flush_tlb_others. The args should only need space for the actual
number of cpus.

Also in 2.6.39 it exposes a bootup problem.

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff8157a1d3>] set_cpu_sibling_map+0x123/0x30d
...
Call Trace:
[<ffffffff81039a3f>] ? xen_restore_fl_direct_reloc+0x4/0x4
[<ffffffff819dc4db>] xen_smp_prepare_cpus+0x36/0x135
..

CC: stable@kernel.org
Signed-off-by: Andrew Jones <drjones@redhat.com>
[v2: Updated to compile on 3.0]
[v3: Updated to compile when CONFIG_SMP is not defined]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-15 14:18:49 -04:00
Borislav Petkov
86b445676d x86, microcode, AMD: Correct buf references
Both the equivalence table and the microcode patch types are u32. Access
them properly through the buf-ptr.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-06-15 15:13:49 +02:00
Robert Richter
a0e3e70243 oprofile, x86: Fix nmi-unsafe callgraph support
Current oprofile's x86 callgraph support may trigger page faults
throwing the BUG_ON(in_nmi()) message below. This patch fixes this by
using the same nmi-safe copy-from-user code as in perf.

------------[ cut here ]------------
kernel BUG at .../arch/x86/kernel/traps.c:436!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:0a.0/0000:07:00.0/0000:08:04.0/net/eth0/broadcast
CPU 5
Modules linked in:

Pid: 8611, comm: opcontrol Not tainted 2.6.39-00007-gfe47ae7 #1 Advanced Micro Device Anaheim/Anaheim
RIP: 0010:[<ffffffff813e8e35>]  [<ffffffff813e8e35>] do_nmi+0x22/0x1ee
RSP: 0000:ffff88042fd47f28  EFLAGS: 00010002
RAX: ffff88042c0a7fd8 RBX: 0000000000000001 RCX: 00000000c0000101
RDX: 00000000ffff8804 RSI: ffffffffffffffff RDI: ffff88042fd47f58
RBP: ffff88042fd47f48 R08: 0000000000000004 R09: 0000000000001484
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88042fd47f58
R13: 0000000000000000 R14: ffff88042fd47d98 R15: 0000000000000020
FS:  00007fca25e56700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000074 CR3: 000000042d28b000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process opcontrol (pid: 8611, threadinfo ffff88042c0a6000, task ffff88042c532310)
Stack:
 0000000000000000 0000000000000001 ffff88042c0a7fd8 0000000000000000
 ffff88042fd47de8 ffffffff813e897a 0000000000000020 ffff88042fd47d98
 0000000000000000 ffff88042c0a7fd8 ffff88042fd47de8 0000000000000074
Call Trace:
 <NMI>
 [<ffffffff813e897a>] nmi+0x1a/0x20
 [<ffffffff813f08ab>] ? bad_to_user+0x25/0x771
 <<EOE>>
Code: ff 59 5b 41 5c 41 5d c9 c3 55 65 48 8b 04 25 88 b5 00 00 48 89 e5 41 55 41 54 49 89 fc 53 48 83 ec 08 f6 80 47 e0 ff ff 04 74 04 <0f> 0b eb fe 81 80 44 e0 ff ff 00 00 01 04 65 ff 04 25 c4 0f 01
RIP  [<ffffffff813e8e35>] do_nmi+0x22/0x1ee
 RSP <ffff88042fd47f28>
---[ end trace ed6752185092104b ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 8611, comm: opcontrol Tainted: G      D     2.6.39-00007-gfe47ae7 #1
Call Trace:
 <NMI>  [<ffffffff813e5e0a>] panic+0x8c/0x188
 [<ffffffff813e915c>] oops_end+0x81/0x8e
 [<ffffffff8100403d>] die+0x55/0x5e
 [<ffffffff813e8c45>] do_trap+0x11c/0x12b
 [<ffffffff810023c8>] do_invalid_op+0x91/0x9a
 [<ffffffff813e8e35>] ? do_nmi+0x22/0x1ee
 [<ffffffff8131e6fa>] ? oprofile_add_sample+0x83/0x95
 [<ffffffff81321670>] ? op_amd_check_ctrs+0x4f/0x2cf
 [<ffffffff813ee4d5>] invalid_op+0x15/0x20
 [<ffffffff813e8e35>] ? do_nmi+0x22/0x1ee
 [<ffffffff813e8e7a>] ? do_nmi+0x67/0x1ee
 [<ffffffff813e897a>] nmi+0x1a/0x20
 [<ffffffff813f08ab>] ? bad_to_user+0x25/0x771
 <<EOE>>

Cc: John Lumby <johnlumby@hotmail.com>
Cc: Maynard Johnson <maynardj@us.ibm.com>
Cc: <stable@kernel.org> # .37+
Signed-off-by: Robert Richter <robert.richter@amd.com>
2011-06-15 14:31:33 +02:00
Robert Richter
8fe7e94eb7 oprofile, x86: Fix race in nmi handler while starting counters
In some rare cases, nmis are generated immediately after the nmi
handler of the cpu was started. This causes the counter not to be
enabled. Before enabling the nmi handlers we need to set variable
ctr_running first and make sure its value is written to memory.

Also, the patch makes all existing barriers a memory barrier instead
of a compiler barrier only.

Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: <stable@kernel.org> # .35+
Signed-off-by: Robert Richter <robert.richter@amd.com>
2011-06-15 14:31:29 +02:00
Masami Hiramatsu
395810627b x86: Swap save_stack_trace_regs parameters
Swap the 1st and 2nd parameters of save_stack_trace_regs()
as same as the parameters of save_stack_trace_tsk().

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@gmail.com>
Link: http://lkml.kernel.org/r/20110608070921.17777.31103.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14 22:48:51 -04:00
Andy Whitcroft
60b8b1de0d x86 idle: APM requires pm_idle/default_idle unconditionally when a module
[ Also from Ben Hutchings <ben@decadent.org.uk> and Vitaliy Ivanov
  <vitalivanov@gmail.com> ]

Commit 06ae40ce07 ("x86 idle: EXPORT_SYMBOL(default_idle, pm_idle)
only when APM demands it") removed the export for pm_idle/default_idle
unless the apm module was modularised and CONFIG_APM_CPU_IDLE was set.

But the apm module uses pm_idle/default_idle unconditionally,
CONFIG_APM_CPU_IDLE only affects the bios idle threshold.  Adjust the
export accordingly.

[ Used #ifdef instead of #if defined() as it's shorter, and what both
  Ben and Vitaliy used.. Andy, you're out-voted ;)    - Linus ]

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-14 13:42:20 -07:00
Linus Torvalds
f39e840995 Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm: Compare only lower 32 bits of framebuffer map offsets
  drm/i915: Don't leak in i915_gem_shmem_pread_slow()
  drm/radeon/kms: do bounds checking for 3D_LOAD_VBPNTR and bump array limit
  drm/radeon/kms: fix mac g5 quirk
  x86/uv/x2apic: update for change in pci bridge handling.
  alpha, drm: Remove obsolete Alpha support in MGA DRM code
  alpha/drm: Cleanup Alpha support in DRM generic code
  savage: remove unnecessary if statement
  drm/radeon: fix GUI idle IH debug statements
  drm/radeon/kms: check modes against max pixel clock
  drm: fix fbs in DRM_IOCTL_MODE_GETRESOURCES ioctl
2011-06-14 11:25:32 -07:00
Ohad Ben-Cohen
ab493a0f0f drivers: iommu: move to a dedicated folder
Create a dedicated folder for iommu drivers, and move the base
iommu implementation over there.

Grouping the various iommu drivers in a single location will help
finding similar problems shared by different platforms, so they
could be solved once, in the iommu framework, instead of solved
differently (or duplicated) in each driver.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-14 14:47:41 +02:00
Joerg Roedel
71f7758090 x86/amd-iommu: Store device alias as dev_data pointer
This finally allows PCI-Device-IDs to be handled by the
IOMMU driver that have no corresponding struct device
present in the system.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-14 12:49:58 +02:00
Joerg Roedel
3b03bb745e x86/amd-iommu: Search for existind dev_data before allocting a new one
Search for existing dev_data first will allow to switch
dev_data->alias to just store dev_data instead of struct
device.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-14 12:49:58 +02:00
Joerg Roedel
2b02b091ab x86/amd-iommu: Allow dev_data->alias to be NULL
Let dev_data->alias be just NULL if the device has no alias.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-14 12:49:58 +02:00
Joerg Roedel
ec9e79ef06 x86/amd-iommu: Use only dev_data in low-level domain attach/detach functions
With this patch the low-level attach/detach functions only
work on dev_data structures. This allows to remove the
dev_data->dev pointer.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-14 12:49:58 +02:00
Joerg Roedel
6c54204793 x86/amd-iommu: Use only dev_data for dte and iotlb flushing routines
This patch make the functions flushing the DTE and IOTLBs
only take the dev_data structure instead of the struct
device directly.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-14 12:49:57 +02:00
Joerg Roedel
ea61cddb9d x86/amd-iommu: Store ATS state in dev_data
This allows the low-level functions to operate on dev_data
exclusivly later.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-14 12:49:57 +02:00
Joerg Roedel
f62dda66b5 x86/amd-iommu: Store devid in dev_data
This allows to use dev_data independent of struct device
later.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-14 12:49:57 +02:00
Joerg Roedel
8fa5f802ab x86/amd-iommu: Introduce global dev_data_list
This list keeps all allocated iommu_dev_data structs in a
list together. This is needed for instances that have no
associated device.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-14 12:49:57 +02:00
Joerg Roedel
39c555460c x86/amd-iommu: Remove redundant device_flush_dte() calls
Remove these function calls from places where the function
has already been called by another function.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-14 12:49:57 +02:00
Dave Airlie
7ad35cf288 x86/uv/x2apic: update for change in pci bridge handling.
When I added 3448a19da4
I forgot about the special uv handling code for this, so this
patch fixes it up.

Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Ingo Molnar
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-06-14 09:50:12 +10:00
Linus Torvalds
c78a9b9b8e Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ftrace: Revert 8ab2b7efd ftrace: Remove unnecessary disabling of irqs
  kprobes/trace: Fix kprobe selftest for gcc 4.6
  ftrace: Fix possible undefined return code
  oprofile, dcookies: Fix possible circular locking dependency
  oprofile: Fix locking dependency in sync_start()
  oprofile: Free potentially owned tasks in case of errors
  oprofile, x86: Add comments to IBS LVT offset initialization
2011-06-13 10:45:49 -07:00
Linus Torvalds
842c895d14 Merge branches 'x86-urgent-for-linus' and 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: devicetree: Add missing early_init_dt_setup_initrd_arch stub
  x86: cpu-hotplug: Prevent softirq wakeup on wrong CPU

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Prevent potential NULL dereference in irq_set_irq_wake()
2011-06-13 10:45:10 -07:00
Mathias Krause
dac853ae89 exec: delay address limit change until point of no return
Unconditionally changing the address limit to USER_DS and not restoring
it to its old value in the error path is wrong because it prevents us
using kernel memory on repeated calls to this function.  This, in fact,
breaks the fallback of hard coded paths to the init program from being
ever successful if the first candidate fails to load.

With this patch applied switching to USER_DS is delayed until the point
of no return is reached which makes it possible to have a multi-arch
rootfs with one arch specific init binary for each of the (hard coded)
probed paths.

Since the address limit is already set to USER_DS when start_thread()
will be invoked, this redundancy can be safely removed.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-09 12:50:05 -07:00
Florian Fainelli
977cb76d52 x86: devicetree: Add missing early_init_dt_setup_initrd_arch stub
This patch fixes the following build failure:

drivers/built-in.o: In function `early_init_dt_check_for_initrd':
/home/florian/dev/kernel/x86/linux-2.6-x86/drivers/of/fdt.c:571:
undefined reference to `early_init_dt_setup_initrd_arch'
make: *** [.tmp_vmlinux1] Error 1

which happens as soon as we enable initrd support on a x86 devicetree
platform such as Intel CE4100.

Signed-off-by: Florian Fainelli <ffainelli@freebox.fr>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Maxime Bizon <mbizon@freebox.fr>
Acked-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: stable@kernel.org # 2.6.39
Link: http://lkml.kernel.org/r/201106061015.50039.ffainelli@freebox.fr
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-09 15:39:43 +02:00
Stefano Stabellini
a91d92875e xen: partially revert "xen: set max_pfn_mapped to the last pfn mapped"
We only need to set max_pfn_mapped to the last pfn mapped on x86_64 to
make sure that cleanup_highmap doesn't remove important mappings at
_end.

We don't need to do this on x86_32 because cleanup_highmap is not called
on x86_32. Besides lowering max_pfn_mapped on x86_32 has the unwanted
side effect of limiting the amount of memory available for the 1:1
kernel pagetable allocation.

This patch reverts the x86_32 part of the original patch.

CC: stable@kernel.org
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-09 09:08:53 -04:00
Ralf Baechle
8761f1ab71 pcspkr: Cleanup Kconfig dependencies
Lenghty lists of the kind "depends on ARCH1 || ARCH2 ... || ARCH123" are
usually either wrong or too coarse grained.  Or plain an ugly sin.

[ tglx: Fixed up amigaone ]

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linux-alpha@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Gerhard Pircher <gerhard_pircher@gmx.net>
Link: http://lkml.kernel.org/r/20110601180610.984881988@duck.linux-mips.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-09 15:01:41 +02:00
Ralf Baechle
850492760c i8253: Move remaining content and delete asm/i8253.h
Move setup_pit_timer() declaration to the common header file and
remove the arch specific ones.

[ tglx: Move it to linux/i8253.h instead of asm/mips and asm/x86 ]

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Russell King <linux@arm.linux.org.uk>
Cc: linux-mips@linux-mips.org
Cc: Sergei Shtylyov <sshtylyov@mvista.com
Link: http://lkml.kernel.org/r/20110601180610.913463093@duck.linux-mips.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-09 15:01:40 +02:00
Ralf Baechle
49cf3f29a1 i8253: Consolidate definitions of PIT_LATCH
x86 defines PIT_LATCH as LATCH which in <linux/timex.h> is defined as
((CLOCK_TICK_RATE + HZ/2) / HZ) and <asm/timex.h> again defines
CLOCK_TICK_RATE as PIT_TICK_RATE.

MIPS defines PIT_LATCH as LATCH which in <linux/timex.h> is defined as
((CLOCK_TICK_RATE + HZ/2) / HZ) and <asm/timex.h> again defines
CLOCK_TICK_RATE as 1193182.

ARM defines PITCH_LATCH as ((PIT_TICK_RATE + HZ / 2) / HZ) - and that's
the sanest thing and equivalent to above definitions so use that as the
new definition in <linux/i8253.h>.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/20110601180610.832810002@duck.linux-mips.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-09 15:01:40 +02:00
Ralf Baechle
16f871bc30 x86: i8253: Consolidate definitions of global_clock_event
There are multiple declarations of global_clock_event in header files
specific to particular clock event implementations.  Consolidate them
in <asm/time.h> and make sure all users include that header.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Venkatesh Pallipadi (Venki) <venki@google.com>
Link: http://lkml.kernel.org/r/20110601180610.762763451@duck.linux-mips.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-09 15:01:40 +02:00
Ralf Baechle
15f304b664 i8253: Consolidate all kernel definitions of i8253_lock
Move them to drivers/clocksource/i8253.c and remove the
implementations in arch/

[ tglx: Avoid the extra file in lib - folded arch patches in. The
  export will become conditional in a later step ]

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Link: http://lkml.kernel.org/r/20110601180610.221426078@duck.linux-mips.net
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-09 15:01:38 +02:00
Ralf Baechle
cb2455aa27 i8253: Unify all kernel declarations of i8253_lock
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/20110601180610.134151920@duck.linux-mips.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-09 15:01:38 +02:00
Ralf Baechle
334955ef96 i8253: Create linux/i8253.h and use it in all 8253 related files
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/20110601180610.054254048@duck.linux-mips.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

 arch/arm/mach-footbridge/isa-timer.c |    2 +-
 arch/mips/cobalt/time.c              |    2 +-
 arch/mips/jazz/irq.c                 |    2 +-
 arch/mips/kernel/i8253.c             |    2 +-
 arch/mips/mti-malta/malta-time.c     |    2 +-
 arch/mips/sgi-ip22/ip22-time.c       |    2 +-
 arch/mips/sni/time.c                 |    2 +-
 arch/x86/kernel/apic/apic.c          |    2 +-
 arch/x86/kernel/apm_32.c             |    2 +-
 arch/x86/kernel/hpet.c               |    2 +-
 arch/x86/kernel/i8253.c              |    2 +-
 arch/x86/kernel/time.c               |    2 +-
 drivers/block/hd.c                   |    2 +-
 drivers/clocksource/i8253.c          |    2 +-
 drivers/input/gameport/gameport.c    |    2 +-
 drivers/input/joystick/analog.c      |    2 +-
 drivers/input/misc/pcspkr.c          |    2 +-
 include/linux/i8253.h                |   11 +++++++++++
 sound/drivers/pcsp/pcsp.h            |    2 +-
 19 files changed, 29 insertions(+), 18 deletions(-)
2011-06-09 15:01:37 +02:00
Linus Torvalds
467701e286 Merge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: off by one errors in multicalls.c
  xen: use the trigger info we already have to choose the irq handler
2011-06-08 12:03:37 -07:00
Ingo Molnar
86dd7909c2 Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/urgent 2011-06-08 15:49:03 +02:00
Thomas Gleixner
fd8a7de177 x86: cpu-hotplug: Prevent softirq wakeup on wrong CPU
After a newly plugged CPU sets the cpu_online bit it enables
interrupts and goes idle. The cpu which brought up the new cpu waits
for the cpu_online bit and when it observes it, it sets the cpu_active
bit for this cpu. The cpu_active bit is the relevant one for the
scheduler to consider the cpu as a viable target.

With forced threaded interrupt handlers which imply forced threaded
softirqs we observed the following race:

cpu 0                         cpu 1

bringup(cpu1);
                              set_cpu_online(smp_processor_id(), true);
		              local_irq_enable();
while (!cpu_online(cpu1));
                              timer_interrupt()
                                -> wake_up(softirq_thread_cpu1);
                                     -> enqueue_on(softirq_thread_cpu1, cpu0);

                                                                        ^^^^

cpu_notify(CPU_ONLINE, cpu1);
  -> sched_cpu_active(cpu1)
     -> set_cpu_active((cpu1, true);

When an interrupt happens before the cpu_active bit is set by the cpu
which brought up the newly onlined cpu, then the scheduler refuses to
enqueue the woken thread which is bound to that newly onlined cpu on
that newly onlined cpu due to the not yet set cpu_active bit and
selects a fallback runqueue. Not really an expected and desirable
behaviour.

So far this has only been observed with forced hard/softirq threading,
but in theory this could happen without forced threaded hard/softirqs
as well. It's probably unobservable as it would take a massive
interrupt storm on the newly onlined cpu which causes the softirq loop
to wake up the softirq thread and an even longer delay of the cpu
which waits for the cpu_online bit.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org # 2.6.39
2011-06-08 11:21:19 +02:00
Linus Torvalds
d681f1204d Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86/amd-iommu: Fix boot crash with hidden PCI devices
  x86/amd-iommu: Use only per-device dma_ops
  x86/amd-iommu: Fix 3 possible endless loops
2011-06-07 19:20:53 -07:00
Linus Torvalds
58a9a36b54 Merge branch 'kvm-updates/3.0' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/3.0' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Initialize kvm before registering the mmu notifier
  KVM: x86: use proper port value when checking io instruction permission
  KVM: add missing void __user * cast to access_ok() call
2011-06-07 19:06:28 -07:00
Benjamin Herrenschmidt
ef3b4f8cc2 pci/of: Consolidate pci_bus_to_OF_node()
The generic code always get the device-node in the right place now
so a single implementation will work for all archs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-08 09:08:57 +10:00
Benjamin Herrenschmidt
64099d981c pci/of: Consolidate pci_device_to_OF_node()
All archs do more or less the same thing now, move it into
a single generic place.

I chose pci.h rather than of_pci.h to avoid having to change
all call-sites to include the later.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-08 09:08:43 +10:00
Benjamin Herrenschmidt
3d5fe5a65a x86/devicetree: Use generic PCI <-> OF matching
Instead of walking the whole PCI tree to update the of_node's for
PCI busses and devices after the fact, enable the new generic core
code for doing so by providing the proper device nodes for the
PCI host bridges

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2011-06-08 09:08:40 +10:00
Joerg Roedel
26018874e3 x86/amd-iommu: Fix boot crash with hidden PCI devices
Some PCIe cards ship with a PCI-PCIe bridge which is not
visible as a PCI device in Linux. But the device-id of the
bridge is present in the IOMMU tables which causes a boot
crash in the IOMMU driver.
This patch fixes by removing these cards from the IOMMU
handling. This is a pure -stable fix, a real fix to handle
this situation appriatly will follow for the next merge
window.

Cc: stable@kernel.org	# > 2.6.32
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-07 10:06:59 +02:00
Andy Lutomirski
5cec93c216 x86-64: Emulate legacy vsyscalls
There's a fair amount of code in the vsyscall page.  It contains
a syscall instruction (in the gettimeofday fallback) and who
knows what will happen if an exploit jumps into the middle of
some other code.

Reduce the risk by replacing the vsyscalls with short magic
incantations that cause the kernel to emulate the real
vsyscalls. These incantations are useless if entered in the
middle.

This causes vsyscalls to be a little more expensive than real
syscalls.  Fortunately sensible programs don't use them.
The only exception is time() which is still called by glibc
through the vsyscall - but calling time() millions of times
per second is not sensible. glibc has this fixed in the
development tree.

This patch is not perfect: the vread_tsc and vread_hpet
functions are still at a fixed address.  Fixing that might
involve making alternative patching work in the vDSO.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jesper Juhl <jj@chaosbits.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: richard -rw- weinberger <richard.weinberger@gmail.com>
Cc: Mikael Pettersson <mikpe@it.uu.se>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: pageexec@freemail.hu
Link: http://lkml.kernel.org/r/e64e1b3c64858820d12c48fa739efbd1485e79d5.1307292171.git.luto@mit.edu
[ Removed the CONFIG option - it's simpler to just do it unconditionally. Tidied up the code as well. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-07 10:02:35 +02:00
Matthew Garrett
3b3702377c x86, efi: Add infrastructure for UEFI 2.0 runtime services
We're currently missing support for any of the runtime service calls
introduced with the UEFI 2.0 spec in 2006. Add the infrastructure for
supporting them.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Link: http://lkml.kernel.org/r/1307388985-7852-2-git-send-email-mjg@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-06 13:30:30 -07:00
Matthew Garrett
f7a2d73fe7 x86, efi: Fix argument types for SetVariable()
The spec says this takes uint32 for attributes, not uintn.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Link: http://lkml.kernel.org/r/1307388985-7852-1-git-send-email-mjg@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-06 13:30:27 -07:00
Jeremy Fitzhardinge
c2419b4a47 xen: allow enable use of VGA console on dom0
Get the information about the VGA console hardware from Xen, and put
it into the form the bootloader normally generates, so that the rest
of the kernel can deal with VGA as usual.

[ Impact: make VGA console work in dom0 ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
[v1: Rebased on 2.6.39]
[v2: Removed incorrect comments and fixed compile warnings]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-06 11:46:00 -04:00
Joerg Roedel
27c2127a15 x86/amd-iommu: Use only per-device dma_ops
Unfortunatly there are systems where the AMD IOMMU does not
cover all devices. This breaks with the current driver as it
initializes the global dma_ops variable. This patch limits
the AMD IOMMU to the devices listed in the IVRS table fixing
DMA for devices not covered by the IOMMU.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-06 17:37:27 +02:00
Joerg Roedel
0de66d5b35 x86/amd-iommu: Fix 3 possible endless loops
The driver contains several loops counting on an u16 value
where the exit-condition is checked against variables that
can have values up to 0xffff. In this case the loops will
never exit. This patch fixed 3 such loops.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-06 16:10:15 +02:00
Marcelo Tosatti
221192bdff KVM: x86: use proper port value when checking io instruction permission
Commit f6511935f4 moved the permission check for io instructions
to the ->check_perm callback. It failed to copy the port value from RDX
register for string and "in,out ax,dx" instructions.

Fix it by reading RDX register at decode stage when appropriate.

Fixes FC8.32 installation.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-06 10:52:09 +03:00
Andy Lutomirski
5dfcea629a x86-64: Fill unused parts of the vsyscall page with 0xcc
Jumping to 0x00 might do something depending on the following
bytes. Jumping to 0xcc is a trap.  So fill the unused parts of
the vsyscall page with 0xcc to make it useless for exploits to
jump there.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Cc: Jesper Juhl <jj@chaosbits.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: richard -rw- weinberger <richard.weinberger@gmail.com>
Cc: Mikael Pettersson <mikpe@it.uu.se>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: pageexec@freemail.hu
Link: http://lkml.kernel.org/r/ed54bfcfbe50a9070d20ec1edbe0d149e22a4568.1307292171.git.luto@mit.edu
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-06 09:43:14 +02:00
Andy Lutomirski
bb5fe2f78e x86-64: Remove vsyscall number 3 (venosys)
It just segfaults since April 2008 (a4928cff), so I'm pretty
sure that nothing uses it.  And having an empty section makes
the linker script a bit fragile.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Cc: Jesper Juhl <jj@chaosbits.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: richard -rw- weinberger <richard.weinberger@gmail.com>
Cc: Mikael Pettersson <mikpe@it.uu.se>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: pageexec@freemail.hu
Link: http://lkml.kernel.org/r/4a4abcf47ecadc269f2391a313576fe6d06acef7.1307292171.git.luto@mit.edu
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-06 09:43:14 +02:00
Andy Lutomirski
d319bb79af x86-64: Map the HPET NX
Currently the HPET mapping is a user-accessible syscall
instruction at a fixed address some of the time.

A sufficiently determined hacker might be able to guess when.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Cc: Jesper Juhl <jj@chaosbits.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: richard -rw- weinberger <richard.weinberger@gmail.com>
Cc: Mikael Pettersson <mikpe@it.uu.se>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: pageexec@freemail.hu
Link: http://lkml.kernel.org/r/ab41b525a4ca346b1ca1145d16fb8d181861a8aa.1307292171.git.luto@mit.edu
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-05 21:30:33 +02:00
Andy Lutomirski
0d7b8547fb x86-64: Remove kernel.vsyscall64 sysctl
It's unnecessary overhead in code that's supposed to be highly
optimized.  Removing it allows us to remove one of the two
syscall instructions in the vsyscall page.

The only sensible use for it is for UML users, and it doesn't
fully address inconsistent vsyscall results on UML.  The real
fix for UML is to stop using vsyscalls entirely.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Cc: Jesper Juhl <jj@chaosbits.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: richard -rw- weinberger <richard.weinberger@gmail.com>
Cc: Mikael Pettersson <mikpe@it.uu.se>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: pageexec@freemail.hu
Link: http://lkml.kernel.org/r/973ae803fe76f712da4b2740e66dccf452d3b1e4.1307292171.git.luto@mit.edu
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-05 21:30:33 +02:00
Andy Lutomirski
9fd67b4ed0 x86-64: Give vvars their own page
Move vvars out of the vsyscall page into their own page and mark
it NX.

Without this patch, an attacker who can force a daemon to call
some fixed address could wait until the time contains, say,
0xCD80, and then execute the current time.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Cc: Jesper Juhl <jj@chaosbits.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: richard -rw- weinberger <richard.weinberger@gmail.com>
Cc: Mikael Pettersson <mikpe@it.uu.se>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: pageexec@freemail.hu
Link: http://lkml.kernel.org/r/b1460f81dc4463d66ea3f2b5ce240f58d48effec.1307292171.git.luto@mit.edu
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-05 21:30:32 +02:00
Andy Lutomirski
8b4777a4b5 x86-64: Document some of entry_64.S
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Cc: Jesper Juhl <jj@chaosbits.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: richard -rw- weinberger <richard.weinberger@gmail.com>
Cc: Mikael Pettersson <mikpe@it.uu.se>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: pageexec@freemail.hu
Link: http://lkml.kernel.org/r/fc134867cc550977cc996866129e11a16ba0f9ea.1307292171.git.luto@mit.edu
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-05 21:30:32 +02:00
Andy Lutomirski
6879eb2dee x86-64: Fix alignment of jiffies variable
It's declared __attribute__((aligned(16)) but it's explicitly
not aligned.  This is probably harmless but it's a bit
embarrassing.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Cc: Jesper Juhl <jj@chaosbits.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: richard -rw- weinberger <richard.weinberger@gmail.com>
Cc: Mikael Pettersson <mikpe@it.uu.se>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: pageexec@freemail.hu
Link: http://lkml.kernel.org/r/5f3bc5542e9aaa9382d53f153f54373165cdef89.1307292171.git.luto@mit.edu
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-05 21:30:31 +02:00
Borislav Petkov
dd2897bf0f x86, asm: Fix binutils 2.16 issue with __USER32_CS
While testing the patchset at

http://lkml.kernel.org/r/1306873314-32523-1-git-send-email-bp@alien8.de

with binutils 2.16.1 from hell, kernel build fails with the following
error:

arch/x86/ia32/ia32entry.S: Assembler messages:
arch/x86/ia32/ia32entry.S:139: Error: too many positional arguments
make[2]: *** [arch/x86/ia32/ia32entry.o] Error 1
make[1]: *** [arch/x86/ia32] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [arch/x86] Error 2
make: *** Waiting for unfinished jobs....

due to spaces between the operators of the __USER32_CS define. Fix it so
that gas 2.16 can swallow it too.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1307131642-32595-1-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-03 14:39:14 -07:00
Borislav Petkov
38e6b75d3b x86, asm: Cleanup thunk_64.S
Drop thunk_ra macro in favor of an additional argument to the thunk
macro since their bodies are almost identical. Do a whitespace scrubbing
and use CFI-aware macros for full annotation.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1306873314-32523-5-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-03 14:38:55 -07:00
Borislav Petkov
838feb4754 x86, asm: Flip RESTORE_ARGS arguments logic
... thus getting rid of the "else" part of the conditional statement in
the macro.

No functionality change.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1306873314-32523-4-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-03 14:38:53 -07:00
Borislav Petkov
cac0e0a78f x86, asm: Flip SAVE_ARGS arguments logic
This saves us the else part of the conditional statement in the macro.

No functionality change.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1306873314-32523-3-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-03 14:38:51 -07:00
Borislav Petkov
a268fcfaa6 x86, asm: Thin down SAVE/RESTORE_* asm macros
Use dwarf2 cfi annotation macros, making SAVE/RESTORE_* marginally more
readable.

No functionality change.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1306873314-32523-2-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-03 14:38:49 -07:00
Dan Carpenter
f124c6ae59 xen: off by one errors in multicalls.c
b->args[] has MC_ARGS elements, so the comparison here should be
">=" instead of ">".  Otherwise we read past the end of the array
one space.

CC: stable@kernel.org
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2011-06-03 16:04:02 -04:00
Jack Steiner
6885685923 x66, UV: Enable 64-bit ACPI MFCG support for SGI UV2 platform
Enable 64-bit ACPI MFCG support for SGI UV2 platform. The check
is similar to the check on UV1. UV2 has a different oem_id
string.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Link: http://lkml.kernel.org/r/20110602195943.GA27079@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-03 16:00:03 +02:00
Márton Németh
6e33a852a3 x86/PCI/ACPI: fix type mismatch
The flags field of struct resource from linux/ioport.h is "unsigned
long". Change the "type" parameter of coalesce_windows() function to
match that field. This fixes the following warning messages when
compiling with "make C=1 W=1 bzImage modules":

arch/x86/pci/acpi.c: In function ‘coalesce_windows’:
arch/x86/pci/acpi.c:198: warning: conversion to ‘long unsigned int’ from ‘int’ may change the sign of the result
arch/x86/pci/acpi.c:203: warning: conversion to ‘long unsigned int’ from ‘int’ may change the sign of the result

Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-01 11:51:05 -07:00
Linus Torvalds
e12ca23d41 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  virtio_net: delay TX callbacks
  virtio: add api for delayed callbacks
  virtio_test: support event index
  vhost: support event index
  virtio_ring: support event idx feature
  virtio ring: inline function to check for events
  virtio: event index interface
  virtio: add full three-clause BSD text to headers.
  virtio balloon: kill tell-host-first logic
  virtio console: don't manually set or finalize VIRTIO_CONSOLE_F_MULTIPORT.
  drivers, block: virtio_blk: Replace cryptic number with the macro
  virtio_blk: allow re-reading config space at runtime
  lguest: remove support for VIRTIO_F_NOTIFY_ON_EMPTY.
  lguest: fix up compilation after move
  lguest: fix timer interrupt setup
2011-06-01 06:45:08 +09:00
Linus Torvalds
af0d6a0a3a Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix mwait_play_dead() faulting on mwait-incapable cpus
  x86 idle: Fix mwait deprecation warning message

Evil merge to remove extra quote noticed by Joe Perches
2011-06-01 02:07:22 +09:00
Linus Torvalds
643d2d7992 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Put back -pg to tsc.o and add no GCOV to vread_tsc_64.o
2011-05-31 20:32:54 +09:00
Robert Richter
cbf74cea07 oprofile, x86: Add comments to IBS LVT offset initialization
Adding a comment in the code as IBS LVT setup is not obvious at all ...

Signed-off-by: Robert Richter <robert.richter@amd.com>
2011-05-30 16:36:54 +02:00
Avi Kivity
4f3c125c74 x86: Fix mwait_play_dead() faulting on mwait-incapable cpus
A logic error in mwait_play_dead() causes the kernel to use
mwait even on cpus which don't support it, such as KVM virtual
cpus.

Introduced by:

  349c004e3d: x86: A fast way to check capabilities of the current cpu

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=36222
Reported-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/1306758237-9327-1-git-send-email-avi@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-30 14:37:54 +02:00
Borislav Petkov
598e887d8b x86 idle: Fix mwait deprecation warning message
Fix:

  arch/x86/kernel/process.c:645:1: warning: unknown escape sequence '\i'

due to missing escape backslash, introduced by this commit:

  5d4c47e019: x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param

Signed-off-by: Borislav Petkov <bp@alien8.de>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1306748286-24701-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-30 13:02:04 +02:00
Jack Steiner
55ba412028 x86, UV: Clean up uv_mmrs.h
No code changes. Reformat definitions to make it more readable.

I fixed alignment of comments in the structure definitions.

Also aligned comments and most field definitions & values. Also
sorted the defines for the SHIFT & MASK values for each MMR.
This make the file visually much more acceptable.

Some of the symbol names are still quite long. The file is based
on post-processing of verilog definitions that are used for the
node controller chip design. Although some symbol names are not
what I would chose, I would like to maintain compatibility with
the names used by the chip designers. We have a number of
cross-reference utilities & having common names is important.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Link: http://lkml.kernel.org/r/20110527145256.GA31224@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
--
 arch/x86/include/asm/uv/uv_mmrs.h | 2873 +++++++++++++++++++++-----------------
 1 file changed, 1600 insertions(+), 1273 deletions(-)
2011-05-30 11:08:48 +02:00
Rusty Russell
15517f7c21 lguest: fix timer interrupt setup
Without an IRQ chip set, we now get a WARN_ON and no timer interrupt.  This
prevents booting.

Fortunately, the fix is a one-liner: set up the timer IRQ like everything
else.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org # .39.x
2011-05-30 11:14:10 +09:30
Linus Torvalds
f310642123 Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
  x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param
  x86 idle: deprecate "no-hlt" cmdline param
  x86 idle APM: deprecate CONFIG_APM_CPU_IDLE
  x86 idle floppy: deprecate disable_hlt()
  x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it
  x86 idle: clarify AMD erratum 400 workaround
  idle governor: Avoid lock acquisition to read pm_qos before entering idle
  cpuidle: menu: fixed wrapping timers at 4.294 seconds
2011-05-29 11:18:09 -07:00
Len Brown
5d4c47e019 x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param
mwait_idle() is a C1-only idle loop intended to be more efficient
than HLT on SMP hardware that supports it.

But mwait_idle() has been replaced by the more general
mwait_idle_with_hints(), which handles both C1 and deeper C-states.
ACPI uses only mwait_idle_with_hints(), and never uses mwait_idle().

Deprecate mwait_idle() and the "idle=mwait" cmdline param
to simplify the x86 idle code.

After this change, kernels configured with
(!CONFIG_ACPI=n && !CONFIG_INTEL_IDLE=n) when run on hardware
that support MWAIT will simply use HLT.  If MWAIT is desired
on those systems, cpuidle and the cpuidle drivers above
can be used.

cc: x86@kernel.org
cc: stable@kernel.org # .39.x
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 03:39:17 -04:00
Len Brown
cdaab4a0d3 x86 idle: deprecate "no-hlt" cmdline param
We'd rather that modern machines not check if HLT works on
every entry into idle, for the benefit of machines that had
marginal electricals 15-years ago.  If those machines are still running
the upstream kernel, they can use "idle=poll".  The only difference
will be that they'll now invoke HLT in machine_hlt().

cc: x86@kernel.org # .39.x
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 03:39:16 -04:00
Len Brown
99c6322143 x86 idle APM: deprecate CONFIG_APM_CPU_IDLE
We don't want to export the pm_idle function pointer to modules.
Currently CONFIG_APM_CPU_IDLE w/ CONFIG_APM_MODULE forces us to.

CONFIG_APM_CPU_IDLE is of dubious value, it runs only on 32-bit
uniprocessor laptops that are over 10 years old.  It calls into
the BIOS during idle, and is known to cause a number of machines
to fail.

Removing CONFIG_APM_CPU_IDLE and will allow us to stop exporting
pm_idle.  Any systems that were calling into the APM BIOS
at run-time will simply use HLT instead.

cc: x86@kernel.org
cc: Jiri Kosina <jkosina@suse.cz>
cc: stable@kernel.org # .39.x
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 03:39:15 -04:00
Len Brown
06ae40ce07 x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it
In the long run, we don't want default_idle() or (pm_idle)() to
be exported outside of process.c.  Start by not exporting them
to modules, unless the APM build demands it.

cc: x86@kernel.org
cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 03:39:14 -04:00
Len Brown
02c68a0201 x86 idle: clarify AMD erratum 400 workaround
The workaround for AMD erratum 400 uses the term "c1e" falsely suggesting:
1. Intel C1E is somehow involved
2. All AMD processors with C1E are involved

Use the string "amd_c1e" instead of simply "c1e" to clarify that
this workaround is specific to AMD's version of C1E.
Use the string "e400" to clarify that the workaround is specific
to AMD processors with Erratum 400.

This patch is text-substitution only, with no functional change.

cc: x86@kernel.org
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 03:38:57 -04:00
Linus Torvalds
a947e23a8e Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, asm: Clean up desc.h a bit
  x86, amd: Do not enable ARAT feature on AMD processors below family 0x12
  x86: Move do_page_fault()'s error path under unlikely()
  x86, efi: Retain boot service code until after switching to virtual mode
  x86: Remove unnecessary check in detect_ht()
  x86: Reorder mm_context_t to remove x86_64 alignment padding and thus shrink mm_struct
  x86, UV: Clean up uv_tlb.c
  x86, UV: Add support for SGI UV2 hub chip
  x86, cpufeature: Update CPU feature RDRND to RDRAND
2011-05-28 12:57:01 -07:00
Linus Torvalds
c4a227d89f Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits)
  perf: Fix SIGIO handling
  perf top: Don't stop if no kernel symtab is found
  perf top: Handle kptr_restrict
  perf top: Remove unused macro
  perf events: initialize fd array to -1 instead of 0
  perf tools: Make sure kptr_restrict warnings fit 80 col terms
  perf tools: Fix build on older systems
  perf symbols: Handle /proc/sys/kernel/kptr_restrict
  perf: Remove duplicate headers
  ftrace: Add internal recursive checks
  tracing: Update btrfs's tracepoints to use u64 interface
  tracing: Add __print_symbolic_u64 to avoid warnings on 32bit machine
  ftrace: Set ops->flag to enabled even on static function tracing
  tracing: Have event with function tracer check error return
  ftrace: Have ftrace_startup() return failure code
  jump_label: Check entries limit in __jump_label_update
  ftrace/recordmcount: Avoid STT_FUNC symbols as base on ARM
  scripts/tags.sh: Add magic for trace-events for etags too
  scripts/tags.sh: Fix ctags for DEFINE_EVENT()
  x86/ftrace: Fix compiler warning in ftrace.c
  ...
2011-05-28 12:55:55 -07:00
Joe Perches
c4d017f213 x86: Convert vmalloc()+memset() to vzalloc()
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Jiri Kosina <trivial@kernel.org>
Link: http://lkml.kernel.org/r/10e35243fda0b8739c89ac32a7bdf348ec4752e1.1306603968.git.joe@perches.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-28 19:53:57 +02:00
Linus Torvalds
571503e100 Merge branch 'setns'
* setns:
  ns: Wire up the setns system call

Done as a merge to make it easier to fix up conflicts in arm due to
addition of sendmmsg system call
2011-05-28 10:51:01 -07:00
Eric W. Biederman
7b21fddd08 ns: Wire up the setns system call
32bit and 64bit on x86 are tested and working.  The rest I have looked
at closely and I can't find any problems.

setns is an easy system call to wire up.  It just takes two ints so I
don't expect any weird architecture porting problems.

While doing this I have noticed that we have some architectures that are
very slow to get new system calls.  cris seems to be the slowest where
the last system calls wired up were preadv and pwritev.  avr32 is weird
in that recvmmsg was wired up but never declared in unistd.h.  frv is
behind with perf_event_open being the last syscall wired up.  On h8300
the last system call wired up was epoll_wait.  On m32r the last system
call wired up was fallocate.  mn10300 has recvmmsg as the last system
call wired up.  The rest seem to at least have syncfs wired up which was
new in the 2.6.39.

v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com>
v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com>
v4: Moved wiring up of the system call to another patch
v5: ported to v2.6.39-rc6
v6: rebased onto parisc-next and net-next to avoid syscall  conflicts.
v7: ported to Linus's latest post 2.6.39 tree.

>  arch/blackfin/include/asm/unistd.h     |    3 ++-
>  arch/blackfin/mach-common/entry.S      |    1 +
Acked-by: Mike Frysinger <vapier@gentoo.org>

Oh - ia64 wiring looks good.
Acked-by: Tony Luck <tony.luck@intel.com>

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-28 10:48:39 -07:00
Steven Rostedt
89e1be50c6 x86: Put back -pg to tsc.o and add no GCOV to vread_tsc_64.o
The commit 44259b1abf
    Author: Andy Lutomirski <luto@MIT.EDU>
    x86-64: Move vread_tsc into a new file with sensible options

Removed the -pg from tsc.o which caused the function graph tracer
to go into an infinite function call recursion as it uses the tsc
internally outside its recursion protection, thus tracing the tsc
breaks the function graph tracer.

This commit also added the file vread_tsc_64.c that gets used
by vdso but failed to prevent GCOV from monkeying with it,
causing userspace to try to access kernel data when GCOV was
enabled.

Thanks to Thomas Gleixner for pointing out GCOV as the likely
culprit that added strange kernel accesses into the vread_tsc()
call.

Cc: Author: Andy Lutomirski <luto@MIT.EDU>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-27 23:47:16 -04:00
Linus Torvalds
f23a5e1405 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PM: Fix PM QOS's user mode interface to work with ASCII input
  PM / Hibernate: Update kerneldoc comments in hibernate.c
  PM / Hibernate: Remove arch_prepare_suspend()
  PM / Hibernate: Update some comments in core hibernate code
2011-05-27 14:27:34 -07:00
Ingo Molnar
d6a72fe465 Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent 2011-05-27 14:28:09 +02:00
Ingo Molnar
b1d2dc3c06 Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/urgent 2011-05-27 14:08:09 +02:00
Ingo Molnar
9a3865b185 x86, asm: Clean up desc.h a bit
I have looked at this file and found it rather ugly - improve
readability a bit. No change in functionality.

Link: http://lkml.kernel.org/n/tip-incpt6y26yd8586idx65t9ll@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-27 09:30:50 +02:00
Linus Torvalds
dc7acbb251 Merge branch 'upstream/tidy-xen-mmu-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'upstream/tidy-xen-mmu-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  xen: fix compile without CONFIG_XEN_DEBUG_FS
  Use arbitrary_virt_to_machine() to deal with ioremapped pud updates.
  Use arbitrary_virt_to_machine() to deal with ioremapped pmd updates.
  xen/mmu: remove all ad-hoc stats stuff
  xen: use normal virt_to_machine for ptes
  xen: make a pile of mmu pvop functions static
  vmalloc: remove vmalloc_sync_all() from alloc_vm_area()
  xen: condense everything onto xen_set_pte
  xen: use mmu_update for xen_set_pte_at()
  xen: drop all the special iomap pte paths.
2011-05-26 19:01:15 -07:00
Akinobu Mita
63e424c844 arch: remove CONFIG_GENERIC_FIND_{NEXT_BIT,BIT_LE,LAST_BIT}
By the previous style change, CONFIG_GENERIC_FIND_NEXT_BIT,
CONFIG_GENERIC_FIND_BIT_LE, and CONFIG_GENERIC_FIND_LAST_BIT are not used
to test for existence of find bitops anymore.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:38 -07:00
Mike Frysinger
63ab25ebbc kgdbts: unify/generalize gdb breakpoint adjustment
The Blackfin arch, like the x86 arch, needs to adjust the PC manually
after a breakpoint is hit as normally this is handled by the remote gdb.
However, rather than starting another arch ifdef mess, create a common
GDB_ADJUSTS_BREAK_OFFSET define for any arch to opt-in via their kgdb.h.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Dongdong Deng <dongdong.deng@windriver.com>
Cc: Sergei Shtylyov <sshtylyov@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:36 -07:00