Commit Graph

7345 Commits

Author SHA1 Message Date
Robert Richter
855357a217 perf, x86: Fix AMD family 15h FPU event constraints
Depending on the unit mask settings some FPU events may be scheduled
only on cpu counter #3. This patch fixes this.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@googlemail.com>
Link: http://lkml.kernel.org/r/1302913676-14352-3-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-19 10:07:55 +02:00
Andre Przywara
83112e688f perf, x86: Fix pre-defined cache-misses event for AMD family 15h cpus
With AMD cpu family 15h a unit mask was introduced for the Data Cache
Miss event (0x041/L1-dcache-load-misses). We need to enable bit 0
(first data cache miss or streaming store to a 64 B cache line) of
this mask to proper count data cache misses.

Now we set this bit for all families and models. In case a PMU does
not implement a unit mask for event 0x041 the bit is ignored.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1302913676-14352-2-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-19 10:07:54 +02:00
Linus Torvalds
8b9686ff4d Merge branches 'x86-fixes-for-linus', 'sched-fixes-for-linus', 'timers-fixes-for-linus', 'irq-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86-32, fpu: Fix FPU exception handling on non-SSE systems
  x86, hibernate: Initialize mmu_cr4_features during boot
  x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
  x86: visws: Fixup irq overhaul fallout

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Clean up rebalance_domains() load-balance interval calculation

* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86/mrst/vrtc: Fix boot crash in mrst_rtc_init()
  rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm()

* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Fix cpumask leak in __setup_irq()

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf probe: Fix listing incorrect line number with inline function
  perf probe: Fix to find recursively inlined function
  perf probe: Fix multiple --vars options behavior
  perf probe: Fix to remove redundant close
  perf probe: Fix to ensure function declared file
2011-04-07 12:12:58 -07:00
H. Peter Anvin
4da9484bde x86, hibernate: Initialize mmu_cr4_features during boot
Restore the initialization of mmu_cr4_features during boot, which was
removed without comment in checkin e5f15b45dd

x86: Cleanup highmap after brk is concluded

thereby breaking resume from hibernate.  This restores previous
functionality in approximately the same place, and corrects the
reading of %cr4 on pre-CPUID hardware (%cr4 exists if and only if
CPUID is supported.)

However, part of the problem is that the hibernate suspend/resume
sequence should manage the save/restore of %cr4 explicitly.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <201104020154.57136.rjw@sisk.pl>
2011-04-06 13:10:02 -07:00
Linus Torvalds
d7c764c4c7 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, UV: Fix kdump reboot
  x86, amd-nb: Rename CPU PCI id define for F4
  sound: Add delay.h to sound/soc/codecs/sn95031.c
  x86, mtrr, pat: Fix one cpu getting out of sync during resume
  x86, microcode: Unregister syscore_ops after microcode unloaded
  x86: Stop including <linux/delay.h> in two asm header files
2011-04-04 08:37:45 -07:00
Linus Torvalds
fb9a7d76da Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rcu: create new rcu_access_index() and use in mce
  WARN_ON_SMP(): Add comment to explain ({0;})
2011-04-04 08:36:15 -07:00
Paul E. McKenney
a4dd99250d rcu: create new rcu_access_index() and use in mce
The MCE subsystem needs to sample an RCU-protected index outside of
any protection for that index.  If this was a pointer, we would use
rcu_access_pointer(), but there is no corresponding rcu_access_index().
This commit therefore creates an rcu_access_index() and applies it
to MCE.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Zdenek Kabelac <zkabelac@redhat.com>
2011-04-01 07:27:31 -07:00
Cliff Wickman
818987e9a1 x86, UV: Fix kdump reboot
After a crash dump on an SGI Altix UV system the crash kernel
fails to cause a reboot.  EFI mode is disabled in the kdump
kernel, so only the reboot_type of BOOT_ACPI works.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: rja@sgi.com
LKML-Reference: <E1Q5Iuo-00013b-UK@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-31 18:44:03 +02:00
Borislav Petkov
cb6c8520f6 x86, amd-nb: Rename CPU PCI id define for F4
With increasing number of PCI function ids, add the PCI function
id in the define name instead of its symbolic name in the BKDG
for more clarity. This renames function 4 define.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
LKML-Reference: <20110330183447.GA3668@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-31 08:51:38 +02:00
Suresh Siddha
84ac7cdbdd x86, mtrr, pat: Fix one cpu getting out of sync during resume
On laptops with core i5/i7, there were reports that after resume
graphics workloads were performing poorly on a specific AP, while
the other cpu's were ok. This was observed on a 32bit kernel
specifically.

Debug showed that the PAT init was not happening on that AP
during resume and hence it contributing to the poor workload
performance on that cpu.

On this system, resume flow looked like this:

1. BP starts the resume sequence and we reinit BP's MTRR's/PAT
   early on using mtrr_bp_restore()

2. Resume sequence brings all AP's online

3. Resume sequence now kicks off the MTRR reinit on all the AP's.

4. For some reason, between point 2 and 3, we moved from BP
   to one of the AP's. My guess is that printk() during resume
   sequence is contributing to this. We don't see similar
   behavior with the 64bit kernel but there is no guarantee that
   at this point the remaining resume sequence (after AP's bringup)
   has to happen on BP.

5. set_mtrr() was assuming that we are still on BP and skipped the
   MTRR/PAT init on that cpu (because of 1 above)

6. But we were on an AP and this led to not reprogramming PAT
   on this cpu leading to bad performance.

Fix this by doing unconditional mtrr_if->set_all() in set_mtrr()
during MTRR/PAT init. This might be unnecessary if we are still
running on BP. But it is of no harm and will guarantee that after
resume, all the cpu's will be in sync with respect to the
MTRR/PAT registers.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1301438292-28370-1-git-send-email-eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Tested-by: Keith Packard <keithp@keithp.com>
Cc: stable@kernel.org	[v2.6.32+]
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-03-29 16:17:42 -07:00
Thomas Gleixner
86cc8dfc21 x86: apb_timer: Fixup genirq fallout
The lonely user of the internal interface was not in the coccinelle
script.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-30 00:13:30 +02:00
Xiaotian Feng
4ac5fc6a3e x86, microcode: Unregister syscore_ops after microcode unloaded
Currently, microcode doesn't unregister syscore_ops after it's
unloaded. So if we modprobe then rmmod microcode, the stale
microcode syscore_ops info will stay on syscore_ops_list.

Later when we're trying to reboot/halt/shutdown the machine, kernel
will panic on syscore_shutdown().

With the patch applied, I can reboot/halt/shutdown my machine successfully.

Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
LKML-Reference: <1301387672-23661-1-git-send-email-dfeng@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-29 11:12:04 +02:00
Jean Delvare
ca444564a9 x86: Stop including <linux/delay.h> in two asm header files
Stop including <linux/delay.h> in x86 header files which don't
need it. This will let the compiler complain when this header is
not included by source files when it should, so that
contributors can fix the problem before building on other
architectures starts to fail.

Credits go to Geert for the idea.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: James E.J. Bottomley <James.Bottomley@suse.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
LKML-Reference: <20110325152014.297890ec@endymion.delvare>
[ this also fixes an upstream build bug in drivers/media/rc/ite-cir.c ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-29 09:37:42 +02:00
Linus Torvalds
16c29dafcc Merge branch 'syscore' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'syscore' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  Introduce ARCH_NO_SYSDEV_OPS config option (v2)
  cpufreq: Use syscore_ops for boot CPU suspend/resume (v2)
  KVM: Use syscore_ops instead of sysdev class and sysdev
  PCI / Intel IOMMU: Use syscore_ops instead of sysdev class and sysdev
  timekeeping: Use syscore_ops instead of sysdev class and sysdev
  x86: Use syscore_ops instead of sysdev classes and sysdevs
2011-03-25 21:07:59 -07:00
Linus Torvalds
95e14ed7fc Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  kdb: add usage string of 'per_cpu' command
  kgdb,x86_64: fix compile warning found with sparse
  kdb: code cleanup to use macro instead of value
  kgdboc,kgdbts: strlen() doesn't count the terminator
2011-03-25 21:04:56 -07:00
Linus Torvalds
2a20b02c05 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue
  perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows
  perf symbols: Look at .dynsym again if .symtab not found
  perf build-id: Add quirk to deal with perf.data file format breakage
  perf session: Pass evsel in event_ops->sample()
  perf: Better fit max unprivileged mlock pages for tools needs
  perf_events: Fix stale ->cgrp pointer in update_cgrp_time_from_cpuctx()
  perf top: Fix uninitialized 'counter' variable
  tracing: Fix set_ftrace_filter probe function display
  perf, x86: Fix Intel fixed counters base initialization
2011-03-25 17:53:09 -07:00
Linus Torvalds
94df491c4a Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  futex: Fix WARN_ON() test for UP
  WARN_ON_SMP(): Allow use in if() statements on UP
  x86, dumpstack: Use %pB format specifier for stack trace
  vsprintf: Introduce %pB format specifier
  lockdep: Remove unused 'factor' variable from lockdep_stats_show()
2011-03-25 17:52:22 -07:00
Linus Torvalds
26ff6801f7 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: DT: Cleanup namespace and call irq_set_irq_type() unconditional
  x86: DT: Fix return condition in irq_create_of_mapping()
  x86, mpparse: Move check_slot into CONFIG_X86_IO_APIC context
2011-03-25 17:51:51 -07:00
Jason Wessel
21431c2900 kgdb,x86_64: fix compile warning found with sparse
Fix sparse warning:

arch/x86/kernel/kgdb.c:123:9: warning: switch with no cases

Reported-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2011-03-25 16:37:31 -05:00
Ingo Molnar
45daae575e perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue
Eric Dumazet reported that hardware PMU events do not work on his
system, due to the BIOS corrupting PMU state:

    Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only.
    [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c)

Linus suggested that we continue in the face of such BIOS-induced CPU
state corruption:

   http://lkml.org/lkml/2011/3/24/608

Such BIOSes will have to be fixed - Linux developers rely on a working and
fully capable PMU and the BIOS interfering with the CPU's PMU state is simply
not acceptable.

So this patch changes perf to continue when it detects such BIOS
interaction, some hardware events may be unreliable due to the BIOS
writing and re-writing them - there's not much the kernel can do
about that but to detect the corruption and report it.

Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-25 11:23:41 +01:00
Thomas Gleixner
07611dbda5 x86: DT: Cleanup namespace and call irq_set_irq_type() unconditional
That call escaped the name space cleanup. Fix it up.

We really want to call there. The chip might have changed since the
irq was setup initially. So let the core code and the chip decide what
to do. The status is just an unreliable snapshot.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2011-03-24 23:17:56 +01:00
Thomas Gleixner
00a30b254b x86: DT: Fix return condition in irq_create_of_mapping()
The xlate() function returns 0 or a negative error code. Returning the
error code blindly will be seen as an huge irq number by the calling
function because irq_create_of_mapping() returns an unsigned value.

Return 0 (NO_IRQ) as required.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2011-03-24 23:17:56 +01:00
Don Zickus
242214f9c1 perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows
The read of a proper MSR register was missed and instead of
counter the configration register was tested (it has
ARCH_P4_UNFLAGGED_BIT always cleared) leading to unknown NMI
hitting the system. As result the user may obtain "Dazed and
confused, but trying to continue" message. Fix it by reading a
proper MSR register.

When an NMI happens on a P4, the perf nmi handler checks the
configuration register to see if the overflow bit is set or not
before taking appropriate action.  Unfortunately, various P4
machines had a broken overflow bit, so a backup mechanism was
implemented.  This mechanism checked to see if the counter
rolled over or not.

A previous commit that implemented this backup mechanism was
broken. Instead of reading the counter register, it used the
configuration register to determine if the counter rolled over
or not. Reading that bit would give incorrect results.

This would lead to 'Dazed and confused' messages for the end
user when using the perf tool (or if the nmi watchdog is
running).

The fix is to read the counter register before determining if
the counter rolled over or not.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Lin Ming <ming.m.lin@intel.com>
LKML-Reference: <4D8BAB49.3080701@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-24 21:40:01 +01:00
Linus Torvalds
047f61c5d1 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (42 commits)
  ACPI: minor printk format change in acpi_pad
  ACPI: make acpi_pad /sys output more readable
  ACPICA: Update version to 20110316
  ACPICA: Header support for SLIC table
  ACPI: Make sure the FADT is at least rev 2 before using the reset register
  ACPI: Bug compatibility for Windows on the ACPI reboot vector
  ACPICA: Fix access width for reset vector
  ACPI battery: fribble sysfs files from a resume notifier
  ACPI button: remove unused procfs I/F
  ACPI, APEI, Add PCIe AER error information printing support
  PCIe, AER, use pre-generated prefix in error information printing
  ACPI, APEI, Add ERST record ID cache
  ACPI: Use syscore_ops instead of sysdev class and sysdev
  ACPI: Remove the unused EC sysdev class
  ACPI: use __cpuinit for the acpi_processor_set_pdc() call tree
  ACPI: use __init where possible in processor driver
  Thermal_Framework-Fix_crash_during_hwmon_unregister
  ACPICA: Update version to 20110211.
  ACPICA: Add mechanism to defer _REG methods for some installed handlers
  ACPICA: Add support for FunctionalFixedHW in acpi_ut_get_region_name
  ...
2011-03-24 08:25:15 -07:00
Namhyung Kim
71f9e59800 x86, dumpstack: Use %pB format specifier for stack trace
Improve noreturn function entries in call traces:

Before:

 Call Trace:
  [<ffffffff812a8502>] panic+0x8c/0x18d
  [<ffffffffa000012a>] deep01+0x0/0x38 [test_panic]  <--- bad
  [<ffffffff81104666>] proc_file_write+0x73/0x8d
  [<ffffffff811000b3>] proc_reg_write+0x8d/0xac
  [<ffffffff810c7d32>] vfs_write+0xa1/0xc5
  [<ffffffff810c7e0f>] sys_write+0x45/0x6c
  [<ffffffff8f02943b>] system_call_fastpath+0x16/0x1b

After:

 Call Trace:
  [<ffffffff812bce69>] panic+0x8c/0x18d
  [<ffffffffa000012a>] panic_write+0x20/0x20 [test_panic] <--- good
  [<ffffffff81115fab>] proc_file_write+0x73/0x8d
  [<ffffffff81111a5f>] proc_reg_write+0x8d/0xac
  [<ffffffff810d90ee>] vfs_write+0xa1/0xc5
  [<ffffffff810d91cb>] sys_write+0x45/0x6c
  [<ffffffff812c07fb>] system_call_fastpath+0x16/0x1b

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <1300934550-21394-2-git-send-email-namhyung@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-24 08:36:10 +01:00
Linus Torvalds
b81a618dcd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  deal with races in /proc/*/{syscall,stack,personality}
  proc: enable writing to /proc/pid/mem
  proc: make check_mem_permission() return an mm_struct on success
  proc: hold cred_guard_mutex in check_mem_permission()
  proc: disable mem_write after exec
  mm: implement access_remote_vm
  mm: factor out main logic of access_process_vm
  mm: use mm_struct to resolve gate vma's in __get_user_pages
  mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm
  mm: arch: make in_gate_area take an mm_struct instead of a task_struct
  mm: arch: make get_gate_vma take an mm_struct instead of a task_struct
  x86: mark associated mm when running a task in 32 bit compatibility mode
  x86: add context tag to mark mm when running a task in 32-bit compatibility mode
  auxv: require the target to be tracable (or yourself)
  close race in /proc/*/environ
  report errors in /proc/*/*map* sanely
  pagemap: close races with suid execve
  make sessionid permissions in /proc/*/task/* match those in /proc/*
  fix leaks in path_lookupat()

Fix up trivial conflicts in fs/proc/base.c
2011-03-23 20:51:42 -07:00
Olaf Hering
93a72052be crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn
The Xen PV drivers in a crashed HVM guest can not connect to the dom0
backend drivers because both frontend and backend drivers are still in
connected state.  To run the connection reset function only in case of a
crashdump, the is_kdump_kernel() function needs to be available for the PV
driver modules.

Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
usable for modules.

Leave 'elfcorehdr' as early_param().  This changes powerpc from __setup()
to early_param().  It adds an address range check from x86 also on ia64
and powerpc.

[akpm@linux-foundation.org: additional #includes]
[akpm@linux-foundation.org: remove elfcorehdr_addr export]
[akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:47:19 -07:00
Rafael J. Wysocki
f3c6ea1b06 x86: Use syscore_ops instead of sysdev classes and sysdevs
Some subsystems in the x86 tree need to carry out suspend/resume and
shutdown operations with one CPU on-line and interrupts disabled and
they define sysdev classes and sysdevs or sysdev drivers for this
purpose.  This leads to unnecessarily complicated code and excessive
memory usage, so switch them to using struct syscore_ops objects for
this purpose instead.

Generally, there are three categories of subsystems that use
sysdevs for implementing PM operations: (1) subsystems whose
suspend/resume callbacks ignore their arguments entirely (the
majority), (2) subsystems whose suspend/resume callbacks use their
struct sys_device argument, but don't really need to do that,
because they can be implemented differently in an arguably simpler
way (io_apic.c), and (3) subsystems whose suspend/resume callbacks
use their struct sys_device argument, but the value of that argument
is always the same and could be ignored (microcode_core.c).  In all
of these cases the subsystems in question may be readily converted to
using struct syscore_ops objects for power management and shutdown.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
2011-03-23 22:15:54 +01:00
Stephen Wilson
375906f876 x86: mark associated mm when running a task in 32 bit compatibility mode
This patch simply follows the same practice as for setting the TIF_IA32 flag.
In particular, an mm is marked as holding 32-bit tasks when a 32-bit binary is
exec'ed.  Both ELF and a.out formats are updated.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23 16:36:53 -04:00
Len Brown
02e2407858 Merge branch 'linus' into release
Conflicts:
	arch/x86/kernel/acpi/sleep.c

Signed-off-by: Len Brown <len.brown@intel.com>
2011-03-23 02:34:54 -04:00
Len Brown
5c129a8600 Merge commit 'v2.6.38' into release 2011-03-23 02:33:54 -04:00
David Rientjes
4061d68e1a x86: only compile 8237A if CONFIG_ISA_DMA_API is enabled
8237A utilizes the interface provided by CONFIG_ISA_DMA_API, specifically
claim_dma_lock() and release_dma_lock().  Thus, there's a strict
dependency on the config option and the module should only be loaded if
the kernel supports ISA-style DMA.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22 17:44:16 -07:00
Olaf Hering
d404ab0a11 move x86 specific oops=panic to generic code
The oops=panic cmdline option is not x86 specific, move it to generic code.
Update documentation.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22 17:44:11 -07:00
Linus Torvalds
73d5a8675f Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  xen: update mask_rw_pte after kernel page tables init changes
  xen: set max_pfn_mapped to the last pfn mapped
  x86: Cleanup highmap after brk is concluded

Fix up trivial onflict (added header file includes) in
arch/x86/mm/init_64.c
2011-03-22 10:41:36 -07:00
Rakib Mullick
cbb84c4cc1 x86, mpparse: Move check_slot into CONFIG_X86_IO_APIC context
When CONFIG_X86_MPPARSE=y and CONFIG_X86_IO_APIC=n, then we get
the following warning:

  arch/x86/kernel/mpparse.c:723: warning: 'check_slot' defined but not used

So, put check_slot into CONFIG_X86_IO_APIC context. Its only
called from CONFIG_X86_IO_APIC=y context.

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
LKML-Reference: <AANLkTinsUfGc=NG_GeH_B+zFVu+DXJzZbJKdQLscqfuH@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-22 13:00:22 +01:00
Len Brown
25076246e8 Merge branch 'apei-release' into release 2011-03-22 01:41:47 -04:00
Huang Ying
885b976fad ACPI, APEI, Add ERST record ID cache
APEI ERST firmware interface and implementation has no multiple users
in mind.  For example, if there is four records in storage with ID: 1,
2, 3 and 4, if two ERST readers enumerate the records via
GET_NEXT_RECORD_ID as follow,

reader 1		reader 2
1
			2
3
			4
-1
			-1

where -1 signals there is no more record ID.

Reader 1 has no chance to check record 2 and 4, while reader 2 has no
chance to check record 1 and 3.  And any other GET_NEXT_RECORD_ID will
return -1, that is, other readers will has no chance to check any
record even they are not cleared by anyone.

This makes raw GET_NEXT_RECORD_ID not suitable for used by multiple
users.

To solve the issue, an in-memory ERST record ID cache is designed and
implemented.  When enumerating record ID, the ID returned by
GET_NEXT_RECORD_ID is added into cache in addition to be returned to
caller.  So other readers can check the cache to get all record ID
available.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-03-21 22:59:06 -04:00
Sage Weil
b7ed78f565 introduce sys_syncfs to sync a single file system
It is frequently useful to sync a single file system, instead of all
mounted file systems via sync(2):

 - On machines with many mounts, it is not at all uncommon for some of
   them to hang (e.g. unresponsive NFS server).  sync(2) will get stuck on
   those and may never get to the one you do care about (e.g., /).
 - Some applications write lots of data to the file system and then
   want to make sure it is flushed to disk.  Calling fsync(2) on each
   file introduces unnecessary ordering constraints that result in a large
   amount of sub-optimal writeback/flush/commit behavior by the file
   system.

There are currently two ways (that I know of) to sync a single super_block:

 - BLKFLSBUF ioctl on the block device: That also invalidates the bdev
   mapping, which isn't usually desirable, and doesn't work for non-block
   file systems.
 - 'mount -o remount,rw' will call sync_filesystem as an artifact of the
   current implemention.  Relying on this little-known side effect for
   something like data safety sounds foolish.

Both of these approaches require root privileges, which some applications
do not have (nor should they need?) given that sync(2) is an unprivileged
operation.

This patch introduces a new system call syncfs(2) that takes an fd and
syncs only the file system it references.  Maybe someday we can

 $ sync /some/path

and not get

 sync: ignoring all arguments

The syscall is motivated by comments by Al and Christoph at the last LSF.
syncfs(2) seems like an appropriate name given statfs(2).

A similar ioctl was also proposed a while back, see
	http://marc.info/?l=linux-fsdevel&m=127970513829285&w=2

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-21 00:40:29 -04:00
Yinghai Lu
e5f15b45dd x86: Cleanup highmap after brk is concluded
Now cleanup_highmap actually is in two steps: one is early in head64.c
and only clears above _end; a second one is in init_memory_mapping() and
tries to clean from _brk_end to _end.
It should check if those boundaries are PMD_SIZE aligned but currently
does not.
Also init_memory_mapping() is called several times for numa or memory
hotplug, so we really should not handle initial kernel mappings there.

This patch moves cleanup_highmap() down after _brk_end is settled so
we can do everything in one step.
Also we honor max_pfn_mapped in the implementation of cleanup_highmap.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2011-03-19 11:58:19 -07:00
Stephane Eranian
fc66c5210e perf, x86: Fix Intel fixed counters base initialization
The following patch solves the problems introduced by Robert's
commit 41bf498 and reported by Arun Sharma. This commit gets rid
of the base + index notation for reading and writing PMU msrs.

The problem is that for fixed counters, the new calculation for
the base did not take into account the fixed counter indexes,
thus all fixed counters were read/written from fixed counter 0.
Although all fixed counters share the same config MSR, they each
have their own counter register.

Without:

 $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds

  242202299 unhalted_core_cycles (0.00% scaling, ena=1000790892, run=1000790892)
 2389685946 instructions_retired (0.00% scaling, ena=1000790892, run=1000790892)
      49473 baclears             (0.00% scaling, ena=1000790892, run=1000790892)

With:

 $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds

 2392703238 unhalted_core_cycles (0.00% scaling, ena=1000840809, run=1000840809)
 2389793744 instructions_retired (0.00% scaling, ena=1000840809, run=1000840809)
      47863 baclears             (0.00% scaling, ena=1000840809, run=1000840809)

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ming.m.lin@intel.com
Cc: robert.richter@amd.com
Cc: asharma@fb.com
Cc: perfmon2-devel@lists.sf.net
LKML-Reference: <20110319172005.GB4978@quad>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-19 19:00:49 +01:00
Len Brown
05534c9ffc Merge branch 'acpica' into release 2011-03-18 18:06:08 -04:00
Linus Torvalds
f2e1fbb5f2 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Flush TLB if PGD entry is changed in i386 PAE mode
  x86, dumpstack: Correct stack dump info when frame pointer is available
  x86: Clean up csum-copy_64.S a bit
  x86: Fix common misspellings
  x86: Fix misspelling and align params
  x86: Use PentiumPro-optimized partial_csum() on VIA C7
2011-03-18 10:45:21 -07:00
Linus Torvalds
619297855a Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits)
  trace, filters: Initialize the match variable in process_ops() properly
  trace, documentation: Fix branch profiling location in debugfs
  oprofile, s390: Cleanups
  oprofile, s390: Remove hwsampler_files.c and merge it into init.c
  perf: Fix tear-down of inherited group events
  perf: Reorder & optimize perf_event_context to remove alignment padding on 64 bit builds
  perf: Handle stopped state with tracepoints
  perf: Fix the software events state check
  perf, powerpc: Handle events that raise an exception without overflowing
  perf, x86: Use INTEL_*_CONSTRAINT() for all PEBS event constraints
  perf, x86: Clean up SandyBridge PEBS events
  perf lock: Fix sorting by wait_min
  perf tools: Version incorrect with some versions of grep
  perf evlist: New command to list the names of events present in a perf.data file
  perf script: Add support for H/W and S/W events
  perf script: Add support for dumping symbols
  perf script: Support custom field selection for output
  perf script: Move printing of 'common' data from print_event and rename
  perf tracing: Remove print_graph_cpu and print_graph_proc from trace-event-parse
  perf script: Change process_event prototype
  ...
2011-03-18 10:38:34 -07:00
Linus Torvalds
e16b396ce3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits)
  doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore
  Update cpuset info & webiste for cgroups
  dcdbas: force SMI to happen when expected
  arch/arm/Kconfig: remove one to many l's in the word.
  asm-generic/user.h: Fix spelling in comment
  drm: fix printk typo 'sracth'
  Remove one to many n's in a word
  Documentation/filesystems/romfs.txt: fixing link to genromfs
  drivers:scsi Change printk typo initate -> initiate
  serial, pch uart: Remove duplicate inclusion of linux/pci.h header
  fs/eventpoll.c: fix spelling
  mm: Fix out-of-date comments which refers non-existent functions
  drm: Fix printk typo 'failled'
  coh901318.c: Change initate to initiate.
  mbox-db5500.c Change initate to initiate.
  edac: correct i82975x error-info reported
  edac: correct i82975x mci initialisation
  edac: correct commented info
  fs: update comments to point correct document
  target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c
  ...

Trivial conflict in fs/eventpoll.c (spelling vs addition)
2011-03-18 10:37:40 -07:00
Namhyung Kim
e8e999cf3c x86, dumpstack: Correct stack dump info when frame pointer is available
Current stack dump code scans entire stack and check each entry
contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it
could mark whether the pointer is valid or not based on value of
the frame pointer. Invalid entries could be preceded by '?' sign.

However this was not going to happen because scan start point
was always higher than the frame pointer so that they could not
meet.

Commit 9c0729dc80 ("x86: Eliminate bp argument from the stack
tracing routines") delayed bp acquisition point, so the bp was
read in lower frame, thus all of the entries were marked
invalid.

This patch fixes this by reverting above commit while retaining
stack_frame() helper as suggested by Frederic Weisbecker.

End result looks like below:

before:

 [    3.508329] Call Trace:
 [    3.508551]  [<ffffffff814f35c9>] ? panic+0x91/0x199
 [    3.508662]  [<ffffffff814f3739>] ? printk+0x68/0x6a
 [    3.508770]  [<ffffffff81a981b2>] ? mount_block_root+0x257/0x26e
 [    3.508876]  [<ffffffff81a9821f>] ? mount_root+0x56/0x5a
 [    3.508975]  [<ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9
 [    3.509216]  [<ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2
 [    3.509335]  [<ffffffff81003894>] ? kernel_thread_helper+0x4/0x10
 [    3.509442]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
 [    3.509542]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
 [    3.509641]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10

after:

 [    3.522991] Call Trace:
 [    3.523351]  [<ffffffff814f35b9>] panic+0x91/0x199
 [    3.523468]  [<ffffffff814f3729>] ? printk+0x68/0x6a
 [    3.523576]  [<ffffffff81a981b2>] mount_block_root+0x257/0x26e
 [    3.523681]  [<ffffffff81a9821f>] mount_root+0x56/0x5a
 [    3.523780]  [<ffffffff81a98393>] prepare_namespace+0x170/0x1a9
 [    3.523885]  [<ffffffff81a9772b>] kernel_init+0x1d2/0x1e2
 [    3.523987]  [<ffffffff81003894>] kernel_thread_helper+0x4/0x10
 [    3.524228]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
 [    3.524345]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
 [    3.524445]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10

 -v5:
   * fix build breakage with oprofile

 -v4:
   * use 0 instead of regs->bp
   * separate out printk changes

 -v3:
   * apply comment from Frederic
   * add a couple of printk fixes

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Soren Sandmann <ssp@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Robert Richter <robert.richter@amd.com>
LKML-Reference: <1300416006-3163-1-git-send-email-namhyung@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-18 10:51:42 +01:00
Lucas De Marchi
0d2eb44f63 x86: Fix common misspellings
They were generated by 'codespell' and then manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: trivial@kernel.org
LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-18 10:39:30 +01:00
Ingo Molnar
8dd8997d2c Merge branch 'linus' into x86/urgent
Merge reason: Merge upstream commits to avoid conflicts in upcoming patches.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-18 10:39:00 +01:00
Linus Torvalds
ec0afc9311 Merge branch 'kvm-updates/2.6.39' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.39' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (55 commits)
  KVM: unbreak userspace that does not sets tss address
  KVM: MMU: cleanup pte write path
  KVM: MMU: introduce a common function to get no-dirty-logged slot
  KVM: fix rcu usage in init_rmode_* functions
  KVM: fix kvmclock regression due to missing clock update
  KVM: emulator: Fix permission checking in io permission bitmap
  KVM: emulator: Fix io permission checking for 64bit guest
  KVM: SVM: Load %gs earlier if CONFIG_X86_32_LAZY_GS=n
  KVM: x86: Remove useless regs_page pointer from kvm_lapic
  KVM: improve comment on rcu use in irqfd_deassign
  KVM: MMU: remove unused macros
  KVM: MMU: cleanup page alloc and free
  KVM: MMU: do not record gfn in kvm_mmu_pte_write
  KVM: MMU: move mmu pages calculated out of mmu lock
  KVM: MMU: set spte accessed bit properly
  KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits
  KVM: Start lock documentation
  KVM: better readability of efer_reserved_bits
  KVM: Clear async page fault hash after switching to real mode
  KVM: VMX: Initialize vm86 TSS only once.
  ...
2011-03-17 18:40:35 -07:00
Linus Torvalds
61ef46fd45 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] pcc-cpufreq: remove duplicate statements
  [CPUFREQ] Remove the pm_message_t argument from driver suspend
  [CPUFREQ] Remove unneeded locks
  [CPUFREQ] Remove old, deprecated per cpu ondemand/conservative sysfs files
  [CPUFREQ] Remove deprecated sysfs file sampling_rate_max
  [CPUFREQ] powernow-k8: The table index is not worth displaying
  [CPUFREQ] calculate delay after dbs_check_cpu
  [CPUFREQ] Add documentation for sampling_down_factor
  [CPUFREQ] drivers/cpufreq: Remove unnecessary semicolons
2011-03-17 17:42:14 -07:00
Linus Torvalds
978ca164bd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (38 commits)
  amd64_edac: Fix decode_syndrome types
  amd64_edac: Fix DCT argument type
  amd64_edac: Fix ranges signedness
  amd64_edac: Drop local variable
  amd64_edac: Fix PCI config addressing types
  amd64_edac: Fix DRAM base macros
  amd64_edac: Fix node id signedness
  amd64_edac: Drop redundant declarations
  amd64_edac: Enable driver on F15h
  amd64_edac: Adjust ECC symbol size to F15h
  amd64_edac: Simplify scrubrate setting
  PCI: Rename CPU PCI id define
  amd64_edac: Improve DRAM address mapping
  amd64_edac: Sanitize ->read_dram_ctl_register
  amd64_edac: Adjust sys_addr to chip select conversion routine to F15h
  amd64_edac: Beef up early exit reporting
  amd64_edac: Revamp online spare handling
  amd64_edac: Fix channel interleave removal
  amd64_edac: Correct node interleaving removal
  amd64_edac: Add support for interleaved region swapping
  ...

Fix up trivial conflict in include/linux/pci_ids.h due to
AMD_15H_NB_MISC being renamed as AMD_15H_NB_F3 next to the new
AMD_15H_NB_LINK entry.
2011-03-17 17:21:32 -07:00