Commit Graph

8573 Commits

Author SHA1 Message Date
Ken'ichi Ohmichi
fd59d231f8 Add vmcoreinfo
This patch set frees the restriction that makedumpfile users should install a
vmlinux file (including the debugging information) into each system.

makedumpfile command is the dump filtering feature for kdump.  It creates a
small dumpfile by filtering unnecessary pages for the analysis.  To
distinguish unnecessary pages, it needs a vmlinux file including the debugging
information.  These days, the debugging package becomes a huge file, and it is
hard to install it into each system.

To solve the problem, kdump developers discussed it at lkml and kexec-ml.  As
the result, we reached the conclusion that necessary information for dump
filtering (called "vmcoreinfo") should be embedded into the first kernel file
and it should be accessed through /proc/vmcore during the second kernel.
(http://www.uwsg.iu.edu/hypermail/linux/kernel/0707.0/1806.html)

Dan Aloni created the patch set for the above implementation.
(http://www.uwsg.iu.edu/hypermail/linux/kernel/0707.1/1053.html)

And I updated it for multi architectures and memory models.
(http://lists.infradead.org/pipermail/kexec/2007-August/000479.html)

Signed-off-by: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:54 -07:00
Paul E. McKenney
3806204ca9 Remove workaround for unimmunized rcu_dereference from mce_log()
Remove the rmb() from mce_log(), since the immunized version of
rcu_dereference() makes it unnecessary.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:46 -07:00
H. Peter Anvin
30c826451d [x86] remove uses of magic macros for boot_params access
Instead of using magic macros for boot_params access, simply use the
boot_params structure.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2007-10-16 17:38:31 -07:00
Jeremy Fitzhardinge
8965c1c095 paravirt: clean up lazy mode handling
Currently, the set_lazy_mode pv_op is overloaded with 5 functions:
 1. enter lazy cpu mode
 2. leave lazy cpu mode
 3. enter lazy mmu mode
 4. leave lazy mmu mode
 5. flush pending batched operations

This complicates each paravirt backend, since it needs to deal with
all the possible state transitions, handling flushing, etc. In
particular, flushing is quite distinct from the other 4 functions, and
seems to just cause complication.

This patch removes the set_lazy_mode operation, and adds "enter" and
"leave" lazy mode operations on mmu_ops and cpu_ops.  All the logic
associated with enter and leaving lazy states is now in common code
(basically BUG_ONs to make sure that no mode is current when entering
a lazy mode, and make sure that the mode is current when leaving).
Also, flush is handled in a common way, by simply leaving and
re-entering the lazy mode.

The result is that the Xen, lguest and VMI lazy mode implementations
are much simpler.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Zach Amsden <zach@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Anthony Liguory <aliguori@us.ibm.com>
Cc: "Glauber de Oliveira Costa" <glommer@gmail.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
2007-10-16 11:51:29 -07:00
Jeremy Fitzhardinge
93b1eab3d2 paravirt: refactor struct paravirt_ops into smaller pv_*_ops
This patch refactors the paravirt_ops structure into groups of
functionally related ops:

pv_info - random info, rather than function entrypoints
pv_init_ops - functions used at boot time (some for module_init too)
pv_misc_ops - lazy mode, which didn't fit well anywhere else
pv_time_ops - time-related functions
pv_cpu_ops - various privileged instruction ops
pv_irq_ops - operations for managing interrupt state
pv_apic_ops - APIC operations
pv_mmu_ops - operations for managing pagetables

There are several motivations for this:

1. Some of these ops will be general to all x86, and some will be
   i386/x86-64 specific.  This makes it easier to share common stuff
   while allowing separate implementations where needed.

2. At the moment we must export all of paravirt_ops, but modules only
   need selected parts of it.  This allows us to export on a case by case
   basis (and also choose which export license we want to apply).

3. Functional groupings make things a bit more readable.

Struct paravirt_ops is now only used as a template to generate
patch-site identifiers, and to extract function pointers for inserting
into jmp/calls when patching.  It is only instantiated when needed.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Cc: Zach Amsden <zach@vmware.com>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Anthony Liguory <aliguori@us.ibm.com>
Cc: "Glauber de Oliveira Costa" <glommer@gmail.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
2007-10-16 11:51:29 -07:00
Linus Torvalds
92d15c2ccb Merge branch 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block: (63 commits)
  Fix memory leak in dm-crypt
  SPARC64: sg chaining support
  SPARC: sg chaining support
  PPC: sg chaining support
  PS3: sg chaining support
  IA64: sg chaining support
  x86-64: enable sg chaining
  x86-64: update pci-gart iommu to sg helpers
  x86-64: update nommu to sg helpers
  x86-64: update calgary iommu to sg helpers
  swiotlb: sg chaining support
  i386: enable sg chaining
  i386 dma_map_sg: convert to using sg helpers
  mmc: need to zero sglist on init
  Panic in blk_rq_map_sg() from CCISS driver
  remove sglist_len
  remove blk_queue_max_phys_segments in libata
  revert sg segment size ifdefs
  Fixup u14-34f ENABLE_SG_CHAINING
  qla1280: enable use_sg_chaining option
  ...
2007-10-16 10:09:16 -07:00
Masami Hiramatsu
f438d914b2 kprobes: support kretprobe blacklist
Introduce architecture dependent kretprobe blacklists to prohibit users
from inserting return probes on the function in which kprobes can be
inserted but kretprobes can not.

This patch also removes "__kprobes" mark from "__switch_to" on x86_64 and
registers "__switch_to" to the blacklist on x86-64, because that mark is to
prohibit user from inserting only kretprobe.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:43:10 -07:00
Christoph Hellwig
74a0b57627 x86: optimize page faults like all other achitectures and kill notifier cruft
x86(-64) are the last architectures still using the page fault notifier
cruft for the kprobes page fault hook.  This patch converts them to the
proper direct calls, and removes the now unused pagefault notifier bits
aswell as the cruft in kprobes.c that was related to this mess.

I know Andi didn't really like this, but all other architecture maintainers
agreed the direct calls are much better and besides the obvious cruft
removal a common way of dealing with kprobes across architectures is
important aswell.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix sparc64]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Andi Kleen <ak@suse.de>
Cc: <linux-arch@vger.kernel.org>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:50 -07:00
Mike Travis
d5a7430ddc Convert cpu_sibling_map to be a per cpu variable
Convert cpu_sibling_map from a static array sized by NR_CPUS to a per_cpu
variable.  This saves sizeof(cpumask_t) * NR unused cpus.  Access is mostly
from startup and CPU HOTPLUG functions.

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:50 -07:00
Mike Travis
0835761129 x86: Convert cpu_core_map to be a per cpu variable
This is from an earlier message from 'Christoph Lameter':

    cpu_core_map is currently an array defined using NR_CPUS. This means that
    we overallocate since we will rarely really use maximum configured cpu.

    If we put the cpu_core_map into the per cpu area then it will be allocated
    for each processor as it comes online.

    This means that the core map cannot be accessed until the per cpu area
    has been allocated. Xen does a weird thing here looping over all processors
    and zeroing the masks that are not yet allocated and that will be zeroed
    when they are allocated. I commented the code out.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:50 -07:00
Alexey Dobriyan
1bcf548293 Consolidate PTRACE_DETACH
Identical handlers of PTRACE_DETACH go into ptrace_request().
Not touching compat code.
Not touching archs that don't call ptrace_request.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:49 -07:00
Jens Axboe
9ee1bea469 x86-64: update pci-gart iommu to sg helpers
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 11:26:02 +02:00
Jens Axboe
b922f53bd0 x86-64: update nommu to sg helpers
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 11:26:02 +02:00
Jens Axboe
8b87d9f48a x86-64: update calgary iommu to sg helpers
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 11:26:02 +02:00
Linus Torvalds
6a84258e5f Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (37 commits)
  PCI: merge almost all of pci_32.h and pci_64.h together
  PCI: X86: Introduce and enable PCI domain support
  PCI: Add 'nodomains' boot option, and pci_domains_supported global
  PCI: modify PCI bridge control ISA flag for clarity
  PCI: use _CRS for PCI resource allocation
  PCI: avoid P2P prefetch window for expansion ROMs
  PCI: skip ISA ioresource alignment on some systems
  PCI: remove transparent bridge sizing
  pci: write file size to inode on proc bus file write
  pci: use size stored in proc_dir_entry for proc bus files
  pci: implement "pci=noaer"
  PCI: fix IDE legacy mode resources
  MSI: Use correct data offset for 32-bit MSI in read_msi_msg()
  PCI: Fix incorrect argument order to list_add_tail() in PCI dynamic ID code
  PCI: i386: Compaq EVO N800c needs PCI bus renumbering
  PCI: Remove no longer correct documentation regarding MSI vector assignment
  PCI: re-enable onboard sound on "MSI K8T Neo2-FIR"
  PCI: quirk_vt82c586_acpi: Omit reading PCI revision ID
  PCI: quirk amd_8131_mmrbc: Omit reading pci revision ID
  cpqphp: Use PCI_CLASS_REVISION instead of PCI_REVISION_ID for read
  ...
2007-10-12 15:50:23 -07:00
Linus Torvalds
4d5709a7b7 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Don't take semaphore in cpufreq_quick_get()
  [CPUFREQ] Support different families in fid/did to frequency conversion
  [CPUFREQ] cpufreq_stats: misc cpuinit section annotations
  [CPUFREQ] implement !CONFIG_CPU_FREQ stub for  cpufreq_unregister_notifier()
  [CPUFREQ] mark hotplug notifier callback as __cpuinit
  [CPUFREQ] Only check for transition latency on problematic governors (kconfig fix)
  [CPUFREQ] allow ondemand and conservative cpufreq governors to be used as default
  [CPUFREQ] move policy's governor initialisation out of low-level drivers into cpufreq core
  [CPUFREQ] Longhaul - Add support for PM133 northbridge
  [CPUFREQ] x86: use num_online_nodes to get physical cpus numbers for
2007-10-12 15:42:01 -07:00
Denis V. Lunev
3d034aecd8 PCI: pci_get_device call from interrupt in reboot fixups
The following calltrace is possible now:
 handle_sysrq
   machine_emergency_restart
     mach_reboot_fixups
       pci_get_device
         pci_get_subsys
	   down_read
The patch skips reboot fixup if called from sysrq-B code.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:15 -07:00
David Brownell
aa24886e37 dma_free_coherent() needs irqs enabled (sigh)
On at least ARM (and I'm told MIPS too) dma_free_coherent() has a newish
call context requirement: unlike its dma_alloc_coherent() sibling, it may
not be called with IRQs disabled.  (This was new behavior on ARM as of late
2005, caused by ARM SMP updates.) This little surprise can be annoyingly
driver-visible.

Since it looks like that restriction won't be removed, this patch changes
the definition of the API to include that requirement.  Also, to help catch
nonportable drivers, it updates the x86 and swiotlb versions to include the
relevant warnings.  (I already observed that it trips on the
bus_reset_tasklet of the new firewire_ohci driver.)

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: David Miller <davem@davemloft.net>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:15 -07:00
Venki Pallipadi
ed6fb174ee x86: HPET add another ICH7 PCI id
Add another PCI ID for ICH7 force hpet.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:24 +02:00
Venki Pallipadi
32a2da64c2 x86: HPET force enable ICH5 suspend/resume fix
A bugfix in ich5 hpet force detect which caused resumes to fail.  Thanks to
Udo A Steinberg for reporting the problem.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-12 23:04:24 +02:00
Venki Pallipadi
bfe0c1cc64 x86: HPET force enable for ICH5
force_enable hpet for ICH5.

[ Build fixes from Andrew Morton ]

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-12 23:04:24 +02:00
Venki Pallipadi
59c69f2a51 x86: HPET try to activate force detected hpet
Enable HPET later during boot, after the force detect in PCI quirks.  Also add
a call to repeat the force enabling at resume time.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-12 23:04:23 +02:00
Venki Pallipadi
d54bd57d65 x86: HPET force enable o ICH7 and later
Force detect and/or enable HPET on ICH chipsets.  This patch just handles the
detection part and following patches use this information.  Adds a function to
repeat the force enabling during resume time.

Using HPET this way, instead of PIT increases the time CPUs can reside in
C-state when system is totally idle.  On my test system with Core 2 Duo,
average C-state residency goes up from ~20mS to ~80mS.

[ Build fixed from Andrew Morton ]

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-12 23:04:23 +02:00
Venki Pallipadi
610bf2f143 x86: HPET restructure hpet code for hpet force enable
Restructure and rename legacy replacement mode HPET timer support.  Just the
code structural changes and should be zero functionality change.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-12 23:04:23 +02:00
Chris Wright
31c435d75e i386/x8664: cleanup the shared hpet code
Remove hpet_readl/writel from vsyscall.h, where it does not belong
anyway. Use the hpet code itself.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:23 +02:00
Chris Wright
bc1d99c1de x86_64: cleanup apic.c after clock events switch
Make variables static.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-12 23:04:23 +02:00
Thomas Gleixner
9f75e9b74a x86_64: remove now unused code
Remove the unused code after the switch to clock events.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:23 +02:00
Thomas Gleixner
2f0798a3b1 x86: unify timex.h variants
Combine the timex.h variants and move the TSC related code into tsc.h.
Move the set_cyc2ns_scale() call into the tsc calibraction code, where
it belongs.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:23 +02:00
Thomas Gleixner
5d5a2989b7 x86: kill 8253pit.h
Useless header file with 32 bit and 64 bit variants. Move the
single useful line to the place where it is used.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:23 +02:00
Thomas Gleixner
fb79d22e1d x86: disable apic timer for AMD C1E enabled CPUs
AMDs C1E enabled CPUs stop the local apic timer, when both cores are
idle. This is a hardware feature which breaks highres/dynticks.
Add the same quirk as we have for 32 bit already.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:07 +02:00
Thomas Gleixner
4e77ae3e10 x86: Fix irq0 / local apic timer accounting
The clock events merge introduced a change to the nmi watchdog code to
handle the not longer increasing local apic timer count in the
broadcast mode. This is fine for UP, but on SMP it pampers over a
stuck CPU which is not handling the broadcast interrupt due to the
unconditional sum up of local apic timer count and irq0 count.

To cover all cases we need to keep track on which CPU irq0 is
handled. In theory this is CPU#0 due to the explicit disabling of irq
balancing for irq0, but there are systems which ignore this on the
hardware level. The per cpu irq0 accounting allows us to remove the
irq0 to CPU0 binding as well.

Add a per cpu counter for irq0 and evaluate this instead of the global
irq0 count in the nmi watchdog code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:07 +02:00
Thomas Gleixner
b8ce335906 x86_64: convert to clock events
Finally switch to the clockevents code. Share code with i386 for
hpet and PIT.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:07 +02:00
Thomas Gleixner
ba7eda4c60 x86_64: Add (not yet used) clock event functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:07 +02:00
Chris Wright
0229068334 x86_64: prepare idle loop for dynamic ticks
Add tick_nohz_{stop,restart}_sched_tick to idle loop in prepartion for turning
on dynticks.  These are just noops until NO_HZ is enabled.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:07 +02:00
Thomas Gleixner
c4d58cbd15 x86_64: remove nested irq disables
setup_APIC_timer disables interrupts anyway. So no need to do the same
in setup_boot_APIC_clock and setup_secondary_APIC_clock. Disable
interrupts explicit in the calibration code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:07 +02:00
Thomas Gleixner
abc63fcd3c x86_64: apic change setup_APIC_timer calling convention
setup_APIC_timer takes the file global calibration result as an argument.
Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:07 +02:00
Thomas Gleixner
b58eb00df7 x86_64: Remove APIC_DIVISOR
APIC_DIVISOR is rather useless. It makes the calibration result more
accurate in the first place, but we discard this later when we write
the value to the APIC timer by dividing the calibration value by
APIC_DIVISOR.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:06 +02:00
Thomas Gleixner
d03030e917 x86_64: Move apic calibration code around
Let the calibration code fill in calibration_result directly and
move the variable on top of the file.

Fixup a printk w/o log level while at it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:06 +02:00
Thomas Gleixner
500ff08b76 x86_64: remove pit synchronization
The APIC timer setup code synchronizes the local APIC timer to the
PIT/HPET. This is pointless as the PIT and the local APIC timer
frequency are not correlated and the APIC timer calibration can never
be accurate enough to avoid that the local APIC timer and the PIT/HPET
drift apart.

Simply remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:06 +02:00
Thomas Gleixner
801740971a x86_64: prepare apic code for clock events
Change __setup_APIC_LVTT so it takes the arguments which are necessary
for the later clock events switch.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:06 +02:00
Yinghai Lu
7ffeeb1e03 x86: remove never used apic_mapped
[ tglx: arch/x86 adaptation ]

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-12 23:04:06 +02:00
Thomas Gleixner
0190dae54d i386: prepare sharing the PIT code
PIT clock events work already and the PIT handling is the same for
i386 and x86_64. x86_64 does not support PIT as a clock source, so
disable the PIT clocksource for x86_64.

Use the i386 i8253.h include file for x86_64 as well to share the
exports and the PIT constants.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:06 +02:00
Thomas Gleixner
f5e0e93faf i386: prepare sharing the PIT code
PIT clock events work already and the PIT handling is the same for
i386 and x86_64. x86_64 does not support PIT as a clock source, so
disable the PIT clocksource for x86_64.

Prepare i8253.h to be shared with x8664

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:06 +02:00
Thomas Gleixner
28769149c2 i386: prepare sharing the hpet code with x86_64
Add the x8664 specific bits (mapping) to share the hpet code later.

Move the reserve_platform_timer call to late init. This is necessary
for x86_64, as hpet enable() is called before memory is setup. i386
calls it in late_time_init, but it does not hurt to do it later for
both.

Pull in the x8664 hpet disable command line option as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:06 +02:00
Thomas Gleixner
06a24dec10 i386: prepare sharing the hpet code with x86_64
The hpet implementations of i386 and x8664 has been mostly the same
before the clock events conversion of i386. The clock events
conversion of i386 hpet is already done. So it makes sense to share
the code for the x86_64 clock events conversion.

Abstract out the mapping functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:06 +02:00
Thomas Gleixner
d371698efd x86_64: Consolidate tsc calibration
Move the TSC calibration code to tsc.c. Reimplement it so the
pm timer can be used as a reference as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:06 +02:00
Andres Salomon
8f36881b3c x86: Geode MFGPT clock event device support
Add support for an MFGPT clock event device; this allows us to use MFGPTs as
the basis for high-resolution timers.

Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-12 23:04:06 +02:00
Andres Salomon
83d7384f8d x86: Geode Multi-Function General Purpose Timers support
This adds support for Multi-Function General Purpose Timers.  It detects the
available timers during southbridge init, and provides an API for allocating
and setting the timers.  They're higher resolution than the standard PIT, so
the MFGPTs come in handy for quite a few things.

Note that we never clobber the timers that the BIOS might have opted to use;
we just check for unused timers.

Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-12 23:04:06 +02:00
Venki Pallipadi
5fa3a246ea x86: block irq balancing for timer
Disable irq balancing on IRQ0.  Several SIS chipsets lock up when you try to
change affinity of IRQ #0.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:06 +02:00
Thomas Gleixner
3c9aea4742 x86: Fix irq0 / local apic timer accounting
The clock events merge introduced a change to the nmi watchdog code to
handle the not longer increasing local apic timer count in the
broadcast mode. This is fine for UP, but on SMP it pampers over a
stuck CPU which is not handling the broadcast interrupt due to the
unconditional sum up of local apic timer count and irq0 count.

To cover all cases we need to keep track on which CPU irq0 is
handled. In theory this is CPU#0 due to the explicit disabling of irq
balancing for irq0, but there are systems which ignore this on the
hardware level. The per cpu irq0 accounting allows us to remove the
irq0 to CPU0 binding as well.

Add a per cpu counter for irq0 and evaluate this instead of the global
irq0 count in the nmi watchdog code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2007-10-12 23:04:06 +02:00
Linus Torvalds
038a5008b2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (867 commits)
  [SKY2]: status polling loop (post merge)
  [NET]: Fix NAPI completion handling in some drivers.
  [TCP]: Limit processing lost_retrans loop to work-to-do cases
  [TCP]: Fix lost_retrans loop vs fastpath problems
  [TCP]: No need to re-count fackets_out/sacked_out at RTO
  [TCP]: Extract tcp_match_queue_to_sack from sacktag code
  [TCP]: Kill almost unused variable pcount from sacktag
  [TCP]: Fix mark_head_lost to ignore R-bit when trying to mark L
  [TCP]: Add bytes_acked (ABC) clearing to FRTO too
  [IPv6]: Update setsockopt(IPV6_MULTICAST_IF) to support RFC 3493, try2
  [NETFILTER]: x_tables: add missing ip6t_modulename aliases
  [NETFILTER]: nf_conntrack_tcp: fix connection reopening
  [QETH]: fix qeth_main.c
  [NETLINK]: fib_frontend build fixes
  [IPv6]: Export userland ND options through netlink (RDNSS support)
  [9P]: build fix with !CONFIG_SYSCTL
  [NET]: Fix dev_put() and dev_hold() comments
  [NET]: make netlink user -> kernel interface synchronious
  [NET]: unify netlink kernel socket recognition
  [NET]: cleanup 3rd argument in netlink_sendskb
  ...

Fix up conflicts manually in Documentation/feature-removal-schedule.txt
and my new least favourite crap, the "mod_devicetable" support in the
files include/linux/mod_devicetable.h and scripts/mod/file2alias.c.

(The latter files seem to be explicitly _designed_ to get conflicts when
different subsystems work with them - that have an absolutely horrid
lack of subsystem separation!)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-11 19:40:14 -07:00
Linus Torvalds
19ad7ae47e Merge branch 'dmi-const' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/misc-2.6
* 'dmi-const' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
  drivers/firmware: const-ify DMI API and internals
2007-10-11 19:18:45 -07:00
Peter Zijlstra
58dfe883d3 lockdep: annotate kprobes irq fiddling
kprobes disables irqs for jprobes, but does not tell lockdep about it.

This resolves this warning during an allyesconfig bzImage bootup test:

 [  423.670337] WARNING: at kernel/lockdep.c:2658 check_flags()
 [  423.670341]  [<c0107f01>] show_trace_log_lvl+0x19/0x2e
 [  423.670348]  [<c0107ffa>] show_trace+0x12/0x14
 [  423.670350]  [<c0108010>] dump_stack+0x14/0x16
 [  423.670353]  [<c015249d>] check_flags+0x95/0x142
 [  423.670357]  [<c0155576>] lock_acquire+0x52/0xb8
 [  423.670360]  [<c1313c90>] _spin_lock+0x2e/0x58
 [  423.670365]  [<c11b72f9>] jtcp_rcv_established+0x6e/0x189
 [  423.670369]  [<c11810da>] tcp_v4_do_rcv+0x30b/0x620
 [  423.670373]  [<c1181c8c>] tcp_v4_rcv+0x89d/0x8fa
 [  423.670376]  [<c1167dba>] ip_local_deliver+0x17d/0x225
 [  423.670380]  [<c11682f5>] ip_rcv+0x493/0x4ce
 [  423.670383]  [<c11177ef>] netif_receive_skb+0x347/0x365
 [  423.670388]  [<c07b6e7b>] nv_napi_poll+0x501/0x6c3
 [  423.670393]  [<c1115b1a>] net_rx_action+0xa3/0x1b6
 [  423.670396]  [<c013bdee>] __do_softirq+0x76/0xfb
 [  423.670400]  [<c0109189>] do_softirq+0x75/0xf3

[ akpm: checkpatch.pl cleanups ]

Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-10-11 22:25:25 +02:00
Peter Zijlstra
10cd706d18 lockdep: x86_64: connect the sysexit hook
Run the lockdep_sys_exit hook after all other C code on the syscall
return path.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 22:11:12 +02:00
Peter Zijlstra
c7e872e7da lockdep: i386: connect the sysexit hook
Run the lockdep_sys_exit hook after all other C code on the syscall
return path.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 22:11:12 +02:00
Thomas Gleixner
89039b37be x86: force timer broadcast on late AMD C1E detection
The 64bit SMP bootup is slightly different to the 32bit one. It enables
the boot CPU local APIC timer before all CPUs are brought up. Some AMD C1E
systems have the C1E feature flag only set in the secondary CPU. Due to
the early enable of the boot CPU local APIC timer the APIC timer is
registered as a fully functional device. When we detect the wreckage during
the bringup of the secondary CPU, we need to force the boot CPU into
broadcast mode. 

Check the C1E caused APIC timer disable, when the secondary APIC timer is
initialized. If the boot CPU APIC timer was registered as a functional
clock event device, then fix this up and utilize the
CLOCK_EVT_NOTIFY_BROADCAST_FORCE mechanism to force the already
registered boot CPU APIC timer into broadcast mode.

Tested by force injecting the failure mode.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-14 22:57:45 +02:00
Thomas Gleixner
3ac508be76 x86: move local APIC timer init to the end of start_secondary()
Preparatory patch for the AMD C1E wreckage fixup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-14 22:57:45 +02:00
Dave Jones
b097976e8d x86: fix missing include for vsyscall
> Maybe I just picked a bad time to try, but...
 > 
 > arch/x86/kernel/alternative.c: In function 'apply_alternatives':
 > arch/x86/kernel/alternative.c:191: error: 'VSYSCALL_START' undeclared (first use in this function)
 > arch/x86/kernel/alternative.c:191: error: (Each undeclared identifier is reported only once
 > arch/x86/kernel/alternative.c:191: error: for each function it appears in.)
 > arch/x86/kernel/alternative.c:191: error: 'VSYSCALL_END' undeclared (first use in this function)
 > make[1]: *** [arch/x86/kernel/alternative.o] Error 1
 > make: *** [arch/x86/kernel] Error 2

Try this.

Include missing header for vsyscall.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-14 22:57:45 +02:00
Al Viro
64b33619a3 long vs. unsigned long - low-hanging fruits in drivers
deal with signedness of the stuff passed to set_bit() et.al.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-14 12:41:51 -07:00
Dave Jones
835c34a168 Delete filenames in comments.
Since the x86 merge, lots of files that referenced their own filenames
are no longer correct.  Rather than keep them up to date, just delete
them, as they add no real value.

Additionally:
- fix up comment formatting in scx200_32.c
- Remove a credit from myself in setup_64.c from a time when we had no SCM
- remove longwinded history from tsc_32.c which can be figured out from
  git.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-13 10:01:23 -07:00
Thomas Gleixner
96a388de5d i386/x86_64: move headers to include/asm-x86
Move the headers to include/asm-x86 and fixup the
header install make rules

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:20:03 +02:00
Thomas Gleixner
27bd0c9556 x86: sanitize pathes arch/x86/kernel/cpu/Makefile
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:17:26 +02:00
Thomas Gleixner
45bc6cd119 x86: sanitize pathes arch/x86/kernel/Makefile_64
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:17:25 +02:00
Thomas Gleixner
c668ce14f5 x86: sanitize pathes arch/x86/kernel/Makefile_32
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:17:25 +02:00
Thomas Gleixner
250c22777f x86_64: move kernel
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:17:24 +02:00
Thomas Gleixner
51b2833060 x86_64: move kernel/cpufreq
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:17:06 +02:00
Thomas Gleixner
e8d08eb1b5 x86_64: move kernel/acpi
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:17:05 +02:00
Thomas Gleixner
9a163ed8e0 i386: move kernel
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:17:01 +02:00
Thomas Gleixner
f7627e2513 i386: move kernel/cpu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:16:58 +02:00
Thomas Gleixner
2ec1df4130 i386: move kernel/cpu/mtrr
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:16:28 +02:00
Thomas Gleixner
ee580dc91e i386: move kernel/cpu/cpufreq
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:16:27 +02:00
Thomas Gleixner
c18db0d7e2 i386: move kernel/cpu/mcheck
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:16:25 +02:00
Thomas Gleixner
23d6f82bd1 i386: move kernel/acpi
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-11 11:16:23 +02:00