Impact: fix bug with irq-descriptor moving when logical flat
Rusty observed:
> The effect of setting desc->affinity (ie. from userspace via sysfs) has varied
> over time. In 2.6.27, the 32-bit code anded the value with cpu_online_map,
> and both 32 and 64-bit did that anding whenever a cpu was unplugged.
>
> 2.6.29 consolidated this into one routine (and fixed hotplug) but introduced
> another variation: anding the affinity with cfg->domain. Is this right, or
> should we just set it to what the user said? Or as now, indicate that we're
> restricting it.
Eric pointed out that desc->affinity should be what the user requested,
if it is at all possible to honor the user space request.
This bug got introduced by commit 22f65d31b "x86: Update io_apic.c to use
new cpumask API".
Fix it by moving the masking to before the descriptor moving ...
Reported-by: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <49C94134.4000408@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While looking at the issue in the thread:
http://marc.info/?l=dri-devel&m=123606627824556&w=2
noticed a bug in pci PAT code and memory type setting.
PCI mmap code did not set the proper protection in vma, when it
inherited protection in reserve_memtype. This bug only affects
the case where there exists a WC mapping before X does an mmap
with /proc or /sys pci interface. This will cause X userlevel
mmap from /proc or /sysfs to fail on fork.
Reported-by: Kevin Winchester <kjwinchester@gmail.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: <stable@kernel.org>
LKML-Reference: <20090323190720.GA16831@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: section mismatch fix
Ingo reports these warnings:
> WARNING: vmlinux.o(.text+0x6a288e): Section mismatch in reference from
> the function dmi_alloc() to the function .init.text:extend_brk()
> The function dmi_alloc() references
> the function __init extend_brk().
> This is often because dmi_alloc lacks a __init annotation or the
> annotation of extend_brk is wrong.
dmi_alloc() is a static inline, and so should be immune to this
kind of error. But force it to be inlined and make it __init
anyway, just to be extra sure.
All of dmi_alloc()'s callers are already __init.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49C6B23C.2040308@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Grant picked up the wrong version of "Respect _PAGE_COHERENT on classic
ppc32 SW" (commit a4bd6a93c3)
It was missing the code to actually deal with the fixup of
_PAGE_COHERENT based on the CPU feature.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* 'fix-includes' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: merge the non-MMU and MMU versions of siginfo.h
m68k: use the MMU version of unistd.h for all m68k platforms
m68k: merge the non-MMU and MMU versions of signal.h
m68k: merge the non-MMU and MMU versions of ptrace.h
m68k: use MMU version of setup.h for both MMU and non-MMU
m68k: merge the non-MMU and MMU versions of sigcontext.h
m68k: merge the non-MMU and MMU versions of swab.h
m68k: merge the non-MMU and MMU versions of param.h
Impact: cleanup
This patch fixes the following sparse warnings:
arch/x86/kernel/apic/io_apic.c:3602:17: warning: symbol 'hpet_msi_type'
was not declared. Should it be static?
arch/x86/kernel/apic/io_apic.c:3467:30: warning: Using plain integer as
NULL pointer
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.com>
LKML-Reference: <1237741871-5827-2-git-send-email-dmitri.vorobiev@movial.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This reverts commit 698609bdcd.
69860 breaks Xen booting, as it relies on head*.S to set up the fixmap
pagetables (as a side-effect of initializing the USB debug port).
Xen, however, does not boot via head*.S, and so the fixmap area is
not initialized.
The specific symptom of the crash is a fault in dmi_scan(), because
the pointer that early_ioremap returns is not actually present.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49C43A8E.5090203@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
To reduce the size of the oversized function __get_smp_config()
There should be no impact to functionality.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Impact: fix incorrect error message
- IO APIC resource allocation error message contains one too many "be".
- Print the error message iff there are IO APICs in the system.
I've seen this error message for some time on my x86-32 laptop...
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Alan Bartlett <ajb.stxsl@googlemail.com>
LKML-Reference: <200903202100.30789.bzolnier@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Check alternate signal stack overflow with proper stack pointer.
The stack pointer of the next signal frame is different if that
task has i387 state.
On x86_64, redzone would be included.
No need to check SA_ONSTACK if we're already using alternate signal stack.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Roland McGrath <roland@redhat.com>
LKML-Reference: <49C2874D.3080002@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add new interfaces:
set_pages_array_uc()
set_pages_array_wb()
that can be used change the page attribute for a bunch of pages with
flush etc done once at the end of all the changes. These interfaces
are similar to existing set_memory_array_uc() and set_memory_array_wc().
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: arjan@infradead.org
Cc: eric@anholt.net
Cc: airlied@redhat.com
LKML-Reference: <20090319215358.901545000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add struct page array pointer to cpa struct and CPA_PAGES_ARRAY.
With that we can add change_page_attr_set_clr() a parameter to pass
struct page array pointer and that can be handled by the underlying
cpa code.
cpa_flush_array() is also changed to support both addr array or
struct page pointer array, depending on the flag.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: arjan@infradead.org
Cc: eric@anholt.net
Cc: airlied@redhat.com
LKML-Reference: <20090319215358.758513000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Change change_page_attr_set_clr() array parameter to a flag. This helps
following patches which adds an interface to change attr to uc/wb over a
set of pages referred by struct page.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: arjan@infradead.org
Cc: eric@anholt.net
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: airlied@redhat.com
LKML-Reference: <20090319215358.611346000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] make page table upgrade work again
[S390] make page table walking more robust
[S390] Dont check for pfn_valid() in uaccess_pt.c
[S390] ftrace/mcount: fix kernel stack backchain
[S390] topology: define SD_MC_INIT to fix performance regression
[S390] __div64_31 broken for CONFIG_MARCH_G5
Impact: cleanup, remove last user of set_pte_present
set_pte_vaddr() is only used to install ptes in fixmaps, and
should never be used to overwrite a present mapping.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Xen-devel <xen-devel@lists.xensource.com>
LKML-Reference: <1237406613-2929-1-git-send-email-jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When you compile kernel on Sparc64 with heap memory checking and type
"cat /proc/iomem", you get a crash, because pointers in struct
resource are uninitialized.
Most code fills struct resource with zeros, so I assume that it is
responsibility of the caller of request_resource to initialized it,
not the responsibility of request_resource functuion.
After 2.6.29 is out, there could be a check for uninitialized fields
added to request_resource to avoid crashes like this.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise it might interrupt switch_to() midstream and use
half-cooked register window state.
Reported-by: Chris Torek <chris.torek@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Impact: fix rarely-used feature
The VGA Miscellaneous Output Register is read from address 0x3CC but
written to address 0x3C2. This was missed when this code was
converted from assembly to C. While we're at it, clean up the code by
making the overflow bits and the math used to set the bits explicit.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Impact: cleanup
Refactor the MP-table parsing code via the introduction of the
following helper functions:
skip_entry()
smp_reserve_bootmem()
check_irq_src()
check_slot()
To simplify the code flow and to reduce the size of the
following oversized functions: smp_read_mpc(), smp_scan_config().
There should be no impact to functionality.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
After TASK_SIZE now gives the current size of the address space the
upgrade of a 64 bit process from 3 to 4 levels of page table needs
to use the arch_mmap_check hook to catch large mmap lengths. The
get_unmapped_area* functions need to check for -ENOMEM from the
arch_get_unmapped_area*, upgrade the page table and retry.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Make page table walking on s390 more robust. The current code requires
that the pgd/pud/pmd/pte loop is only done for address ranges that are
below the end address of the last vma of the address space. But this
is not always true, e.g. the generic page table walker does not guarantee
this. Change TASK_SIZE/TASK_SIZE_OF to reflect the current size of the
address space. This makes the generic page table walker happy but it
breaks the upgrade of a 3 level page table to a 4 level page table.
To make the upgrade work again another fix is required.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
pfn_valid() actually checks for a valid struct page and not for a
valid pfn. Using xip mappings w/o struct pages, this will result in
-EFAULT returned by the (page table walk) user copy functions,
even though there is valid memory. Those user copy functions don't
need a struct page, so this patch just removes the pfn_valid() check.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
With packed stack the backchain is at a different location.
Just use __SF_BACKCHAIN as an offset to store the backchain.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The default values for SD_MC_INIT cause an additional cpu usage of up
to 40% on some network benchmarks compared to the plain SD_CPU_INIT
values. So just define SD_MC_INIT to SD_CPU_INIT.
More tuning needs to be done.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The implementation of __div64_31 for G5 machines is broken. The comments
in __div64_31 are correct, only the code does not do what the comments
say. The part "If the remainder has overflown subtract base and increase
the quotient" is only partially realized, the base is subtracted correctly
but the quotient is only increased if the dividend had the last bit set.
Using the correct instruction fixes the problem.
Cc: stable@kernel.org
Reported-by: Frans Pop <elendil@planet.nl>
Tested-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
arch/x86/kernel/cpu/mtrr/cleanup.c:197: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘long unsigned int’
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1237378015.13488.1.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix boot crash on UV systems
Commit 76ba0ecda0 "cpumask: use
cpumask_var_t in uv_flush_tlb_others" used cur_cpu as an iterator;
it was supposed to be zero for the code below it.
Reported-by: Cliff Wickman <cpw@sgi.com>
Original-From: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: steiner@sgi.com
Cc: <stable@kernel.org>
LKML-Reference: <200903180822.31196.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: optimize APIC IPI related barriers
Uncached MMIO accesses for xapic are inherently serializing and hence
we don't need explicit barriers for xapic IPI paths.
x2apic MSR writes/reads don't have serializing semantics and hence need
a serializing instruction or mfence, to make all the previous memory
stores globally visisble before the x2apic msr write for IPI.
Add x2apic_wrmsr_fence() in flush tlb path to x2apic specific paths.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "steiner@sgi.com" <steiner@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>
LKML-Reference: <1237313814.27006.203.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Attempting to rid us of the problematic work_on_cpu(). Just use
smp_call_function_single() here.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <20090318042217.EF3F1DDF39@ozlabs.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix spurious IRQs
During irq migration, we send a low priority interrupt to the previous
irq destination. This happens in non interrupt-remapping case after interrupt
starts arriving at new destination and in interrupt-remapping case after
modifying and flushing the interrupt-remapping table entry caches.
This low priority irq cleanup handler can cleanup multiple vectors, as
multiple irq's can be migrated at almost the same time. While
there will be multiple invocations of irq cleanup handler (one cleanup
IPI for each irq migration), first invocation of the cleanup handler
can potentially cleanup more than one vector (as the first invocation can
see the requests for more than vector cleanup). When we cleanup multiple
vectors during the first invocation of the smp_irq_move_cleanup_interrupt(),
other vectors that are to be cleanedup can still be pending in the local
cpu's IRR (as smp_irq_move_cleanup_interrupt() runs with interrupts disabled).
When we are ready to unhook a vector corresponding to an irq, check if that
vector is registered in the local cpu's IRR. If so skip that cleanup and
do a self IPI with the cleanup vector, so that we give a chance to
service the pending vector interrupt and then cleanup that vector
allocation once we execute the lowest priority handler.
This fixes spurious interrupts seen when migrating multiple vectors
at the same time.
[ This is apparently possible even on conventional xapic, although to
the best of our knowledge it has never been seen. The stable
maintainers may wish to consider this one for -stable. ]
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: stable@kernel.org