This is a big irq cleanup patch.
* Use set_irq_chip() to register irq_chip.
* Initialize .mask, .unmask, .mask_ack field. Functions for these
method are already exist in most case.
* Do not initialize .startup, .shutdown, .enable, .disable fields if
default routines provided by irq_chip_set_defaults() were suitable.
* Remove redundant irq_desc initializations.
* Remove unnecessary local_irq_save/local_irq_restore, spin_lock.
With this cleanup, it would be easy to switch to slightly lightwait
irq flow handlers (handle_level_irq(), etc.) instead of __do_IRQ().
Though whole this patch is quite large, changes in each irq_chip are
not quite simple. Please review and test on your platform. Thanks.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Currently nobody outside time.c require mips_hpt_init(). Remove it
and call c0_hpt_timer_init() directly if R4k counter was used for
timer interrupt.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch has rewritten GALILEO_INL/GALILEO_OUTL using GT_READ/GT_WRITE.
This patch tested on Cobalt Qube2.
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This would get rid of some warnings about "long" vs. "long long".
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch introduces __pa_symbol() macro which should be used to
calculate the physical address of kernel symbols. It also relies
on RELOC_HIDE() to avoid any compiler's oddities when doing
arithmetics on symbols.
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
During early boot mem init, some configs couldn't use __pa() to
convert virtual into physical addresses. Specially for 64 bit
kernel cases when CONFIG_BUILD_ELF64=n. This patch make __pa()
work for _all_ configs and thus make CPHYSADDR() useless.
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
__pa() was used by virt_to_page() and virt_addr_valid(). These
latter are used when kernel is initialised so __pa() is not
appropriate, we use virt_to_phys() instead.
Futhermore __pa() is going to take care of CKSEG0/XKPHYS
address mix for 64 bit kernels. This makes __pa() more complex
than virt_to_phys() and this extra work is not needed by
virt_to_page() and virt_addr_valid().
Eventually it consolidates virt_to_phys() prototype by making
its argument 'const'. this avoids some warnings that was due
to some virt_to_page() usages which pass const pointer.
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The ib_cm_establish() function is replaced with a more generic
ib_cm_notify(). This routine is used to notify the CM that failover
has occurred, so that future CM messages (LAP, DREQ) reach the remote
CM. (Currently, we continue to use the original path) This bumps the
userspace CM ABI.
New alternate path information is captured when a LAP message is sent
or received. This allows QP attributes to be initialized for the user
when a new path is loaded after failover occurs.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Move declaration of struct pxa2xx_udc_mach_info from
include/asm-arm/arch-pxa/udc.h to new file
include/asm-arm/mach/udc_pxa2xx.h.
This allow us to use this structure with
multiple platforms - pxa and ixp4xx. USB
device controller used in pxa25x is the same
as controller used in ixp4xx.
Signed-off-by: Milan Svoboda <msvoboda@ra.rockwell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
MAX_HEADER is either set to LL_MAX_HEADER or LL_MAX_HEADER + 48, and
this is controlled by a set of CONFIG_* ifdef tests.
It is trying to use LL_MAX_HEADER + 48 when any of the tunnels are
enabled which set hard_header_len like this:
dev->hard_header_len = LL_MAX_HEADER + sizeof(struct xxx);
The correct set of tunnel drivers which do this are:
ipip
ip_gre
ip6_tunnel
sit
so make the ifdef test match.
Noticed by Patrick McHardy and with help from Herbert Xu.
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6:
[PATCH] x86-64: Use stricter in process stack check for unwinder
[PATCH] i386: Fix compilation with UP genericarch
[PATCH] x86-64: Fix warning in io_apic.c
[PATCH] x86-64: work around gcc4 issue with -Os in Dwarf2 stack unwind
[PATCH] x86_64: Align data segment to PAGE_SIZE boundary
* 'linus' of master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa:
[ALSA] version 1.0.13
[ALSA] snd-emu10k1: Fix capture for one variant.
[ALSA] Fix hang-up at disconnection of usb-audio
[ALSA] hda: fix typo for xw4400 PCI sub-ID
[ALSA] hda: fix sigmatel dell system detection
[ALSA] Enable stereo line input for TAS codec
[ALSA] rtctimer: handle RTC interrupts with a tasklet
include/scsi/libsas.h:479: error: field 'smp_req' has incomplete type
include/scsi/libsas.h:480: error: field 'smp_resp' has incomplete type
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix
arch/i386/mach-generic/built-in.o: In function `apicid_to_node':
summit.c:(.text+0x2f): undefined reference to `apicid_2_node'
with CONFIG_GENERICH_ARCH and !CONFIG_SMP
Signed-off-by: Andi Kleen <ak@suse.de>
You wouldn't think that doing an ALIGN() macro that aligns something up
to a power-of-two boundary would be likely to have bugs, would you?
But hey, in the wonderful world of mixing integer types, you have to be
careful. This just makes sure that the alignment is interpreted in the
same type as the thing to be aligned.
Thanks to Roland Dreier, who noticed that the amso1100 driver got broken
by the previous fix (that just extended the mask to "unsigned long", but
was still broken in "unsigned long long" - it just happened to be the
same on 64-bit architectures).
See commit 4c8bd7eeee for the history of
bugs here...
Acked-by: Roland Dreier <rdreier@cisco.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: David Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I still think using BUILD_BUG_ON() is unacceptable, especially given how
vague the error message was.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
[ And I already removed gthe BUILD_BUG_ON() in the previous commit ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This reverts commit ee3ce191e8, since it
broke on at least ARM, MIPS and PA-RISC due to complicated header file
dependencies.
Conflicts in include/linux/spinlock.h (due to the "nested" variety
fixes) fixed up by hand.
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Russell King <rmk+lkml@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Restoring old, correct comment for sk_filter_release, moving it to
where it should actually be, and changing new comment into proper
comment for sk_filter_rcu_free, where it actually makes sense.
The original fix submitted for this on Oct 23 mistakenly documented
the wrong function.
Signed-off-by: Paul Bonser <misterpib@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make it break or warn if you pass to spin_lock_irqsave() and friends
something different from "unsigned long flags;". Suprisingly large amount
of these was caught by recent commit
c53421b18f and others.
Idea is largely from FRV typechecking. Suggestions from Andrew Morton.
All stupid typos in first version fixed.
Passes allmodconfig on i386, x86_64, alpha, arm as well as my usual config.
Note #1: checking with sparse is still needed, because a driver can save
and pass around flags or something. So far patch is very intrusive.
Note #2: techically, we should break only if
sizeof(flags) < sizeof(unsigned long),
however, the more pain for getting suspicious code into kernel,
the better.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 3941/1: [Jornada7xx] - Addition to MAINTAINERS
[ARM] 3942/1: ARM: comment: consistent_sync should not be called directly
[ARM] ebsa110: fix warnings generated by asm/arch/io.h
[ARM] 3933/1: Source drivers/ata/Kconfig
The Au1xx IDE controller driver doesn't compile:
CC drivers/ide/mips/au1xxx-ide.o
/linux-2.6.19-rc6-work/drivers/ide/mips/au1xxx-ide.c:480: error: conflicting types for 'auide_ddma_tx_callback'
include2/asm/mach-au1x00/au1xxx_ide.h:174: error: previous declaration of 'auide_ddma_tx_callback' was here
/linux-2.6.19-rc6-work/drivers/ide/mips/au1xxx-ide.c:486: error: conflicting types for 'auide_ddma_rx_callback'
include2/asm/mach-au1x00/au1xxx_ide.h:176: error: previous declaration of 'auide_ddma_rx_callback' was here
Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
/*
* Note: Drivers should NOT use this function directly, as it will break
* platforms with CONFIG_DMABOUNCE.
* Use the driver DMA support - see dma-mapping.h (dma_sync_*)
*/
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The IGMPV3_EXP() macro doesn't correctly shift the normalization bit, so
time-out values are longer than they should be.
Thanks to Dirk Ooms for finding the problem in IGMPv3 - MLDv2 had a
similar problem that was already fixed a year ago. :-(
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a quick hack to overcome the fact that SRCU currently does not
allow static initializers, and we need to sometimes initialize those
things before any other initializers (even "core" ones) can do so.
Currently we don't allow this at all for modules, and the only user that
needs is right now is cpufreq. As reported by Thomas Gleixner:
"Commit b4dfdbb3c7 ("[PATCH] cpufreq:
make the transition_notifier chain use SRCU breaks cpu frequency
notification users, which register the callback > on core_init
level."
Cc: Thomas Gleixner <tglx@timesys.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Andrew Morton <akpm@osdl.org>,
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Switch to using irq_handler_t for interrupt function handler pointers.
Change name of m68knommu's irq_hanlder_t data structure so it doesn't
clash with the common type (include/linux/interrupt.h).
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove two warnings:
drivers/serial/8250_early.c:136: warning: unused variable 'mapsize'
include/linux/io.h:47: warning: passing argument 1 of '__readb' discards qualifiers from pointer target type
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch has removed one too many semicolon in crypto.h.
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[TG3]: Disable TSO on 5906 if CLKREQ is enabled.
[TCP]: Fix up sysctl_tcp_mem initialization.
[NETFILTER]: ip6_tables: use correct nexthdr value in ipv6_find_hdr()
[NETFILTER]: ip6_tables: fixed conflicted optname for getsockopt
[NETFILTER]: Use pskb_trim in {ip,ip6,nfnetlink}_queue
[NETFILTER]: nfnetlink_log: fix byteorder of NFULA_SEQ_GLOBAL
[TG3]: Increase 5906 firmware poll time.
This adds fat_getattr() for setting stat->blksize. (FAT uses the size
of cluster for proper I/O)
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Due to hardware errata, TSO must be disabled if the PCI Express clock
request is enabled on 5906. The chip may hang when transmitting TSO
frames if CLKREQ is enabled.
Update version to 3.69.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
66 and 67 for getsockopt on IPv6 socket is doubly used for IPv6 Advanced
API and ip6tables. This moves numbers for ip6tables to 68 and 69.
This also kills XT_SO_* because {ip,ip6,arp}_tables doesn't have so much
common numbers now.
The old userland tools keep to behave as ever, because old kernel always
calls functions of IPv6 Advanced API for their numbers.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
All the infrastructure is already in place for this, so we only need
to allocate a syscall number and hook it up.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds the /sys/devices/system/cpu/*/topology/thread_siblings
files on powerpc. These files are already available on other
architectures.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6:
[PATCH] x86-64: Fix race in exit_idle
[PATCH] x86-64: Fix vgetcpu when CONFIG_HOTPLUG_CPU is disabled
[PATCH] x86: Add acpi_user_timer_override option for Asus boards
[PATCH] x86-64: setup saved_max_pfn correctly (kdump)
[PATCH] x86-64: Handle reserve_bootmem_generic beyond end_pfn
[PATCH] x86-64: shorten the x86_64 boot setup GDT to what the comment says
[PATCH] x86-64: Fix PTRACE_[SG]ET_THREAD_AREA regression with ia32 emulation.
[PATCH] x86-64: Fix partial page check to ensure unusable memory is not being marked usable.
Revert "[PATCH] MMCONFIG and new Intel motherboards"
(David:)
If hugetlbfs_file_mmap() returns a failure to do_mmap_pgoff() - for example,
because the given file offset is not hugepage aligned - then do_mmap_pgoff
will go to the unmap_and_free_vma backout path.
But at this stage the vma hasn't been marked as hugepage, and the backout path
will call unmap_region() on it. That will eventually call down to the
non-hugepage version of unmap_page_range(). On ppc64, at least, that will
cause serious problems if there are any existing hugepage pagetable entries in
the vicinity - for example if there are any other hugepage mappings under the
same PUD. unmap_page_range() will trigger a bad_pud() on the hugepage pud
entries. I suspect this will also cause bad problems on ia64, though I don't
have a machine to test it on.
(Hugh:)
prepare_hugepage_range() should check file offset alignment when it checks
virtual address and length, to stop MAP_FIXED with a bad huge offset from
unmapping before it fails further down. PowerPC should apply the same
prepare_hugepage_range alignment checks as ia64 and all the others do.
Then none of the alignment checks in hugetlbfs_file_mmap are required (nor
is the check for too small a mapping); but even so, move up setting of
VM_HUGETLB and add a comment to warn of what David Gibson discovered - if
hugetlbfs_file_mmap fails before setting it, do_mmap_pgoff's unmap_region
when unwinding from error will go the non-huge way, which may cause bad
behaviour on architectures (powerpc and ia64) which segregate their huge
mappings into a separate region of the address space.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When another interrupt happens in exit_idle the exit idle notifier
could be called an incorrect number of times.
Add a test_and_clear_bit_pda and use it handle the bit
atomically against interrupts to avoid this.
Pointed out by Stephane Eranian
Signed-off-by: Andi Kleen <ak@suse.de>
The vgetcpu per CPU initialization previously relied on CPU hotplug
events for all CPUs to initialize the per CPU state. That only
worked only on kernels with CONFIG_HOTPLUG_CPU enabled. On the
others some CPUs didn't get their state initialized properly
and vgetcpu wouldn't work.
Change the initialization sequence to instead run in a normal
initcall (which runs after the normal CPU bootup) and initialize
all running CPUs there. Later hotplug CPUs are still handled
with an hotplug notifier.
This actually simplifies the code somewhat.
Signed-off-by: Andi Kleen <ak@suse.de>
Timer overrides are normally disabled on Nvidia board because
they are commonly wrong, except on new ones with HPET support.
Unfortunately there are quite some Asus boards around that
don't have HPET, but need a timer override.
We don't know yet how to handle this transparently,
but at least add a command line option to force the timer override
and let them boot.
Cc: len.brown@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
If you call set_personality() with an expression such as:
set_personality(foo ? PERS_FOO1 : PERS_FOO2);
then this evaluates to:
((current->personality == foo ? PERS_FOO1 : PERS_FOO2) ? ...
which is obviously not the intended result. Add the missing parents
to ensure this gets evaluated as expected:
((current->personality == (foo ? PERS_FOO1 : PERS_FOO2)) ? ...
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix MSPEC driver to build for non SN2 enabled configs as the driver should
work in cached and uncached modes (no fetchop) on these systems. In
addition make MSPEC select IA64_UNCACHED_ALLOCATOR, which is required for
it and move it to arch/ia64/Kconfig to avoid warnings on non ia64
architectures running allmodconfig. Once the Kconfig code is fixed, we can
move it back.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Cc: Fernando Luis Vzquez Cao <fernando@oss.ntt.co.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- reorder 'struct vm_struct' to speedup lookups on CPUS with small cache
lines. The fields 'next,addr,size' should be now in the same cache line,
to speedup lookups.
- One minor cleanup in __get_vm_area_node()
- Bugfixes in vmalloc_user() and vmalloc_32_user() NULL returns from
__vmalloc() and __find_vm_area() were not tested.
[akpm@osdl.org: remove redundant BUG_ONs]
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This change makes __beXX available to user-space applications, such as
ipvsadm, which include ip_vs.h
Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a variant of ht_create_irq __ht_create_irq that takes an
aditional parameter update that is a function that is called whenever we want
to write to a drivers htirq configuration registers.
This is needed to support the ipath_iba6110 because it's registers in the
proper location are not actually conected to the hardware that controlls
interrupt delivery.
[bos@serpentine.com: fixes]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <ak@suse.de>
Cc: <olson@pathscale.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This refactoring actually optimizes the code a little by caching the value
that we think the device is programmed with instead of reading it back from
the hardware. Which simplifies the code a little and should speed things up a
bit.
This patch introduces the concept of a ht_irq_msg and modifies the
architecture read/write routines to update this code.
There is a minor consistency fix here as well as x86_64 forgot to initialize
the htirq as masked.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <ak@suse.de>
Acked-by: Bryan O'Sullivan <bos@pathscale.com>
Cc: <olson@pathscale.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some more errors from the IPMI send message command are retryable, but are not
being retried by the IPMI code. Make sure they get retried.
Signed-off-by: Corey Minyard <minyard@acm.org>
Cc: Frederic Lelievre <Frederic.Lelievre@ca.kontron.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In the case where an open creates the file, we shouldn't be rechecking
permissions to open the file; the open succeeds regardless of what the new
file's mode bits say.
This patch fixes the problem, but only by introducing yet another parameter
to nfsd_create_v3. This is ugly. This will be fixed by later patches.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Neil Brown <neilb@suse.de>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is just commit 130fe05dbc ported to
x86-64, for all the same reasons. It cleans up the IO-APIC accesses in
order to then fix the ordering issues.
We move the accessor functions (that were only used by io_apic.c) out of
a header file, and use proper memory-mapped accesses rather than making
up our own "volatile" pointers.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'for-linus' of git://www.atmel.no/~hskinnemoen/linux/kernel/avr32:
AVR32: Add missing return instruction in __raw_writesb
AVR32: Wire up sys_epoll_pwait
AVR32: Fix thinko in generic_find_next_zero_le_bit()
AVR32: Get rid of board_early_init
This patch takes the CTL_UNNUMBERD concept from NFS and makes it available to
all new sysctl users.
At the same time the sysctl binary interface maintenance documentation is
updated to mention and to describe what is needed to successfully maintain the
sysctl binary interface.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Since it is becoming clear that there are just enough users of the binary
sysctl interface that completely removing the binary interface from the kernel
will not be an option for foreseeable future, we need to find a way to address
the sysctl maintenance issues.
The basic problem is that sysctl requires one central authority to allocate
sysctl numbers, or else conflicts and ABI breakage occur. The proc interface
to sysctl does not have that problem, as names are not densely allocated.
By not terminating a sysctl table until I have neither a ctl_name nor a
procname, it becomes simple to add sysctl entries that don't show up in the
binary sysctl interface. Which allows people to avoid allocating a binary
sysctl value when not needed.
I have audited the kernel code and in my reading I have not found a single
sysctl table that wasn't terminated by a completely zero filled entry. So
this change in behavior should not affect anything.
I think this mechanism eases the pain enough that combined with a little
disciple we can solve the reoccurring sysctl ABI breakage.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When I added the entries for the robust futex syscall entries, I
forgot to bump NR_SYSCALLS. The current situation is error-prone
because NR_SYSCALLS lives in entry.S where the system call limit
checks are enforced. Move the definition to asm/unistd.h in order to
make this mistake much more difficult to make.
And wire up sys_migrate_pages since the powerpc folks implemented the
compat wrapper for us.
Signed-off-by: David S. Miller <davem@davemloft.net>
__constant_htons(2<<4) is not a replacement for
htonl(2<<20).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calculation of IPX checksum got buggered about 2.4.0. The old variant
mangled the packet; that got fixed, but calculation itself got buggered.
Restored the correct logics, fixed a subtle breakage we used to have even
back then: if the sum is 0 mod 0xffff, we want to return 0, not 0xffff.
The latter has special meaning for IPX (cheksum disabled). Observation
(and obvious fix) nicked from history of FreeBSD ipx_cksum.c...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is needed on bigendian 64bit architectures.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a swsusp debugging mode. This does everything that's needed for a suspend
except for actually suspending. So we can look in the log messages and work
out a) what code is being slow and b) which drivers are misbehaving.
(1)
# echo testproc > /sys/power/disk
# echo disk > /sys/power/state
This should turn off the non-boot CPU, freeze all processes, wait for 5
seconds and then thaw the processes and the CPU.
(2)
# echo test > /sys/power/disk
# echo disk > /sys/power/state
This should turn off the non-boot CPU, freeze all processes, shrink
memory, suspend all devices, wait for 5 seconds, resume the devices etc.
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Stefan Seyfried <seife@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
printk_ratelimit() has global state which makes it not useful for callers
which wish to perform ratelimiting at a particular frequency.
Add a printk_timed_ratelimit() which utilises caller-provided state storage to
permit more flexibility.
This function can in fact be used for things other than printk ratelimiting
and is perhaps poorly named.
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
ufs2 fails to mount on x86_64, claiming bad magic. This is because
ufs_super_block_third's fs_un1 member is padded out by 4 bytes for 8-byte
alignment, pushing down the rest of the struct.
Forcing this to be packed solves it. I took a quick look over other
on-disk structures and didn't immediately find other problems. I was able
to mount & ls a populated ufs2 filesystem w/ this change.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fixed definition of some CIF registers; see PXA27x Developer\'s Manual.
Signed-off-by: Enrico Scholz <enrico.scholz@sigma-chemnitz.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
RDMA/addr: Use client registration to fix module unload race
IB/mthca: Fix MAD extended header format for MAD_IFC firmware command
IB/uverbs: Return sq_draining value in query_qp response
IB/amso1100: Fix incorrect pr_debug()
IB/amso1100: Use dma_alloc_coherent() instead of kmalloc/dma_map_single
IB/ehca: Fix eHCA driver compilation for uniprocessor
RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count
IB/iser: Start connection after enabling iSER
Require registration with ib_addr module to prevent caller from
unloading while a callback is in progress.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] Do not use -msym32 option for modules.
[MIPS] Don't use R10000 llsc workaround version for all llsc-full processors.
[MIPS] Ocelot G: Fix : "CURRENTLY_UNUSED" is not defined warning.
[MIPS] Fix warning about init_initrd() call if !CONFIG_BLK_DEV_INITRD.
[MIPS] IP27: Allow SMP ;-) Another changeset messed up by patch.
[MIPS] Fix merge screwup by patch(1)
Revert "[MIPS] Make SPARSEMEM selectable on QEMU."
I copied the logic from ll/sc arch implementations, but that
was wrong and makes no sense at all. Just do a straight
compare-exchange instruction, just like x86.
Based upon bug reports from Dennis Gilmore and Fabio Massimo.
Signed-off-by: David S. Miller <davem@davemloft.net>
This is preparation for fixing the ordering of the accesses that
got broken by the commit cf4c6a2f27 when
factoring out the "common" io apic routing entry accesses.
Move the accessor function (that were only used by io_apic.c) out
of a header file, and use proper memory-mapped accesses rather than
making up our own "volatile" pointers.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Make alignment exception always check exception table
[POWERPC] Disallow kprobes on emulate_step and branch_taken
[POWERPC] Make mmiowb's io_sync preempt safe
[POWERPC] Make high hugepage areas preempt safe
[POWERPC] Make current preempt-safe
[POWERPC] qe_lib: qe_issue_cmd writes wrong value to CECDR
[POWERPC] Use 4kB iommu pages even on 64kB-page systems
[POWERPC] Fix oprofile support for e500 in arch/powerpc
[POWERPC] Fix rmb() for e500-based machines it
[POWERPC] Fix various offb issues
If mmiowb() is always used prior to releasing spinlock as Doc suggests,
then it's safe against preemption; but I'm not convinced that's always
the case. If preemption occurs between sync and get_paca()->io_sync = 0,
I believe there's no problem. But in the unlikely event that gcc does
the store relative to another register than r13 (as it did with current),
then there's a small danger of setting another cpu's io_sync to 0, after
it had just set it to 1. Rewrite ppc64 mmiowb to prevent that.
The remaining io_sync assignments in io.h all get_paca()->io_sync = 1,
which is harmless even if preempted to the wrong cpu (the context switch
itself syncs); and those in spinlock.h are while preemption is disabled.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Repeated -j20 kernel builds on a G5 Quad running an SMP PREEMPT kernel
would often collapse within a day, some exec failing with "Bad address".
In each case examined, load_elf_binary was doing a kernel_read, but
generic_file_aio_read's access_ok saw current->thread.fs.seg as USER_DS
instead of KERNEL_DS.
objdump of filemap.o shows gcc 4.1.0 emitting "mr r5,r13 ... ld r9,416(r5)"
here for get_paca()->__current, instead of the expected and much more usual
"ld r9,416(r13)"; I've seen other gcc4s do the same, but perhaps not gcc3s.
So, if the task is preempted and rescheduled on a different cpu in between
the mr and the ld, r5 will be looking at a different paca_struct from the
one it's now on, pick up the wrong __current, and perhaps the wrong seg.
Presumably much worse could happen elsewhere, though that split is rare.
Other architectures appear to be safe (x86_64's read_pda is more limiting
than get_paca), but ppc64 needs to force "current" into one instruction.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The 10Gigabit ethernet device drivers appear to be able to chew
up all 256MB of TCE mappings on pSeries systems, as evidenced by
numerous error messages:
iommu_alloc failed, tbl c0000000010d5c48 vaddr c0000000d875eff0 npages 1
Some experimentation indicates that this is essentially because
one 1500 byte ethernet MTU gets mapped as a 64K DMA region when
the large 64K pages are enabled. Thus, it doesn't take much to
exhaust all of the available DMA mappings for a high-speed card.
This patch changes the iommu allocator to work with its own
unique, distinct page size. Although the patch is long, its
actually quite simple: it just #defines a distinct IOMMU_PAGE_SIZE
and then uses this in all the places that matter.
As a side effect, it also dramatically improves network performance
on platforms with H-calls on iommu translation inserts/removes (since
we no longer call it 16 times for a 1500 bytes packet when the iommu HW
is still 4k).
In the future, we might want to make the IOMMU_PAGE_SIZE a variable
in the iommu_table instance, thus allowing support for different HW
page sizes in the iommu itself.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fixed a compile error in building the 85xx support with oprofile, and in
the process cleaned up some issues with the fsl_booke performance monitor
code.
* Reorganized FSL Book-E performance monitoring code so that the 7450
wouldn't be built if the e500 was, and cleaned it up so it was more
self-contained.
* Added a cpu_setup function for FSL Book-E. The original
cpu_setup function prototype had no arguments, assuming that
the reg_setup function would copy the required information into
variables which represented the registers. This was silly for
e500, since it has 1 register per counter (rather than 3 for
all counters), so the code has been restructured to have
cpu_setup take the current counter config array as an argument,
with op_powerpc_setup() invoking op_powerpc_cpu_setup() through
on_each_cpu(), and op_powerpc_cpu_setup() invoking the
model-specific cpu_setup function with an argument. The
argument is ignored on all other platforms at present.
* Fixed a confusing line where a trinary operator only had two
arguments
Signed-off-by: Andrew Fleming <afleming@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The e500 core generates an illegal instruction exception when it tries
to execute the lwsync instruction, which we currently use for rmb().
This fixes it by using the LWSYNC macro, which turns into a plain sync
on 32-bit machines.
Signed-off-by: Andrew Fleming <afleming@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
ata_dev_revalidate() isn't used outside of libata core. Unexport it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* 'release' of master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Correct definition of handle_IPI
[IA64] move SAL_CACHE_FLUSH check later in boot
[IA64] MCA recovery: Montecito support
[IA64] cpu-hotplug: Fixing confliction between CPU hot-add and IPI
[IA64] don't double >> PAGE_SHIFT pointer for /dev/kmem access
The check to see if the firmware drops interrupts during a
SAL_CACHE_FLUSH is done to early in the boot. SAL_CACHE_FLUSH expects
to be able to make PAL calls in virtual mode, on some cell based
machines a fault occurs causing a MCA. This patch moves the check
after mmu_context_init so the TLB and VHPT are properly setup.
Signed-off-by Troy Heber <troy.heber@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Since we already moved to GENERIC_TIME, we should implement alternatives
of old do_gettimeoffset routines to get sub-jiffies resolution from
gettimeofday(). This patch includes:
* MIPS clocksource support (based on works by Manish Lachwani).
* remove unused gettimeoffset routines and related codes.
* remove unised 64bit do_div64_32().
* simplify mips_hpt_init. (no argument needed, __init tag)
* simplify c0_hpt_timer_init. (no need to write to c0_count)
* remove some hpt_init routines.
* mips_hpt_mask variable to specify bitmask of hpt value.
* convert jmr3927_do_gettimeoffset to jmr3927_hpt_read.
* convert ip27_do_gettimeoffset to ip27_hpt_read.
* convert bcm1480_do_gettimeoffset to bcm1480_hpt_read.
* simplify sb1250 hpt functions. (no need to subtract and shift)
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This is the UML piece of the INITCALLS tidying.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Return the sq_draining value back to user space for query_qp instead
of the en_sqd_async notify value, which is valid only for
modify_qp. For query_qp, the draining status should returned.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The conversion from IPR-IRQ to IRQ-chip resulted in the
ipr data being allocated in a local variable in
make_ipr_irq - breaking anything using IPR interrupts.
This changes all of the callers of make_ipr_irq to
allocate a static structure containing the IPR data which
is then passed to make_ipr_irq. This removes the need for
make_ipr_irq to allocate any additional space for the IPR
information.
Signed-off-by: Jamie Lenehan <lenehan@twibble.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>