Commit Graph

75319 Commits

Author SHA1 Message Date
Benjamin Herrenschmidt
69c0785112 [POWERPC] 4xx: PCI support for Ebony board
This wires up the 4xx PCI support & device tree bits for
440GP based Ebony platform.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-12-23 13:12:52 -06:00
Benjamin Herrenschmidt
a2d2e1ec07 [POWERPC] 4xx: PLB to PCI Express support
This adds to the previous 2 patches the support for the 4xx PCI Express
cells as found in the 440SPe revA, revB and 405EX.

Unfortunately, due to significant differences between these, and other
interesting "features" of those pieces of HW, the code isn't as simple
as it is for PCI and PCI-X and some of the functions differ significantly
between the 3 implementations. Thus, not only this code can only support
those 3 implementations for now and will refuse to operate on any other,
but there are added ifdef's to avoid the bloat of building a fairly large
amount of code on platforms that don't need it.

Also, this code currently only supports fully initializing root complex
nodes, not endpoint. Some more code will have to be lifted from the
arch/ppc implementation to add the endpoint support, though it's mostly
differences in memory mapping, and the question on how to represent
endpoint mode PCI in the device-tree is thus open.

Many thanks to Stefan Roese for testing & fixing up the 405EX bits !

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-12-23 13:12:34 -06:00
Benjamin Herrenschmidt
c839e0eff5 [POWERPC] 4xx: PLB to PCI 2.x support
This adds to the previous patch the support for the 4xx PCI 2.x
bridges.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-12-23 13:12:27 -06:00
Benjamin Herrenschmidt
5738ec6d00 [POWERPC] 4xx: PLB to PCI-X support
This adds base support code for the 4xx PCI-X bridge. It also provides
placeholders for the PCI and PCI-E version but they aren't supported
with this patch.

The bridges are configured based on device-tree properties.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-12-23 13:12:20 -06:00
Benjamin Herrenschmidt
0e6140a56f [POWERPC] 4xx: Improve support for 4xx indirect DCRs
Accessing indirect DCRs is done via a pair of address/data DCRs.

Such accesses are thus inherently racy, vs. interrupts, preemption
and possibly SMP if 4xx SMP cores are ever used.

This updates the mfdcri/mtdcri macros in dcr-native.h (which were
so far unused) to use a spinlock.

In addition, add some common definitions to a new dcr-regs.h file.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-12-23 13:12:11 -06:00
Benjamin Herrenschmidt
47c0bd1ae2 [POWERPC] Reworking machine check handling and Fix 440/440A
This adds a cputable function pointer for the CPU-side machine
check handling. The semantic is still the same as the old one,
the one in ppc_md. overrides the one in cputable, though
ultimately we'll want to change that so the CPU gets first.

This removes CONFIG_440A which was a problem for multiplatform
kernels and instead fixes up the IVOR at runtime from a setup_cpu
function. The "A" version of the machine check also tweaks the
regs->trap value to differenciate the 2 versions at the C level.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-12-23 13:11:59 -06:00
Paul Mackerras
c2a7dcad9f Merge branch 'linux-2.6' 2007-12-21 22:21:08 +11:00
Stephen Rothwell
373a6da165 [POWERPC] Make non-PCI build work again
Maple and pasemi both require PCI as does CONFIG_OF_PLATFORM_PCI.
The default setting of CONFIG_ISA_DMA_API is set to match the protection
around the relevant routines in asm/dma.h.

I also had to remove the PMAC platform from the combined build.  The
precis is that to build a 64 bit kernel with no PCI, you can only include
pSeries and iSeries.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 22:14:08 +11:00
Stephen Rothwell
70fbb93883 [POWERPC] Pointers marked as __iomem do not need to be volatile
Fixes this warning:

arch/powerpc/platforms/powermac/pci.c: In function 'u3_ht_cfg_access':
arch/powerpc/platforms/powermac/pci.c:354: warning: return discards qualifiers from pointer target type
arch/powerpc/platforms/powermac/pci.c:358: warning: return discards qualifiers from pointer target type

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 22:14:07 +11:00
Stephen Rothwell
b91bdd1517 [POWERPC] Constify the of_device_id passed to of_platform_bus_probe
This will allow us to declare const all the statically declared arrrays
of these.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 22:14:07 +11:00
Stephen Rothwell
92d1616ec0 [POWERPC] The builtin matches for ibmebus.c can be __initdata
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 22:14:07 +11:00
Stephen Rothwell
1ce890e036 [POWERPC] Add EHEA and EHCA as modules in the ppc64_defconfig
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 22:14:07 +11:00
Benjamin Herrenschmidt
b1b166b7ea [POWERPC] Fix possible NULL deref in ppc32 PCI
The 32-bit PCI code tests if "bus" is non-NULL after calling
pci_scan_bus_parented() in one place but not another before
dereferencing it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 22:14:07 +11:00
Benjamin Herrenschmidt
0094f2cdcf [POWERPC] Fix for via-pmu based backlight control
This fixes a few issues with via-pmu based backlight control.

First, it fixes a sign problem with the setup of the backlight
curve since the `range' value there -can- (and will) go negative.

Then, it reworks the interaction between this and the via-pmu sleep
code to properly restore backlight on wakeup from sleep.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 22:14:07 +11:00
Scott Wood
7ac5dde99e [POWERPC] Implement arch disable/enable irq hooks.
These hooks ensure that a decrementer interrupt is not pending when
suspending; otherwise, problems may occur on 6xx/7xx/7xxx-based
systems (except for powermacs, which use a separate suspend path).
For example, with deep sleep on the 831x, a pending decrementer will
cause a system freeze because the SoC thinks the decrementer interrupt
would have woken the system, but the core must have interrupts
disabled due to the setup required for deep sleep.

Changed via-pmu.c to use the new ppc_md hooks, and made the arch_*
functions call the generic_* functions unconditionally.  -- paulus

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 22:13:35 +11:00
Jeremy Kerr
cbea92383d [POWERPC] spufs: Don't leak kernel stack through an empty {i,m}box_info read
Based on an original patch from Arnd Bergmann
<arnd.bergmann@de.ibm.com>

If there's no entry in the mailbox, then a read on the _info file will
return data from an uninitialised variable.

This change returns EOF if there's no mailbox info available instead.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:22 +11:00
Andre Detsch
18789fb1c3 [POWERPC] spufs: DMA Restart after SIGSEGV
This fixes the behavior of spufs when a spu tries a DMA operation
based on a wrong / unavailable address.

Instead of just generating a SIGBUS signal, spufs now
generates a SIGSEGV signal and restarts the problematic DMA operation
after the execution of the application's signal handler.  This allows
applications to employ user-level paging systems.

Although the restart_dma function is called before the application's
signal handler, the operation is not actually performed at this time,
since the spu context is already stopped.  The operation only takes
place when spu_run is restarted (which happens automatically).

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:21 +11:00
Aegis Lin
90608a2928 [POWERPC] spufs: Use separate timer for /proc/spu_loadavg calculation
The original spusched_timer was designed to take effect only when
a context is waiting in the runqueue.

This change adds an additional lower-freq timer has been added to
purely handle the spu_load updates. The new timer will be triggered
per LOAD_FREQ ticks.

Signed-off-by: Aegis Lin <aegislin@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:21 +11:00
Christoph Hellwig
c9101bdb1b [POWERPC] spufs: make state_mutex interruptible
Make most places that use spu_acquire/spu_acquire_saved interruptible,
this allows getting out of the spufs code when e.g. pressing ctrl+c.
There are a few places where we get called e.g. from spufs teardown
routines were we can't simply err out so these are left with a comment.
For now I've also not touched the poll routines because it's open what
libspe would expect in terms of interrupted system calls.

Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:21 +11:00
Christoph Hellwig
197b1a8263 [POWERPC] spufs: add enchanced simple attr macros
The simple attr macros currently used by spufs can't deal with the
handlers returning errors, which is required to make the state_mutex
interruptible.  This adds a local copy that allows for an error
return from the get/set handlers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:21 +11:00
Luke Browning
e65c2f6fce [POWERPC] spufs: decouple spu scheduler from spufs_spu_run (asynchronous scheduling)
Change spufs_spu_run so that the context is queued directly to the
scheduler and the controlling thread advances directly to spufs_wait()
for spe errors and exceptions.

nosched contexts are treated the same as before.

Fixes from Christoph Hellwig <hch@lst.de>

Signed-off-by: Luke Browning <lukebr@linux.vnet.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:21 +11:00
Masato Noguchi
9476141c18 [POWERPC] spufs: don't set reserved bits in spu interrupt status
This changes the spu context switch code to not write to reserved bits
of spu interrupt status register.
The architecture book says the reserved fields should be set to zero.

Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:20 +11:00
Luke Browning
b192541b39 [POWERPC] spufs: spu_find_victim may choose wrong victim
Need to re-check priority after dropping lock.  Otherwise, a
more favored context may be preempted.

Signed-off-by: Luke Browning <lukebr@linux.vnet.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:20 +11:00
Luke Browning
91569531d1 [POWERPC] spufs: reorganize spu_run_init
This cleans up spu_run_init so that it does all of the spu
initialization for spufs_run_spu.  It initializes the spu context as
much as possible before it activates the spu and writes the runcntl
register.

Signed-off-by: Luke Browning <lukebr@linux.vnet.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:20 +11:00
Jeremy Kerr
d6ad39bc53 [POWERPC] spufs: rework class 0 and 1 interrupt handling
Based on original patches from
 Arnd Bergmann <arnd.bergman@de.ibm.com>; and
 Luke Browning <lukebr@linux.vnet.ibm.com>

Currently, spu contexts need to be loaded to the SPU in order to take
class 0 and class 1 exceptions.

This change makes the actual interrupt-handlers much simpler (ie,
set the exception information in the context save area), and defers the
handling code to the spufs_handle_class[01] functions, called from
spufs_run_spu.

This should improve the concurrency of the spu scheduling leading to
greater SPU utilization when SPUs are overcommited.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:20 +11:00
Jeremy Kerr
8af30675c3 [POWERPC] spufs: use #defines for SPU class [012] exception status
Add a few #defines for the class 0, 1 and 2 interrupt status bits, and
use them instead of magic numbers when we're setting or checking for
these interrupts.

Also, add a #define for the class 2 mailbox threshold interrupt mask.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:20 +11:00
Jeremy Kerr
c40aa47104 [POWERPC] spufs: fix incorrect interrupt status clearing in backing mbox stat poll
When doing a poll on the mbox stat file of a swapped-out context, we
clear the class 0 interrupt status, rather than the class 2 interrupt
status.

This change corrects the poll operation to clear the correct interrupt.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:19 +11:00
Luke Browning
cc210b3ec5 [POWERPC] spufs: add backing ops for privcntl register
This change encapsulates the spu_privcntl_RW register so that it can
be written through backing ops.  This is necessary so that spu contexts
can be initialized and queued to the scheduler in spufs_run_spu.

Signed-off-by: Luke Browning <lukebr@linux.vnet.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:19 +11:00
Arnd Bergmann
33bfd7a738 [POWERPC] spufs: block fault handlers in spu_acquire_runnable
This change disables the logic that faults-in spu contexts under the
covers from the page fault handler.  When a fault requires a runnable
context, the handler will block until the context is scheduled by
other means.

Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:19 +11:00
Jeremy Kerr
7cd58e4381 [POWERPC] spufs: move fault, lscsa_alloc and switch code to spufs module
Currently, part of the spufs code (switch.o, lscsa_alloc.o and fault.o)
is compiled directly into the kernel.

This change moves these components of spufs into the kernel.

The lscsa and switch objects are fairly straightforward to move in.

For the fault.o module, we split the fault-handling code into two
parts: a/p/p/c/spu_fault.c and a/p/p/c/spufs/fault.c. The former is for
the in-kernel spu_handle_mm_fault function, and we move the rest of the
fault-handling code into spufs.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:19 +11:00
Julio M. Merino Vidal
9b1d21f858 [POWERPC] spufs: fix typos in sched.c comments
Fix a few typos in the spufs scheduler comments

Signed-off-by: Julio M. Merino Vidal <jmerino@ac.upc.edu>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:46:18 +11:00
Masato Noguchi
c25620d766 [POWERPC] cell: wrap master run control bit
Add platform specific SPU run control routines to the spufs.  The current
spufs implementation uses the SPU master run control bit (MFC_SR1[S]) to
control SPE execution, but the PS3 hypervisor does not support the use of
this feature.

This change adds the run control wrapper routies spu_enable_spu() and
spu_disable_spu().  The bare metal routines use the master run control
bit, and the PS3 specific routines use the priv2 run control register.

An outstanding enhancement for the PS3 would be to add a guard to check
for incorrect access to the spu problem state when the spu context is
disabled.  This check could be implemented with a flag added to the spu
context that would inhibit mapping problem state pages, and a routine
to unmap spu problem state pages.  When the spu is enabled with
ps3_enable_spu() the flag would be set allowing pages to be mapped,
and when the spu is disabled with ps3_disable_spu() the flag would be
cleared and mapped problem state pages would be unmapped.

Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 19:45:05 +11:00
Emil Medve
eda09fbdcd [POWERPC] Optimize counting distinct entries in the relocation sections
When a module has relocation sections with tens of thousands of
entries, counting the distinct/unique entries only (i.e. no
duplicates) at load time can take tens of seconds and up to minutes.
The sore point is the count_relocs() function which is called as part
of the architecture specific module loading processing path:

	-> load_module()			generic
	   -> module_frob_arch_sections()	arch specific
	      -> get_plt_size()		32-bit
	      -> get_stubs_size()	64-bit
		 -> count_relocs()

Here count_relocs is being called to find out how many distinct
targets of R_PPC_REL24 relocations there are, since each distinct
target needs a PLT entry or a stub created for it.

The previous counting algorithm has O(n^2) complexity.  Basically two
solutions were proposed on the e-mail list: a hash based approach and
a sort based approach.

The hash based approach is the fastest (O(n)) but the has it needs
additional memory and for certain corner cases it could take lots of
memory due to the degeneration of the hash.  One such proposal was
submitted here:

http://ozlabs.org/pipermail/linuxppc-dev/2007-June/037641.html

The sort based approach is slower (O(n * log n + n)) but if the
sorting is done "in place" it doesn't need additional memory.
This has O(n + n * log n) complexity with no additional memory
requirements.

This commit implements the in-place sort option.

Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-12-21 15:05:58 +11:00
Linus Torvalds
ea67db4cdb Linux 2.6.24-rc6 2007-12-20 17:25:48 -08:00
Linus Torvalds
4bde57094b Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: intel_cacheinfo.c: cpu cache info entry for Intel Tolapai
  x86: fix die() to not be preemptible
2007-12-20 17:02:37 -08:00
Linus Torvalds
2b5baad165 Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
  [XFS] Initialise current offset in xfs_file_readdir correctly
  [XFS] Fix mknod regression
2007-12-20 17:02:22 -08:00
Lachlan McIlroy
4743e0ec12 [XFS] Initialise current offset in xfs_file_readdir correctly
After reading the directory contents into the temporary buffer, we grab
each dirent and pass it to filldir witht eh current offset of the dirent.
The current offset was not being set for the first dirent in the temporary
buffer, which coul dresult in bad offsets being set in the f_pos field
result in looping and duplicate entries being returned from readdir.

SGI-PV: 974905
SGI-Modid: xfs-linux-melb:xfs-kern:30282a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2007-12-21 11:40:05 +11:00
Christoph Hellwig
bad60fdd14 [XFS] Fix mknod regression
This was broken by my '[XFS] simplify xfs_create/mknod/symlink prototype',
which assigned the re-shuffled ondisk dev_t back to the rdev variable in
xfs_vn_mknod. Because of that i_rdev is set to the ondisk dev_t instead of
the linux dev_t later down the function.

Fortunately the fix for it is trivial: we can just remove the assignment
because xfs_revalidate_inode has done the proper job before unlocking the
inode.

SGI-PV: 974873
SGI-Modid: xfs-linux-melb:xfs-kern:30273a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2007-12-21 11:39:58 +11:00
Jason Gaston
04fa11ea17 x86: intel_cacheinfo.c: cpu cache info entry for Intel Tolapai
This patch adds a cpu cache info entry for the Intel Tolapai cpu.

Signed-off-by: Jason Gaston <jason.d.gaston@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-21 01:27:19 +01:00
Ingo Molnar
c0a698b744 x86: fix die() to not be preemptible
Andrew "Eagle Eye" Morton noticed that we use raw_local_save_flags()
instead of raw_local_irq_save(flags) in die(). This allows the
preemption of oopsing contexts - which is highly undesirable. It also
causes CONFIG_DEBUG_PREEMPT to complain, as reported by Miles Lane.

this bug was introduced via:

  commit 39743c9ef7
  Author: Andi Kleen <ak@suse.de>
  Date:   Fri Oct 19 20:35:03 2007 +0200

      x86: use raw locks during oopses

-               spin_lock_irqsave(&die.lock, flags);
+               __raw_spin_lock(&die.lock);
+               raw_local_save_flags(flags);

that is not a correct open-coding of spin_lock_irqsave(): both the
ordering is wrong (irqs should be disabled _first_), and the wrong
flags-saving API was used.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-12-21 01:27:19 +01:00
Linus Torvalds
faa4877f02 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  debug: add end-of-oops marker
  sched: rt: account the cpu time during the tick
2007-12-20 11:53:41 -08:00
Linus Torvalds
17eb2c3b56 Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
  dm crypt: use bio_add_page
  dm: merge max_hw_sector
  dm: trigger change uevent on rename
  dm crypt: fix write endio
  dm mpath: hp requires scsi
  dm: table detect io beyond device
2007-12-20 11:25:22 -08:00
Milan Broz
91e1062592 dm crypt: use bio_add_page
Fix possible max_phys_segments violation in cloned dm-crypt bio.

In write operation dm-crypt needs to allocate new bio request
and run crypto operation on this clone. Cloned request has always
the same size, but number of physical segments can be increased
and violate max_phys_segments restriction.

This can lead to data corruption and serious hardware malfunction.
This was observed when using XFS over dm-crypt and at least
two HBA controller drivers (arcmsr, cciss) recently.

Fix it by using bio_add_page() call (which tests for other
restrictions too) instead of constructing own biovec.

All versions of dm-crypt are affected by this bug.

Cc: stable@kernel.org
Cc:  dm-crypt@saout.de
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-12-20 17:32:13 +00:00
Neil Brown
91212507f9 dm: merge max_hw_sector
Make sure dm honours max_hw_sectors of underlying devices

  We still have no firm testing evidence in support of this patch but
  believe it may help to resolve some bug reports.  - agk

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-12-20 17:32:12 +00:00
Alasdair G Kergon
69267a30be dm: trigger change uevent on rename
Insert a missing KOBJ_CHANGE notification when a device is renamed.

Cc: Scott James Remnant <scott@ubuntu.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-12-20 17:32:11 +00:00
Milan Broz
adfe47702c dm crypt: fix write endio
Fix BIO_UPTODATE test for write io.

Cc: stable@kernel.org
Cc: dm-crypt@saout.de
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-12-20 17:32:10 +00:00
Paul Mundt
d1622e8909 dm mpath: hp requires scsi
With CONFIG_SCSI=n __scsi_print_sense() is never linked in.

drivers/built-in.o: In function `hp_sw_end_io':
dm-mpath-hp-sw.c:(.text+0x914f8): undefined reference to `__scsi_print_sense'

Caught with a randconfig on current git.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-12-20 17:32:09 +00:00
Jun'ichi Nomura
512875bd96 dm: table detect io beyond device
This patch fixes a panic on shrinking a DM device if there is
outstanding I/O to the part of the device that is being removed.
(Normally this doesn't happen - a filesystem would be resized first,
for example.)

The bug is that __clone_and_map() assumes dm_table_find_target()
always returns a valid pointer.  It may fail if a bio arrives from the
block layer but its target sector is no longer included in the DM
btree.

This patch appends an empty entry to table->targets[] which will
be returned by a lookup beyond the end of the device.

After calling dm_table_find_target(), __clone_and_map() and target_message()
check for this condition using
dm_target_is_valid().

Sample test script to trigger oops:
2007-12-20 17:32:08 +00:00
Ivan Kokshaysky
3c378158d4 mm: fix exit_mmap BUG() on a.out binary exit
The problem was introduced by commit "mm: variable length argument
support" (b6a2fea393)
as it didn't update fs/binfmt_aout.c like other binfmt's.

I noticed that on alpha when accidentally launched old OSF/1
Acrobat Reader binary. Obviously, other architectures are affected
as well.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Ollie Wild <aaw@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-20 07:49:53 -08:00
Arjan van de Ven
2c3b20e91f debug: add end-of-oops marker
Right now it's nearly impossible for parsers that collect kernel crashes
from logs or emails (such as www.kerneloops.org) to detect the
end-of-oops condition. In addition, it's not currently possible to
detect whether or not 2 oopses that look alike are actually the same
oops reported twice, or are truly two unique oopses.

This patch adds an end-of-oops marker, and makes the end marker include
a very simple 64-bit random ID to be able to detect duplicate reports.

Normally, this ID is calculated as a late_initcall() (in the hope that
at that time there is enough entropy to get a unique enough ID); however
for early oopses the oops_exit() function needs to generate the ID on
the fly.

We do this all at the _end_ of an oops printout, so this does not impact
our ability to get the most important portions of a crash out to the
console first.

[ Sidenote: the already existing oopses-since-bootup counter we print
  during crashes serves as the differentiator between multiple oopses
  that trigger during the same bootup. ]

Tested on 32-bit and 64-bit x86. Artificially injected very early
crashes as well, as expected they result in this constant ID after
multiple bootups:

  ---[ end trace ca143223eefdc828 ]---
  ---[ end trace ca143223eefdc828 ]---

because the random pools are still all zero. But it all still works
fine and causes no additional problems (which is the main goal of
instrumentation code).

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-12-20 15:01:17 +01:00