Commit Graph

42510 Commits

Author SHA1 Message Date
Ed L. Cashin
cf446f0dba aoe: eliminate goto and improve readability
Adam Richter suggested eliminating this goto.

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:31 -08:00
Ed L. Cashin
1eb0da4cea aoe: mac_addr: avoid 64-bit arch compiler warnings
By returning unsigned long long, mac_addr does not generate compiler warnings
on 64-bit architectures.

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:31 -08:00
Ed L. Cashin
68e0d42f39 aoe: handle multiple network paths to AoE device
A remote AoE device is something can process ATA commands and is identified by
an AoE shelf number and an AoE slot number.  Such a device might have more
than one network interface, and it might be reachable by more than one local
network interface.  This patch tracks the available network paths available to
each AoE device, allowing them to be used more efficiently.

Andrew Morton asked about the call to msleep_interruptible in the revalidate
function.  Yes, if a signal is pending, then msleep_interruptible will not
return 0.  That means we will not loop but will call aoenet_xmit with a NULL
skb, which is a noop.  If the system is too low on memory or the aoe driver is
too low on frames, then the user can hit control-C to interrupt the attempt to
do a revalidate.  I have added a comment to the code summarizing that.

Andrew Morton asked whether the allocation performed inside addtgt could use a
more relaxed allocation like GFP_KERNEL, but addtgt is called when the aoedev
lock has been locked with spin_lock_irqsave.  It would be nice to allocate the
memory under fewer restrictions, but targets are only added when the device is
being discovered, and if the target can't be added right now, we can try again
in a minute when then next AoE config query broadcast goes out.

Andrew Morton pointed out that the "too many targets" message could be printed
for failing GFP_ATOMIC allocations.  The last patch in this series makes the
messages more specific.

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:31 -08:00
Ed L. Cashin
8911ef4dc9 aoe: bring driver version number to 47
Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:31 -08:00
Nick Piggin
75acb9cd2e rd: support XIP
Support direct_access XIP method with brd.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:30 -08:00
Nick Piggin
9db5579be4 rewrite rd
This is a rewrite of the ramdisk block device driver.

The old one is really difficult because it effectively implements a block
device which serves data out of its own buffer cache.  It relies on the dirty
bit being set, to pin its backing store in cache, however there are non
trivial paths which can clear the dirty bit (eg.  try_to_free_buffers()),
which had recently lead to data corruption.  And in general it is completely
wrong for a block device driver to do this.

The new one is more like a regular block device driver.  It has no idea about
vm/vfs stuff.  It's backing store is similar to the buffer cache (a simple
radix-tree of pages), but it doesn't know anything about page cache (the pages
in the radix tree are not pagecache pages).

There is one slight downside -- direct block device access and filesystem
metadata access goes through an extra copy and gets stored in RAM twice.
However, this downside is only slight, because the real buffercache of the
device is now reclaimable (because we're not playing crazy games with it), so
under memory intensive situations, footprint should effectively be the same --
maybe even a slight advantage to the new driver because it can also reclaim
buffer heads.

The fact that it now goes through all the regular vm/fs paths makes it
much more useful for testing, too.

   text    data     bss     dec     hex filename
   2837     849     384    4070     fe6 drivers/block/rd.o
   3528     371      12    3911     f47 drivers/block/brd.o

Text is larger, but data and bss are smaller, making total size smaller.

A few other nice things about it:
- Similar structure and layout to the new loop device handlinag.
- Dynamic ramdisk creation.
- Runtime flexible buffer head size (because it is no longer part of the
  ramdisk code).
- Boot / load time flexible ramdisk size, which could easily be extended
  to a per-ramdisk runtime changeable size (eg. with an ioctl).
- Can use highmem for the backing store.

[akpm@linux-foundation.org: fix build]
[byron.bbradley@gmail.com: make rd_size non-static]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Byron Bradley <byron.bbradley@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:30 -08:00
David Howells
b920de1b77 mn10300: add the MN10300/AM33 architecture to the kernel
Add architecture support for the MN10300/AM33 CPUs produced by MEI to the
kernel.

This patch also adds board support for the ASB2303 with the ASB2308 daughter
board, and the ASB2305.  The only processor supported is the MN103E010, which
is an AM33v2 core plus on-chip devices.

[akpm@linux-foundation.org: nuke cvs control strings]
Signed-off-by: Masakazu Urade <urade.masakazu@jp.panasonic.com>
Signed-off-by: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:30 -08:00
David Howells
62fb44b962 usb: net2280 can't have a function called show_registers()
net2280 can't have a function called show_registers() because this can produce
a namespace clash with an arch function of the same name.

All this driver's functions and variables should really be prefixed with
"net2280_" to avoid such a problem in future.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Greg KH <greg@kroah.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:30 -08:00
David Howells
1eb1141123 aout: remove unnecessary inclusions of {asm, linux}/a.out.h
Remove now unnecessary inclusions of {asm,linux}/a.out.h.

[akpm@linux-foundation.org: fix alpha build]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:30 -08:00
Alan Cox
a46c999424 serial_core: bring mostly into line with coding style
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
b4c502a709 8250: enable rate reporting via termios
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
6f803cd08f serial8250: coding style
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
5756ee9996 8250_pci: coding style
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
3b0fd36ddc 8250_hub6: codding style
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
0fe1c13713 8250_hp300: coding style
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
8b4a483be5 8250_gsc: coding style
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
ce2e204f0c 8250_early: coding style
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
355d95a1c8 tty_ioctl: drag screaming into compliance with the coding style
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
37bdfb074e tty_io: drag screaming into coding style compliance
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
66c6ceae39 tty_audit: fix checkpatch complaint
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
4129a6454d rocket: don't let random users reset the controller
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
6df3526b66 rocket: first pass at termios reporting
Also removes a cflag comparison that caused some mode changes to get wrongly
ignored

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
4edf1827ea n_tty: clean up old code to follow coding style and (mostly) checkpatch
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:25 -08:00
Alan Cox
db1acaa632 moxa: first pass at termios reporting
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:24 -08:00
mark gross
d94afc6ccf intel-iommu: fault_reason index cleanup
Fix an off by one bug in the fault reason string reporting function, and
clean up some of the code around this buglet.

[akpm@linux-foundation.org: cleanup]
Signed-off-by: mark gross <mgross@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:24 -08:00
mark gross
f8bab73515 intel-iommu: PMEN support
Add support for protected memory enable bits by clearing them if they are
set at startup time.  Some future boot loaders or firmware could have this
bit set after it loads the kernel, and it needs to be cleared if DMA's are
going to happen effectively.

Signed-off-by: mark gross <mgross@intel.com>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:24 -08:00
Jerome Marchand
a890d62b9e Enhanced partition statistics: aoe fix
Updates the enhanced partition statistics in ATA over Ethernet driver
(not tested).

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2008-02-08 12:41:57 +01:00
Jesper Nilsson
5efa1d1c94 CRIS v10: drivers/net/cris/eth_v10.c rename LED defines to CRIS_LED to avoid name clash. 2008-02-08 11:16:44 +01:00
Benjamin Herrenschmidt
592a607bbc [POWERPC] Disable G5 NAP mode during SMU commands on U3
It appears that with the U3 northbridge, if the processor is in NAP
mode the whole time while waiting for an SMU command to complete,
then the SMU will fail.  It could be related to the weird backward
mechanism the SMU uses to get to system memory via i2c to the
northbridge that doesn't operate properly when the said bridge is
in napping along with the CPU.  That is on U3 at least, U4 doesn't
seem to be affected.

This didn't show before NO_HZ as the timer wakeup was enough to make
it work it seems, but that is no longer the case.

This fixes it by disabling NAP mode on those machines while
an SMU command is in flight.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-02-08 19:52:35 +11:00
Frank Seidel
882c49164d mmc: extend ricoh_mmc to support Ricoh RL5c476
This patch adds support for the Ricoh RL5c476 chip: with this
the mmc adapter that needs this disabler (R5C843) can also be
handled correctly when it sits on a RL5c476.

Signed-off-by: Frank Seidel <fseidel@suse.de>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-02-08 09:02:47 +01:00
David Brownell
6e996ee8e7 at91_mci: use generic GPIO calls
Update the AT91 MMC driver to use the generic GPIO calls instead of the
AT91-specific calls; and to request (and release) those GPIO signals.

That required updating the probe() fault cleanup codepaths.  Now there
is a single sequence for freeing resources, in reverse order of their
allocation.  Also that code uses use dev_*() for messaging, and has less
abuse of KERN_ERR.

Likewise with updating remove() cleanup.  This had to free the GPIOs,
and while adding that code I noticed and fixed two other problems:  it
was poking at a workqueue owned by the mmc core; and in one (rare)
case would try freeing an IRQ that it didn't allocate.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-02-08 09:02:47 +01:00
Feng Tang
541ceb5b8b sdhci: add num index for multi controllers case
Some devices have several controllers; need add the index info to
device slot name host->slot_desc[]

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-02-08 09:02:47 +01:00
Pierre Ossman
34671dc2e6 mmc: remove sdhci and mmc_spi experimental markers
Both of these drivers work well (although some hardware still has
its problems) and are not in the "alpha" quality that EXPERIMENTAL
suggests.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-02-08 09:02:46 +01:00
Philip Langdale
1f090bf524 mmc: Handle suspend/resume in Ricoh MMC disabler
As pci config space is reinitialised on a suspend/resume cycle, the
disabler needs to work its magic at resume time. For symmetry this
change also explicitly enables the controller at suspend time but
it's not strictly necessary.

Signed-off-by: Philipl Langdale <philipl@overt.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-02-08 09:02:46 +01:00
Len Brown
2e6c4e5101 Merge branches 'release', 'dmi' and 'misc' into release 2008-02-08 01:22:26 -05:00
Len Brown
4a507d93fa acer-wmi, tc1100-wmi: select ACPI_WMI
It is safe for these Kconfig entries to use select because
they select ACPI_WMI, which already has its dependencies
satisfied.  This makes Kconfig more user friendly, since
the user selects the driver they want and the dependency
is met for them.  Otherwise, the user would have to find
and enable ACPI_WMI to make enabling these drivers possible.

Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-08 00:37:16 -05:00
Carlos Corbacho
20b4514799 ACPI: WMI: Improve Kconfig description
As Pavel Machek has pointed out, the Kconfig entry for WMI is pretty
non-descriptive.

Rewrite it so that it explains what ACPI-WMI is, and why anyone
would want to enable it.

Many thanks to Ray Lee for ideas on this.

Signed-off-by: Carlos Corbacho <carlos@strangeworlds.co.uk>
CC: Pavel Machek <pavel@ucw.cz>
CC: Ray Lee <ray-lk@madrabbit.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-08 00:36:49 -05:00
Len Brown
446b1dfc4c ACPI: DMI: add Panasonic CF-52 and Thinpad X61
Add Lenovo X61
Add Panasonic Toughbook CF-52

Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-08 00:11:09 -05:00
Len Brown
543a956140 ACPI: thermal: syntax, spelling, kernel-doc
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-07 23:48:04 -05:00
Linus Torvalds
a4ffc0a0b2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: (44 commits)
  dm raid1: report fault status
  dm raid1: handle read failures
  dm raid1: fix EIO after log failure
  dm raid1: handle recovery failures
  dm raid1: handle write failures
  dm snapshot: combine consecutive exceptions in memory
  dm: stripe enhanced status return
  dm: stripe trigger event on failure
  dm log: auto load modules
  dm: move deferred bio flushing to workqueue
  dm crypt: use async crypto
  dm crypt: prepare async callback fn
  dm crypt: add completion for async
  dm crypt: add async request mempool
  dm crypt: extract scatterlist processing
  dm crypt: tidy io ref counting
  dm crypt: introduce crypt_write_io_loop
  dm crypt: abstract crypt_write_done
  dm crypt: store sector mapping in dm_crypt_io
  dm crypt: move queue functions
  ...
2008-02-07 19:30:50 -08:00
Linus Torvalds
d7511ec811 Merge branch 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6
* 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6: (59 commits)
  hwmon: (lm80) Add individual alarm files
  hwmon: (lm80) De-macro the sysfs callbacks
  hwmon: (lm80) Various cleanups
  hwmon: (w83627hf) Refactor beep enable handling
  hwmon: (w83627hf) Add individual alarm and beep files
  hwmon: (w83627hf) Enable VBAT monitoring
  hwmon: (w83627ehf) The W83627DHG has 8 VID pins
  hwmon: (asb100) Add individual alarm files
  hwmon: (asb100) De-macro the sysfs callbacks
  hwmon: (asb100) Various cleanups
  hwmon: VRM is not written to registers
  hwmon: (dme1737) fix Super-IO device ID override
  hwmon: (dme1737) fix divide-by-0
  hwmon: (abituguru3) Add AUX4 fan input for Abit IP35 Pro
  hwmon: Add support for Texas Instruments/Burr-Brown ADS7828
  hwmon: (adm9240) Add individual alarm files
  hwmon: (lm77) Add individual alarm files
  hwmon: Discard useless I2C driver IDs
  hwmon: (lm85) Make the pwmN_enable files writable
  hwmon: (lm85) Return standard values in pwmN_enable
  ...
2008-02-07 19:15:38 -08:00
Nick Piggin
a13ff0bb3f Convert SG from nopage to fault.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Douglas Gilbert <dougg@torque.net>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 19:09:22 -08:00
Sam Ravnborg
054b0e2b2d [ISDN]: fix section mismatch warning in enpci_card_msg
Fix following warnings:
WARNING: drivers/isdn/hisax/built-in.o(.text+0x3cf50): Section mismatch in reference from the function enpci_card_msg() to the function .devinit.text:Amd7930_init()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x3cf85): Section mismatch in reference from the function enpci_card_msg() to the function .devinit.text:Amd7930_init()

enpci_card_msg() can be called outside __devinit context
referenced function should not be annotated __devinit.

Remove annotation of Amd7930_init to fix this.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:20:29 -08:00
Jonathan Brassow
af195ac82e dm raid1: report fault status
This patch adds extra information to the mirror status output, so that
it can be determined which device(s) have failed.  For each mirror device,
a character is printed indicating the most severe error encountered.  The
characters are:
 *    A => Alive - No failures
 *    D => Dead - A write failure occurred leaving mirror out-of-sync
 *    S => Sync - A sychronization failure occurred, mirror out-of-sync
 *    R => Read - A read failure occurred, mirror data unaffected
This allows userspace to properly reconfigure the mirror set.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-02-08 02:11:39 +00:00
Jonathan Brassow
06386bbfd2 dm raid1: handle read failures
This patch gives the ability to respond-to/record device failures
that happen during read operations.  It also adds the ability to
read from mirror devices that are not the primary if they are
in-sync.

There are essentially two read paths in mirroring; the direct path
and the queued path.  When a read request is mapped, if the region
is 'in-sync' the direct path is taken; otherwise the queued path
is taken.

If the direct path is taken, we must record bio information so that
if the read fails we can retry it.  We then discover the status of
a direct read through mirror_end_io.  If the read has failed, we will
mark the device from which the read was attempted as failed (so we
don't try to read from it again), restore the bio and try again.

If the queued path is taken, we discover the results of the read
from 'read_callback'.  If the device failed, we will mark the device
as failed and attempt the read again if there is another device
where this region is known to be 'in-sync'.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-02-08 02:11:37 +00:00
Jonathan Brassow
b80aa7a0c2 dm raid1: fix EIO after log failure
This patch adds the ability to requeue write I/O to
core device-mapper when there is a log device failure.

If a write to the log produces and error, the pending writes are
put on the "failures" list.  Since the log is marked as failed,
they will stay on the failures list until a suspend happens.

Suspends come in two phases, presuspend and postsuspend.  We must
make sure that all the writes on the failures list are requeued
in the presuspend phase (a requirement of dm core).  This means
that recovery must be complete (because writes may be delayed
behind it) and the failures list must be requeued before we
return from presuspend.

The mechanisms to ensure recovery is complete (or stopped) was
already in place, but needed to be moved from postsuspend to
presuspend.  We rely on 'flush_workqueue' to ensure that the
mirror thread is complete and therefore, has requeued all writes
in the failures list.

Because we are using flush_workqueue, we must ensure that no
additional 'queue_work' calls will produce additional I/O
that we need to requeue (because once we return from
presuspend, we are unable to do anything about it).  'queue_work'
is called in response to the following functions:
- complete_resync_work = NA, recovery is stopped
- rh_dec (mirror_end_io) = NA, only calls 'queue_work' if it
                           is ready to recover the region
                           (recovery is stopped) or it needs
                           to clear the region in the log*
                           **this doesn't get called while
                           suspending**
- rh_recovery_end = NA, recovery is stopped
- rh_recovery_start = NA, recovery is stopped
- write_callback = 1) Writes w/o failures simply call
                   bio_endio -> mirror_end_io -> rh_dec
                   (see rh_dec above)
                   2) Writes with failures are put on
                   the failures list and queue_work is
                   called**
                   ** write_callbacks don't happen
                   during suspend **
- do_failures = NA, 'queue_work' not called if suspending
- add_mirror (initialization) = NA, only done on mirror creation
- queue_bio = NA, 1) delayed I/O scheduled before flush_workqueue
              is called.  2) No more I/Os are being issued.
              3) Re-attempted READs can still be handled.
              (Write completions are handled through rh_dec/
              write_callback - mention above - and do not
              use queue_bio.)

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-02-08 02:11:35 +00:00
Jonathan Brassow
8f0205b798 dm raid1: handle recovery failures
This patch adds the calls to 'fail_mirror' if an error occurs during
mirror recovery (aka resynchronization).  'fail_mirror' is responsible
for recording the type of error by mirror device and ensuring an event
gets raised for the purpose of notifying userspace.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-02-08 02:11:32 +00:00
Jonathan Brassow
72f4b31410 dm raid1: handle write failures
This patch gives mirror the ability to handle device failures
during normal write operations.

The 'write_callback' function is called when a write completes.
If all the writes failed or succeeded, we report failure or
success respectively.  If some of the writes failed, we call
fail_mirror; which increments the error count for the device, notes
the type of error encountered (DM_RAID1_WRITE_ERROR),  and
selects a new primary (if necessary).  Note that the primary
device can never change while the mirror is not in-sync (IOW,
while recovery is happening.)  This means that the scenario
where a failed write changes the primary and gives
recovery_complete a chance to misread the primary never happens.
The fact that the primary can change has necessitated the change
to the default_mirror field.  We need to protect against reading
garbage while the primary changes.  We then add the bio to a new
list in the mirror set, 'failures'.  For every bio in the 'failures'
list, we call a new function, '__bio_mark_nosync', where we mark
the region 'not-in-sync' in the log and properly set the region
state as, RH_NOSYNC.  Userspace must also be notified of the
failure.  This is done by 'raising an event' (dm_table_event()).
If fail_mirror is called in process context the event can be raised
right away.  If in interrupt context, the event is deferred to the
kmirrord thread - which raises the event if 'event_waiting' is set.

Backwards compatibility is maintained by ignoring errors if
the DM_FEATURES_HANDLE_ERRORS flag is not present.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-02-08 02:11:29 +00:00
Milan Broz
d74f81f8ad dm snapshot: combine consecutive exceptions in memory
Provided sector_t is 64 bits, reduce the in-memory footprint of the
snapshot exception table by the simple method of using unused bits of
the chunk number to combine consecutive entries.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-02-08 02:11:27 +00:00
Brian Wood
4f7f5c675f dm: stripe enhanced status return
This patch adds additional information to the status line. It is added at the
end of the returned text so it will not interfere with existing
implementations using this data. The addition of this information will allow
for a common return interface to match that returned with the dm-raid1.c
status line (with Jonathan Brassow's patches).

Here is a sample of what is returned with a mirror "status" call:
isw_eeaaabgfg_mirror: 0 488390920 mirror 2 8:16 8:32 3727/3727 1 AA 1 core

Here's what's returned with this patch for a stripe "status" call:
isw_dheeijjdej_stripe: 0 976783872 striped 2 8:16 8:32 1 AA

Signed-off-by: Brian Wood <brian.j.wood@intel.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-02-08 02:11:24 +00:00