A remote AoE device is something can process ATA commands and is identified by
an AoE shelf number and an AoE slot number. Such a device might have more
than one network interface, and it might be reachable by more than one local
network interface. This patch tracks the available network paths available to
each AoE device, allowing them to be used more efficiently.
Andrew Morton asked about the call to msleep_interruptible in the revalidate
function. Yes, if a signal is pending, then msleep_interruptible will not
return 0. That means we will not loop but will call aoenet_xmit with a NULL
skb, which is a noop. If the system is too low on memory or the aoe driver is
too low on frames, then the user can hit control-C to interrupt the attempt to
do a revalidate. I have added a comment to the code summarizing that.
Andrew Morton asked whether the allocation performed inside addtgt could use a
more relaxed allocation like GFP_KERNEL, but addtgt is called when the aoedev
lock has been locked with spin_lock_irqsave. It would be nice to allocate the
memory under fewer restrictions, but targets are only added when the device is
being discovered, and if the target can't be added right now, we can try again
in a minute when then next AoE config query broadcast goes out.
Andrew Morton pointed out that the "too many targets" message could be printed
for failing GFP_ATOMIC allocations. The last patch in this series makes the
messages more specific.
Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a rewrite of the ramdisk block device driver.
The old one is really difficult because it effectively implements a block
device which serves data out of its own buffer cache. It relies on the dirty
bit being set, to pin its backing store in cache, however there are non
trivial paths which can clear the dirty bit (eg. try_to_free_buffers()),
which had recently lead to data corruption. And in general it is completely
wrong for a block device driver to do this.
The new one is more like a regular block device driver. It has no idea about
vm/vfs stuff. It's backing store is similar to the buffer cache (a simple
radix-tree of pages), but it doesn't know anything about page cache (the pages
in the radix tree are not pagecache pages).
There is one slight downside -- direct block device access and filesystem
metadata access goes through an extra copy and gets stored in RAM twice.
However, this downside is only slight, because the real buffercache of the
device is now reclaimable (because we're not playing crazy games with it), so
under memory intensive situations, footprint should effectively be the same --
maybe even a slight advantage to the new driver because it can also reclaim
buffer heads.
The fact that it now goes through all the regular vm/fs paths makes it
much more useful for testing, too.
text data bss dec hex filename
2837 849 384 4070 fe6 drivers/block/rd.o
3528 371 12 3911 f47 drivers/block/brd.o
Text is larger, but data and bss are smaller, making total size smaller.
A few other nice things about it:
- Similar structure and layout to the new loop device handlinag.
- Dynamic ramdisk creation.
- Runtime flexible buffer head size (because it is no longer part of the
ramdisk code).
- Boot / load time flexible ramdisk size, which could easily be extended
to a per-ramdisk runtime changeable size (eg. with an ioctl).
- Can use highmem for the backing store.
[akpm@linux-foundation.org: fix build]
[byron.bbradley@gmail.com: make rd_size non-static]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Byron Bradley <byron.bbradley@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add architecture support for the MN10300/AM33 CPUs produced by MEI to the
kernel.
This patch also adds board support for the ASB2303 with the ASB2308 daughter
board, and the ASB2305. The only processor supported is the MN103E010, which
is an AM33v2 core plus on-chip devices.
[akpm@linux-foundation.org: nuke cvs control strings]
Signed-off-by: Masakazu Urade <urade.masakazu@jp.panasonic.com>
Signed-off-by: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
net2280 can't have a function called show_registers() because this can produce
a namespace clash with an arch function of the same name.
All this driver's functions and variables should really be prefixed with
"net2280_" to avoid such a problem in future.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Greg KH <greg@kroah.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix an off by one bug in the fault reason string reporting function, and
clean up some of the code around this buglet.
[akpm@linux-foundation.org: cleanup]
Signed-off-by: mark gross <mgross@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for protected memory enable bits by clearing them if they are
set at startup time. Some future boot loaders or firmware could have this
bit set after it loads the kernel, and it needs to be cleared if DMA's are
going to happen effectively.
Signed-off-by: mark gross <mgross@intel.com>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It appears that with the U3 northbridge, if the processor is in NAP
mode the whole time while waiting for an SMU command to complete,
then the SMU will fail. It could be related to the weird backward
mechanism the SMU uses to get to system memory via i2c to the
northbridge that doesn't operate properly when the said bridge is
in napping along with the CPU. That is on U3 at least, U4 doesn't
seem to be affected.
This didn't show before NO_HZ as the timer wakeup was enough to make
it work it seems, but that is no longer the case.
This fixes it by disabling NAP mode on those machines while
an SMU command is in flight.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds support for the Ricoh RL5c476 chip: with this
the mmc adapter that needs this disabler (R5C843) can also be
handled correctly when it sits on a RL5c476.
Signed-off-by: Frank Seidel <fseidel@suse.de>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Update the AT91 MMC driver to use the generic GPIO calls instead of the
AT91-specific calls; and to request (and release) those GPIO signals.
That required updating the probe() fault cleanup codepaths. Now there
is a single sequence for freeing resources, in reverse order of their
allocation. Also that code uses use dev_*() for messaging, and has less
abuse of KERN_ERR.
Likewise with updating remove() cleanup. This had to free the GPIOs,
and while adding that code I noticed and fixed two other problems: it
was poking at a workqueue owned by the mmc core; and in one (rare)
case would try freeing an IRQ that it didn't allocate.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Some devices have several controllers; need add the index info to
device slot name host->slot_desc[]
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Both of these drivers work well (although some hardware still has
its problems) and are not in the "alpha" quality that EXPERIMENTAL
suggests.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
As pci config space is reinitialised on a suspend/resume cycle, the
disabler needs to work its magic at resume time. For symmetry this
change also explicitly enables the controller at suspend time but
it's not strictly necessary.
Signed-off-by: Philipl Langdale <philipl@overt.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
It is safe for these Kconfig entries to use select because
they select ACPI_WMI, which already has its dependencies
satisfied. This makes Kconfig more user friendly, since
the user selects the driver they want and the dependency
is met for them. Otherwise, the user would have to find
and enable ACPI_WMI to make enabling these drivers possible.
Signed-off-by: Len Brown <len.brown@intel.com>
As Pavel Machek has pointed out, the Kconfig entry for WMI is pretty
non-descriptive.
Rewrite it so that it explains what ACPI-WMI is, and why anyone
would want to enable it.
Many thanks to Ray Lee for ideas on this.
Signed-off-by: Carlos Corbacho <carlos@strangeworlds.co.uk>
CC: Pavel Machek <pavel@ucw.cz>
CC: Ray Lee <ray-lk@madrabbit.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Fix following warnings:
WARNING: drivers/isdn/hisax/built-in.o(.text+0x3cf50): Section mismatch in reference from the function enpci_card_msg() to the function .devinit.text:Amd7930_init()
WARNING: drivers/isdn/hisax/built-in.o(.text+0x3cf85): Section mismatch in reference from the function enpci_card_msg() to the function .devinit.text:Amd7930_init()
enpci_card_msg() can be called outside __devinit context
referenced function should not be annotated __devinit.
Remove annotation of Amd7930_init to fix this.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds extra information to the mirror status output, so that
it can be determined which device(s) have failed. For each mirror device,
a character is printed indicating the most severe error encountered. The
characters are:
* A => Alive - No failures
* D => Dead - A write failure occurred leaving mirror out-of-sync
* S => Sync - A sychronization failure occurred, mirror out-of-sync
* R => Read - A read failure occurred, mirror data unaffected
This allows userspace to properly reconfigure the mirror set.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This patch gives the ability to respond-to/record device failures
that happen during read operations. It also adds the ability to
read from mirror devices that are not the primary if they are
in-sync.
There are essentially two read paths in mirroring; the direct path
and the queued path. When a read request is mapped, if the region
is 'in-sync' the direct path is taken; otherwise the queued path
is taken.
If the direct path is taken, we must record bio information so that
if the read fails we can retry it. We then discover the status of
a direct read through mirror_end_io. If the read has failed, we will
mark the device from which the read was attempted as failed (so we
don't try to read from it again), restore the bio and try again.
If the queued path is taken, we discover the results of the read
from 'read_callback'. If the device failed, we will mark the device
as failed and attempt the read again if there is another device
where this region is known to be 'in-sync'.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This patch adds the ability to requeue write I/O to
core device-mapper when there is a log device failure.
If a write to the log produces and error, the pending writes are
put on the "failures" list. Since the log is marked as failed,
they will stay on the failures list until a suspend happens.
Suspends come in two phases, presuspend and postsuspend. We must
make sure that all the writes on the failures list are requeued
in the presuspend phase (a requirement of dm core). This means
that recovery must be complete (because writes may be delayed
behind it) and the failures list must be requeued before we
return from presuspend.
The mechanisms to ensure recovery is complete (or stopped) was
already in place, but needed to be moved from postsuspend to
presuspend. We rely on 'flush_workqueue' to ensure that the
mirror thread is complete and therefore, has requeued all writes
in the failures list.
Because we are using flush_workqueue, we must ensure that no
additional 'queue_work' calls will produce additional I/O
that we need to requeue (because once we return from
presuspend, we are unable to do anything about it). 'queue_work'
is called in response to the following functions:
- complete_resync_work = NA, recovery is stopped
- rh_dec (mirror_end_io) = NA, only calls 'queue_work' if it
is ready to recover the region
(recovery is stopped) or it needs
to clear the region in the log*
**this doesn't get called while
suspending**
- rh_recovery_end = NA, recovery is stopped
- rh_recovery_start = NA, recovery is stopped
- write_callback = 1) Writes w/o failures simply call
bio_endio -> mirror_end_io -> rh_dec
(see rh_dec above)
2) Writes with failures are put on
the failures list and queue_work is
called**
** write_callbacks don't happen
during suspend **
- do_failures = NA, 'queue_work' not called if suspending
- add_mirror (initialization) = NA, only done on mirror creation
- queue_bio = NA, 1) delayed I/O scheduled before flush_workqueue
is called. 2) No more I/Os are being issued.
3) Re-attempted READs can still be handled.
(Write completions are handled through rh_dec/
write_callback - mention above - and do not
use queue_bio.)
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This patch adds the calls to 'fail_mirror' if an error occurs during
mirror recovery (aka resynchronization). 'fail_mirror' is responsible
for recording the type of error by mirror device and ensuring an event
gets raised for the purpose of notifying userspace.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This patch gives mirror the ability to handle device failures
during normal write operations.
The 'write_callback' function is called when a write completes.
If all the writes failed or succeeded, we report failure or
success respectively. If some of the writes failed, we call
fail_mirror; which increments the error count for the device, notes
the type of error encountered (DM_RAID1_WRITE_ERROR), and
selects a new primary (if necessary). Note that the primary
device can never change while the mirror is not in-sync (IOW,
while recovery is happening.) This means that the scenario
where a failed write changes the primary and gives
recovery_complete a chance to misread the primary never happens.
The fact that the primary can change has necessitated the change
to the default_mirror field. We need to protect against reading
garbage while the primary changes. We then add the bio to a new
list in the mirror set, 'failures'. For every bio in the 'failures'
list, we call a new function, '__bio_mark_nosync', where we mark
the region 'not-in-sync' in the log and properly set the region
state as, RH_NOSYNC. Userspace must also be notified of the
failure. This is done by 'raising an event' (dm_table_event()).
If fail_mirror is called in process context the event can be raised
right away. If in interrupt context, the event is deferred to the
kmirrord thread - which raises the event if 'event_waiting' is set.
Backwards compatibility is maintained by ignoring errors if
the DM_FEATURES_HANDLE_ERRORS flag is not present.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Provided sector_t is 64 bits, reduce the in-memory footprint of the
snapshot exception table by the simple method of using unused bits of
the chunk number to combine consecutive entries.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This patch adds additional information to the status line. It is added at the
end of the returned text so it will not interfere with existing
implementations using this data. The addition of this information will allow
for a common return interface to match that returned with the dm-raid1.c
status line (with Jonathan Brassow's patches).
Here is a sample of what is returned with a mirror "status" call:
isw_eeaaabgfg_mirror: 0 488390920 mirror 2 8:16 8:32 3727/3727 1 AA 1 core
Here's what's returned with this patch for a stripe "status" call:
isw_dheeijjdej_stripe: 0 976783872 striped 2 8:16 8:32 1 AA
Signed-off-by: Brian Wood <brian.j.wood@intel.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>