If put_char() routine of a hvc console backend returns 0, then the
hvc console starts looping in the following scenarios:
1. hvc_console_print()
If put_char() returns 0 then the while loop may loop forever.
I have added the missing check for 0 to throw away console messages.
2. khvcd may loop:
The thread calls hvc_poll() --> hvc_push()... if there are still
buffered data then the HVC_POLL_WRITE bit is set and causes the
khvcd thread to loop (if yield() returns immediately).
However, instead of looping, the khvcd thread could sleep for
MIN_TIMEOUT (doing the same as for get_chars()).
The MIN_TIMEOUT is set if hvc_push() was not able to write
data to the backend. If data has been written, the timeout is
set to 0 to immediately re-schedule hvc_poll().
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> (virtio_console)
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
After a tty hangup() or close() operation, processes might not reset the
termio settings to a sane state. In order to reset the termios to its
default settings the tty driver flag TTY_DRIVER_RESET_TERMIOS has been added.
TTY driver flag description from include/linux/tty_driver.h:
TTY_DRIVER_RESET_TERMIOS --- requests the tty layer to reset the
termios setting when the last process has closed the device.
Used for PTY's, in particular.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I have added a hangup notifier that can be used by hvc console
backends to handle a tty hangup. The default irq hangup notifier
calls the notifier_del_irq() for compatibility.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
sm501_devdata->irq is unsigned, while platform_get_irq() returns a
signed int.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Use the newly introduced pci_ioremap_bar() function in drivers/mfd.
pci_ioremap_bar() just takes a pci device and a bar number, with the goal
of making it really hard to get wrong, while also having a central place
to stick sanity checks.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This makes the contents of the cache clearer and fixes incorrect
initialisation of the cache for partially volatile registers.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
October 10th linux-next build (powerpc allyesconfig) failed like this:
drivers/mfd/wm8350-core.c:1131: error: __ksymtab_wm8350_create_cache causes a section type conflict
Caused by commit 89b4012bef ("mfd: Core
support for the WM8350 AudioPlus PMIC"). wm8350_create_cache is not used
elsewhere, so remove the EXPORT_SYMBOL.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This adds basic support for the GPIOs in the twl4030 power management
chip. That includes two open drain LED drivers, and the use of GPIO-0
(and GPIO-1) as MMC/SD card detect switches which can control whether
the VMMC1 (and VMMC2) regulators are active.
This version of the code has a debounce call that will probably be
replaced before long, when a more generic interface exists.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This adds a driver for the RTC inside the TWL4030 multi-function device.
It's a fairly basic RTC, with a wake-capable alarm.
Note that many of the pre-release Overo boards now in circulation can't
effectively use this RTC, because of a wiring error that puts its TWL
chip into "secure" mode. (As in "secure yourself against tampering".)
This isn't an issue on other OMAP3 boards now supported in mainline,
such as Beagle and Labrador.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
- Move it into a separate file; clean and streamline it
- Restructure the init code for reuse during secondary dispatch
- Support both levels (primary, secondary) of IRQ dispatch
- Use a workqueue for irq mask/unmask and trigger configuration
Code for two subchips currently share that secondary handler code.
One is the power subchip; its IRQs are now handled by this core,
courtesy of this patch. The other is the GPIO module, which will
be supported through a later patch.
There are also minor changes to the header file, mostly related
to GPIO support; nothing yet in mainline cares about those. A
few references to OMAP-specific symbols are disabled; when they
can all be removed, the TWL4030 support ceases being OMAP-specific.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This patch fixes the section mismatch warning from
sa1111.o at buildtime.
CC arch/arm/common/sa1111.o
LD arch/arm/common/built-in.o
LD vmlinux.o
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x87f4): Section mismatch in reference from the function sa1111_probe() to the function .devinit.text:sa1110_mb_enable()
The function sa1111_probe() references
the function __devinit sa1110_mb_enable().
This is often because sa1111_probe lacks a __devinit
annotation or the annotation of sa1110_mb_enable is wrong.
Signed-off-by: Kristoffer Ericson <kristoffer.ericson@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch tidies local_init() in preparation for request-based dm.
No functional change.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This patch removes the DM_WQ_FLUSH_ALL state that is unnecessary.
The dm_queue_flush(md, DM_WQ_FLUSH_ALL, NULL) in dm_suspend()
is never invoked because:
- 'goto flush_and_out' is the same as 'goto out' because
the 'goto flush_and_out' is called only when '!noflush'
- If r is non-zero, then the code above will invoke 'goto out'
and skip this code.
No functional change.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Separate the region hash code from raid1 so it can be shared by forthcoming
targets. Use BUG_ON() for failed async dm_io() calls.
Signed-off-by: Heinz Mauelshagen <hjm@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
When a bio gets split, mark its fragments with the BIO_CLONED flag.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Remove waitqueue no longer needed with the async crypto interface.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
When writing io, dm-crypt has to allocate a new cloned bio
and encrypt the data into newly-allocated pages attached to this bio.
In rare cases, because of hw restrictions (e.g. physical segment limit)
or memory pressure, sometimes more than one cloned bio has to be used,
each processing a different fragment of the original.
Currently there is one waitqueue which waits for one fragment to finish
and continues processing the next fragment.
But when using asynchronous crypto this doesn't work, because several
fragments may be processed asynchronously or in parallel and there is
only one crypt context that cannot be shared between the bio fragments.
The result may be corruption of the data contained in the encrypted bio.
The patch fixes this by allocating new dm_crypt_io structs (with new
crypto contexts) and running them independently.
The fragments contains a pointer to the base dm_crypt_io struct to
handle reference counting, so the base one is properly deallocated
after all the fragments are finished.
In a low memory situation, this only uses one additional object from the
mempool. If the mempool is empty, the next allocation simple waits for
previous fragments to complete.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Prepare local sector variable (offset) for later patch.
Do not update io->sector for still-running I/O.
No functional change.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Change #include "dm.h" to #include <linux/device-mapper.h> in all targets.
Targets should not need direct access to internal DM structures.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Move array_too_big to include/linux/device-mapper.h because it is
used by targets.
Remove the test from dm-raid1 as the number of mirror legs is limited
such that it can never fail. (Even for stripes it seems rather
unlikely.)
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
We must zero the next chunk on disk *before* writing out the current chunk, not
after. Otherwise if the machine crashes at the wrong time, the "end of metadata"
marker may be missing.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: stable@kernel.org
Use a separate buffer for writing zeroes to the on-disk snapshot
exception store, make the updating of ps->current_area explicit and
refactor the code in preparation for the fix in the next patch.
No functional change.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org
The last_percent field is unused - remove it.
(It dates from when events were triggered as each X% filled up.)
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Fix a race condition with primary_pe ref_count handling.
put_pending_exception runs under dm_snapshot->lock, it does atomic_dec_and_test
on primary_pe->ref_count, and later does atomic_read primary_pe->ref_count.
__origin_write does atomic_dec_and_test on primary_pe->ref_count without holding
dm_snapshot->lock.
This opens the following race condition:
Assume two CPUs, CPU1 is executing put_pending_exception (and holding
dm_snapshot->lock). CPU2 is executing __origin_write in parallel.
primary_pe->ref_count == 2.
CPU1:
if (primary_pe && atomic_dec_and_test(&primary_pe->ref_count))
origin_bios = bio_list_get(&primary_pe->origin_bios);
... decrements primary_pe->ref_count to 1. Doesn't load origin_bios
CPU2:
if (first && atomic_dec_and_test(&primary_pe->ref_count)) {
flush_bios(bio_list_get(&primary_pe->origin_bios));
free_pending_exception(primary_pe);
/* If we got here, pe_queue is necessarily empty. */
return r;
}
... decrements primary_pe->ref_count to 0, submits pending bios, frees
primary_pe.
CPU1:
if (!primary_pe || primary_pe != pe)
free_pending_exception(pe);
... this has no effect.
if (primary_pe && !atomic_read(&primary_pe->ref_count))
free_pending_exception(primary_pe);
... sees ref_count == 0 (written by CPU 2), does double free !!
This bug can happen only if someone is simultaneously writing to both the
origin and the snapshot.
If someone is writing only to the origin, __origin_write will submit kcopyd
request after it decrements primary_pe->ref_count (so it can't happen that the
finished copy races with primary_pe->ref_count decrementation).
If someone is writing only to the snapshot, __origin_write isn't invoked at all
and the race can't happen.
The race happens when someone writes to the snapshot --- this creates
pending_exception with primary_pe == NULL and starts copying. Then, someone
writes to the same chunk in the snapshot, and __origin_write races with
termination of already submitted request in pending_complete (that calls
put_pending_exception).
This race may be reason for bugs:
http://bugzilla.kernel.org/show_bug.cgi?id=11636https://bugzilla.redhat.com/show_bug.cgi?id=465825
The patch fixes the code to make sure that:
1. If atomic_dec_and_test(&primary_pe->ref_count) returns false, the process
must no longer dereference primary_pe (because someone else may free it under
us).
2. If atomic_dec_and_test(&primary_pe->ref_count) returns true, the process
is responsible for freeing primary_pe.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: stable@kernel.org
Write throughput to LVM snapshot origin volume is an order
of magnitude slower than those to LV without snapshots or
snapshot target volumes, especially in the case of sequential
writes with O_SYNC on.
The following patch originally written by Kevin Jamieson and
Jan Blunck and slightly modified for the current RCs by myself
tries to improve the performance by modifying the behaviour
of kcopyd, so that it pushes back an I/O job to the head of
the job queue instead of the tail as process_jobs() currently
does when it has to wait for free pages. This way, write
requests aren't shuffled to cause extra seeks.
I tested the patch against 2.6.27-rc5 and got the following results.
The test is a dd command writing to snapshot origin followed by fsync
to the file just created/updated. A couple of filesystem benchmarks
gave me similar results in case of sequential writes, while random
writes didn't suffer much.
dd if=/dev/zero of=<somewhere on snapshot origin> bs=4096 count=...
[conv=notrunc when updating]
1) linux 2.6.27-rc5 without the patch, write to snapshot origin,
average throughput (MB/s)
10M 100M 1000M
create,dd 511.46 610.72 11.81
create,dd+fsync 7.10 6.77 8.13
update,dd 431.63 917.41 12.75
update,dd+fsync 7.79 7.43 8.12
compared with write throughput to LV without any snapshots,
all dd+fsync and 1000 MiB writes perform very poorly.
10M 100M 1000M
create,dd 555.03 608.98 123.29
create,dd+fsync 114.27 72.78 76.65
update,dd 152.34 1267.27 124.04
update,dd+fsync 130.56 77.81 77.84
2) linux 2.6.27-rc5 with the patch, write to snapshot origin,
average throughput (MB/s)
10M 100M 1000M
create,dd 537.06 589.44 46.21
create,dd+fsync 31.63 29.19 29.23
update,dd 487.59 897.65 37.76
update,dd+fsync 34.12 30.07 26.85
Although still not on par with plain LV performance -
cannot be avoided because it's copy on write anyway -
this simple patch successfully improves throughtput
of dd+fsync while not affecting the rest.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Kazuo Ito <ito.kazuo@oss.ntt.co.jp>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: stable@kernel.org
The uncached area starts at e000.0000 and spans 1GB. Also add an IOADDR
macro to determine the bypass region to access the IO space.
Signed-off-by: Chris Zankel <chris@zankel.net>
current rcu_barrier_bh() is like this:
void rcu_barrier_bh(void)
{
BUG_ON(in_interrupt());
/* Take cpucontrol mutex to protect against CPU hotplug */
mutex_lock(&rcu_barrier_mutex);
init_completion(&rcu_barrier_completion);
atomic_set(&rcu_barrier_cpu_count, 0);
/*
* The queueing of callbacks in all CPUs must be atomic with
* respect to RCU, otherwise one CPU may queue a callback,
* wait for a grace period, decrement barrier count and call
* complete(), while other CPUs have not yet queued anything.
* So, we need to make sure that grace periods cannot complete
* until all the callbacks are queued.
*/
rcu_read_lock();
on_each_cpu(rcu_barrier_func, (void *)RCU_BARRIER_BH, 1);
rcu_read_unlock();
wait_for_completion(&rcu_barrier_completion);
mutex_unlock(&rcu_barrier_mutex);
}
The inconsistency of the code and the comments show a bug here.
rcu_read_lock() cannot make sure that "grace periods for RCU_BH
cannot complete until all the callbacks are queued".
it only make sure that race periods for RCU cannot complete
until all the callbacks are queued.
so we must use rcu_read_lock_bh() for rcu_barrier_bh().
like this:
void rcu_barrier_bh(void)
{
......
rcu_read_lock_bh();
on_each_cpu(rcu_barrier_func, (void *)RCU_BARRIER_BH, 1);
rcu_read_unlock_bh();
......
}
and also rcu_barrier() rcu_barrier_sched() are implemented like this.
it will bring a lot of duplicate code. My patch uses another way to
fix this bug, please see the comment of my patch.
Thank Paul E. McKenney for he rewrote the comment.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If the member 'name' of the irq_desc structure happens to point to a
character string that is resident within a kernel module, problems ensue
if that module is rmmod'd (at which time dynamic_irq_cleanup() is called)
and then later show_interrupts() is called by someone.
It is also not a good thing if the character string resided in kmalloc'd
space that has been kfree'd (after having called dynamic_irq_cleanup()).
dynamic_irq_cleanup() fails to NULL the 'name' member and
show_interrupts() references it on a few architectures (like h8300, sh and
x86).
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix off-by-one in for_each_irq_desc_reverse().
Impact is near zero in practice, because nothing substantial wants to
iterate down to IRQ#0 - but fix it nevertheless.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Impact: fix boot hang on a G5
In set_irq_type() we want to pass the type rather than the current
interrupt state.
Signed-off-by: Chris Friesen <cfriesen@nortel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The RTC is sitting on the I2C2 bus at address 0x68. RTC interrupt signal
is connected to the IPIC's EXT2 interrupt line, the line is shared with
Vitesse 8201 Ethernet PHY.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
MCU is an external Freescale MC9S08QG8 microcontroller, mainly used to
provide soft power-off function, but also exports two GPIOs (wired to
the LEDs and also available from the external headers).
Added the MCU on mpc8349emitx, mpc837xrdb and mpc8315erdb boards.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Change the top-level #address-cells and #size-cells to <2> so the
mpc8572ds.dts is easier to deal with both a true 32-bit physical
or 36-bit physical address space.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The patch allows to specify that an SPI device needs an active high chip
select.
Signed-off-by: Wolfgang Ocker <weo@reccoware.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
We don't want to encourage the bogus device_type usage.
The device type isn't used in the code, so we can simply remove it from
the documentation and dts files.
Boards should specify proper compatible entries instead.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Similar to commit 618b26d528, also remove
automatic probing for this i2c controller. Might need updates to dts files
using it.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Jochen Friedrich <jochen@scram.de>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
These functions should have been static, and inspection shows they
are no longer used. (We used to parse mem= but we now defer that
to early_param).
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit 9b09c6d909 ("powerpc: Change the
default link address for pSeries zImage kernels") changed the
real-base value in the CHRP note added by addnote to the zImage from
12MB to 32MB. It turns out that this causes unnecessary extra reboots
on old 32-bit CHRP machines. This therefore adds a -r flag to addnote
to allow us to specify what real-base value it should put in the CHRP
note, and adjusts the wrapper script to pass -r c00000 to addnote when
making a zImage for a CHRP machine. Also, CHRP machines ignore the
RPA note, so we don't need to arrange for it to be the same as the
kernel's.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
numa_enforce_memory_limit tried to be smart and only call lmb_end_of_DRAM
when a memory limit was set via mem= on the command line. However,
the early boot code will also limit memory added to the lmb system
when iommu=off is specified. When this happens, the page allocator
is given pages not in the linear mapping and this results in a fatal
data reference to the unmapped page.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We used to assume that even numbered threads were the primary
threads, ie those that would be listed and started as a cpu from
open firmware. Replace a left over is even (% 2) check with a check
for it being a primary thread and update the comments.
Tested with a debug print on pseries, identical code found for cell.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
64 bit powerpc requires the kexec user space tools avoid overwriting
the static kernel image and translation hash table when choosing
where to put memory image data because it copies the data into place
using the kernels virtual memory system. Kexec userspace determines
these and other areas blocked by reading properties the kernel adds,
but does not filter these properties when creating the device tree
for the next kernel.
When the second kernel tries to add its values for these properties,
the export via /proc/device-tree is hidden by the pre-existing but
stale values from the flat tree. Kexec userspace reads the old
property, allocates the new kernel at the old kernel's end, and
gets rejected by the overlap check.
Search and remove these stale properties before adding the new values.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There are no users of PPC_MERGE in tree so we can get rid of it.
It was a hold over from the arch/ppc days.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>