168814 Commits

Author SHA1 Message Date
Anton Vorontsov
e0ad2cd8ff ucc_geth: Fix NULL pointer dereference in uec_get_ethtool_stats()
In commit 3e73fc9a12 ("ucc_geth: Fix IO memory (un)mapping code") I fixed
the ug_regs IO memory leak by properly freeing the allocated memory. But
the ethtool_stats() callback doesn't check for ug_regs being NULL, and
that causes the following oops if 'ethtool -S' is executed on a closed
eth device:

  Unable to handle kernel paging request for data at address 0x00000180
  Faulting instruction address: 0xc0208228
  Oops: Kernel access of bad area, sig: 11 [#1]
  ...
  NIP [c0208228] uec_get_ethtool_stats+0x38/0x140
  LR [c02559a0] ethtool_get_stats+0xf8/0x23c
  Call Trace:
  [ef87bcd0] [c025597c] ethtool_get_stats+0xd4/0x23c (unreliable)
  [ef87bd00] [c025706c] dev_ethtool+0xfe8/0x11bc
  [ef87be00] [c0252b5c] dev_ioctl+0x454/0x6a8
  ...
  ---[ end trace 77fff1162a9586b0 ]---
  Segmentation fault

This patch fixes the issue.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-30 21:51:29 -07:00
Paul Mundt
2f6dafc5fc sh: unwinder: Fix up uninitialized variable warnings on sh2a build.
A couple of these popped up on the sh2a build, causing build failures.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2009-08-31 13:47:06 +09:00
David S. Miller
b9caaabb99 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-next-2.6
2009-08-30 21:30:39 -07:00
Ryusuke Konishi
b1f1b8ce0a nilfs2: fix preempt count underflow in nilfs_btnode_prepare_change_key
This will fix the following preempt count underflow reported from
users with the title "[NILFS users] segctord problem" (Message-ID:
<949415.6494.qm@web58808.mail.re1.yahoo.com> and Message-ID:
<debc30fc0908270825v747c1734xa59126623cfd5b05@mail.gmail.com>):

 WARNING: at kernel/sched.c:4890 sub_preempt_count+0x95/0xa0()
 Hardware name: HP Compaq 6530b (KR980UT#ABC)
 Modules linked in: bridge stp llc bnep rfcomm l2cap xfs exportfs nilfs2 cowloop loop vboxnetadp vboxnetflt vboxdrv btusb bluetooth uvcvideo videodev v4l1_compat v4l2_compat_ioctl32 arc4 snd_hda_codec_analog ecb iwlagn iwlcore rfkill lib80211 mac80211 snd_hda_intel snd_hda_codec ehci_hcd uhci_hcd usbcore snd_hwdep snd_pcm tg3 cfg80211 psmouse snd_timer joydev libphy ohci1394 snd_page_alloc hp_accel lis3lv02d ieee1394 led_class i915 drm i2c_algo_bit video backlight output i2c_core dm_crypt dm_mod
 Pid: 4197, comm: segctord Not tainted 2.6.30-gentoo-r4-64 #7
 Call Trace:
  [<ffffffff8023fa05>] ? sub_preempt_count+0x95/0xa0
  [<ffffffff802470f8>] warn_slowpath_common+0x78/0xd0
  [<ffffffff8024715f>] warn_slowpath_null+0xf/0x20
  [<ffffffff8023fa05>] sub_preempt_count+0x95/0xa0
  [<ffffffffa04ce4db>] nilfs_btnode_prepare_change_key+0x11b/0x190 [nilfs2]
  [<ffffffffa04d01ad>] nilfs_btree_assign_p+0x19d/0x1e0 [nilfs2]
  [<ffffffffa04d10ad>] nilfs_btree_assign+0xbd/0x130 [nilfs2]
  [<ffffffffa04cead7>] nilfs_bmap_assign+0x47/0x70 [nilfs2]
  [<ffffffffa04d9bc6>] nilfs_segctor_do_construct+0x956/0x20f0 [nilfs2]
  [<ffffffff805ac8e2>] ? _spin_unlock_irqrestore+0x12/0x40
  [<ffffffff803c06e0>] ? __up_write+0xe0/0x150
  [<ffffffff80262959>] ? up_write+0x9/0x10
  [<ffffffffa04ce9f3>] ? nilfs_bmap_test_and_clear_dirty+0x43/0x60 [nilfs2]
  [<ffffffffa04cd627>] ? nilfs_mdt_fetch_dirty+0x27/0x60 [nilfs2]
  [<ffffffffa04db5fc>] nilfs_segctor_construct+0x8c/0xd0 [nilfs2]
  [<ffffffffa04dc3dc>] nilfs_segctor_thread+0x15c/0x3a0 [nilfs2]
  [<ffffffffa04dbe20>] ? nilfs_construction_timeout+0x0/0x10 [nilfs2]
  [<ffffffff80252633>] ? add_timer+0x13/0x20
  [<ffffffff802370da>] ? __wake_up_common+0x5a/0x90
  [<ffffffff8025e960>] ? autoremove_wake_function+0x0/0x40
  [<ffffffffa04dc280>] ? nilfs_segctor_thread+0x0/0x3a0 [nilfs2]
  [<ffffffffa04dc280>] ? nilfs_segctor_thread+0x0/0x3a0 [nilfs2]
  [<ffffffff8025e556>] kthread+0x56/0x90
  [<ffffffff8020cdea>] child_rip+0xa/0x20
  [<ffffffff8025e500>] ? kthread+0x0/0x90
  [<ffffffff8020cde0>] ? child_rip+0x0/0x20

This problem was caused by a missing radix_tree_preload() call in the
retry path of the nilfs_btnode_prepare_change_key() function.

Reported-by: Eric A <eric225125@yahoo.com>
Reported-by: Jerome Poulin <jeromepoulin@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Jerome Poulin <jeromepoulin@gmail.com>
Cc: stable@kernel.org
2009-08-31 12:03:06 +09:00
Dave Airlie
3420e74262 drm: fix two issues with fb consolidation.
Set accel to none, we really don't want anyone thinking
fb is an accel interface.
Pass pitch not depth to function for intel.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-08-31 10:33:29 +10:00
Alexey Dobriyan
3b51096f95 drm: use proc_create_data()
airlied: fixup race against drm info by filling out
tmp before adding it to proc.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-08-31 09:37:22 +10:00
Zhao Yakui
882f021951 drm/kms: Parse the detailed time info in CEA-EDID
Sometimes we can obtain the EDID with multiple blocks from the display device.
For example: HDMI monitor.
When the CEA-EDID block is detected, we should also parse the detailed timing
info from it. Otherwise we will lose some modes for the display device.

The first step is to check whether the CEA EDID block is present. If it is,
the parser skips the CEA data block and parses the detailed timing info.

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-08-31 09:22:43 +10:00
Dave Airlie
785b93ef8c drm/kms: move driver specific fb common code to helper functions (v2)
Initially I always meant this code to be shared, but things
ran away from me before I got to it.

This refactors the i915 and radeon kms fbdev interaction layers
out into generic helpers + driver specific pieces.

It moves all the panic/sysrq enhancements to the core file,
and stores a linked list of kernel fbs. This could possibly be
improved to only store the fb which has fbcon on it for panics
etc.

radeon retains some driver-specific code used for a big endian
workaround.

changes:
fix oops in v1
fix freeing path for crtc_info

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-08-31 09:09:31 +10:00
Ben Hutchings
7dc482dfee drm/r128: Add test for initialisation to all ioctls that require it
Almost all r128's private ioctls require that the CCE state has
already been initialised.  However, most do not test that this has
been done, and will proceed to dereference a null pointer.  This may
result in a security vulnerability, since some ioctls are
unprivileged.

This adds a macro for the common initialisation test and changes all
ioctl implementations that require prior initialisation to use that
macro.

Also, r128_do_init_cce() does not test that the CCE state has not
been initialised already.  Repeated initialisation may lead to a crash
or resource leak.  This adds that test.
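
The two checks described above can be sketched in user-space C as follows.
This is only an illustration under stated assumptions: the names
demo_dev, DEMO_INIT_TEST_WITH_RETURN, demo_ioctl and demo_init are
hypothetical and are not taken from the r128 driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a device whose private state is set up lazily. */
struct demo_dev {
    void *dev_private;
};

/* Common guard: bail out of the calling function if init has not run yet. */
#define DEMO_INIT_TEST_WITH_RETURN(dev)          \
    do {                                         \
        if ((dev)->dev_private == NULL)          \
            return -EINVAL;                      \
    } while (0)

/* An ioctl-like entry point that requires prior initialisation. */
static int demo_ioctl(struct demo_dev *dev)
{
    DEMO_INIT_TEST_WITH_RETURN(dev);
    return 0;
}

/* Initialisation that refuses to run twice. */
static int demo_init(struct demo_dev *dev, void *state)
{
    if (dev->dev_private != NULL)
        return -EINVAL;
    dev->dev_private = state;
    return 0;
}
```

Wrapping the guard in do { ... } while (0) lets it be used like a single
statement at the top of every handler.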

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-08-31 09:09:30 +10:00
Ben Hutchings
70967ab9c0 radeon: Use request_firmware()
Loosely based on a patch by
Jaswinder Singh Rajput <jaswinderlinux@gmail.com>.

KMS support by Dave Airlie <airlied@redhat.com>.

For Radeon 100- to 500-series, firmware blobs look like:
    struct {
        __be32 datah;
        __be32 datal;
    } cp_ucode[256];

For Radeon 600-series, there are two separate firmware blobs:
    __be32 me_ucode[PM4_UCODE_SIZE * 3];
    __be32 pfp_ucode[PFP_UCODE_SIZE];

For Radeon 700-series, likewise:
    __be32 me_ucode[R700_PM4_UCODE_SIZE];
    __be32 pfp_ucode[R700_PFP_UCODE_SIZE];

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-08-31 09:09:30 +10:00
Roel Kluin
1ae70072f0 drm: dereference of tmp in drm_proc_create_files()
tmp allocation may fail, prevent a dereference.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-08-31 09:09:29 +10:00
Maarten Maathuis
ff846ab7f7 drm/crtc_helper: NULL encoder->crtc when switching encoders
- Previously the old encoder would be called during modeset and without a connector bad things happened.

Signed-off-by: Maarten Maathuis <madman2003@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-08-31 09:09:29 +10:00
Maarten Maathuis
f380ef8691 drm/crtc_helper: place drm_helper_encoder_in_use() in the header file
- The symbol was already exported.

Signed-off-by: Maarten Maathuis <madman2003@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-08-31 09:09:29 +10:00
Maarten Maathuis
e67aae79f9 drm/crtc_helper: replace modeset fail path with something simpler
- The previous system was not very transparent, nor flexible.
- This is needed to be able to fix a few bugs in the mechanism.

Signed-off-by: Maarten Maathuis <madman2003@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-08-31 09:09:29 +10:00
Dave Airlie
689d7c2a11 drm/radeon: cleanup mkregtable.c
This cleans up the code in mkregtable.c to be more kernel style.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-08-31 09:09:28 +10:00
Randy Dunlap
e500011ffa timers: Drop a function prototype
Drop prototype for non-existent next_timer_interrupt() function.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: akpm <akpm@linux-foundation.org>
LKML-Reference: <4A9ADEC0.70306@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-30 22:26:34 +02:00
Julia Lawall
903b9124ea Input: w90p910_keypad - move a dereference below a NULL test
We should first check whether platform data is NULL or not, before
dereferencing it to get the keymap.

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-08-30 11:55:47 -07:00
Michal Schmidt
23386d63bb x86: Detect stack protector for i386 builds on x86_64
Stack protector support was not detected when building with
ARCH=i386 on x86_64 systems:

  arch/x86/Makefile:80: stack protector enabled but no compiler support

The "-m32" argument needs to be passed to the detection script.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Arjan van de Ven <arjan@infradead.org>
LKML-Reference: <20090829182718.10f566b1@leela>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-30 20:39:48 +02:00
Troy Heber
8211a7b585 pci/dmar: correct off-by-one error in dmar_fault()
DMAR faults are recorded into a ring of "fault recording registers".
fault_index is a 0-based index into the ring. The code allows the
0-based fault_index to be equal to the total number of fault registers
available from the cap_num_fault_regs() macro, which causes access
beyond the last available register.
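
A minimal sketch of the kind of bounds handling this implies, assuming the
index simply wraps around the ring. ring_next() is a hypothetical helper
written for illustration; it is not the code in dmar_fault().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Advance a 0-based index into a ring of nregs entries.
 * Valid indices are 0 .. nregs - 1, so the wrap test must be '>='.
 * Using '>' would let index == nregs through, one past the last entry.
 */
static size_t ring_next(size_t index, size_t nregs)
{
    assert(nregs > 0);
    index++;
    if (index >= nregs)
        index = 0;
    return index;
}
```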

Signed-off-by: Troy Heber <troy.heber@hp.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-30 19:05:04 +01:00
Chris Wright
6faf17f6f1 PCI SR-IOV: correct broken resource alignment calculations
An SR-IOV capable device includes an SR-IOV PCIe capability which
describes the Virtual Function (VF) BAR requirements.  A typical SR-IOV
device can support multiple VFs whose BARs must be in a contiguous region,
effectively an array of VF BARs.  The BAR reports the size requirement
for a single VF.  We calculate the full range needed by simply multiplying
the VF BAR size with the number of possible VFs and create a resource
spanning the full range.

This all seems sane enough except it artificially inflates the alignment
requirement for the VF BAR.  The VF BAR need only be aligned to the size
of a single BAR not the contiguous range of VF BARs.  This can cause us
to fail to allocate resources for the BAR despite the fact that we
actually have enough space.

This patch adds a thin PCI specific layer over the generic
resource_alignment() function which is aware of the special nature of
VF BARs and does sorting and allocation based on the smaller alignment
requirement.

I recognize that while resource_alignment is generic, it's basically a
PCI helper.  An alternative to this patch is to add PCI VF BAR specific
information to struct resource.  I opted for the extra layer rather than
adding such PCI specific information to struct resource.  This does
have the slight downside that we don't cache the BAR size and re-read
for each alignment query (happens a small handful of times during boot
for each VF BAR).
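
The size/alignment distinction can be illustrated with a small user-space
sketch. The struct and function names below are hypothetical, and
align_up() assumes power-of-two alignments; none of this is the kernel's
resource code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of the range covering all VF BARs. */
struct vf_bar_range {
    uint64_t size;   /* space needed for every VF's BAR, back to back */
    uint64_t align;  /* alignment actually required for the range */
};

/* The range spans total_vfs BARs but only needs single-BAR alignment. */
static struct vf_bar_range vf_bar_range(uint64_t vf_bar_size,
                                        unsigned int total_vfs)
{
    struct vf_bar_range r;

    r.size = vf_bar_size * total_vfs;
    r.align = vf_bar_size;
    return r;
}

/* Round addr up to a power-of-two alignment. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}
```

With 8 VFs and a 16 KiB BAR, a free window starting at 0x104000 is usable
as is under single-BAR alignment, while aligning to the full range size
would push the start to 0x120000 and could make the allocation fail.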

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Yu Zhao <yu.zhao@intel.com>
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-08-30 08:37:25 -07:00
Aaro Koskinen
acdfcd04d9 SLUB: fix ARCH_KMALLOC_MINALIGN cases 64 and 256
If the minalign is 64 bytes, then the 96 byte cache should not be created
because it would conflict with the 128 byte cache.

If the minalign is 256 bytes, patching the size_index table should not
result in a buffer overrun.

The calculation "(i - 1) / 8" used to access size_index[] is moved to
a separate function as suggested by Christoph Lameter.
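
A sketch of what such a helper can look like, written as stand-alone C.
The helper names and the 24-entry table size (one slot per 8-byte step up
to 192 bytes) are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the table has one slot per 8-byte step, covering 8..192. */
#define SIZE_INDEX_ENTRIES 24

/* Map an allocation size in bytes to its slot in the table. */
static size_t size_index_slot(size_t bytes)
{
    assert(bytes >= 1);
    return (bytes - 1) / 8;
}

/* A size may only be patched into the table if its slot exists. */
static int size_index_slot_valid(size_t bytes)
{
    return bytes >= 1 && size_index_slot(bytes) < SIZE_INDEX_ENTRIES;
}
```

Keeping the arithmetic in one place makes it easier to bound every loop
that patches the table, which is the overrun the 256-byte case is about.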

Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-08-30 14:56:48 +03:00
Chen Liqin
125ec616f4 score: make init_thread_union align to THREAD_SIZE
2009-08-30 12:36:41 +08:00
Chen Liqin
798983b2d5 score: update files according to review comments.
2009-08-30 12:34:08 +08:00
Chen Liqin
cf52c46835 score: add old syscall support
2009-08-30 12:33:30 +08:00
Chen Liqin
324f40fbb0 score: add MEMORY_START and MEMORY_SIZE define, to make the code clear
2009-08-30 12:31:58 +08:00
Chen Liqin
ffa818b4b0 score: update inconsistent declare after .c was changed
2009-08-30 12:30:16 +08:00
Chen Liqin
d8aa899bb2 score: remove unused code, add include files in .c
2009-08-30 12:26:32 +08:00
Dmitry Torokhov
4a703a8fe5 ACPI: video - rename cdev to cooling_dev -- syntax only
The name cdev is normally used for either class devices or character
devices, so rename the member to avoid confusing the casual reader
of the code.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-08-29 23:03:16 -04:00
Dmitry Torokhov
4b4fe3b62e ACPI: video - fix potential crash when unloading
thermal_cooling_device_register() returns an error encoded in a pointer
when it fails, in which case we need to explicitly set device->cdev
to NULL so we don't try to unregister it when unloading.
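
The error-in-pointer convention can be sketched in user space as below.
err_ptr() and is_err() are simplified re-implementations written for this
illustration, and fake_register() and struct video_device are
hypothetical; this is not the ACPI video driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* True if ptr falls in the top MAX_ERRNO addresses, i.e. encodes an error. */
static int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct video_device {
    void *cooling_dev;
};

/* Hypothetical registration call that fails on request. */
static void *fake_register(int fail)
{
    static int token;

    return fail ? err_ptr(-ENOMEM) : (void *)&token;
}

/* Store NULL on failure so teardown can use a plain NULL check. */
static void attach_cooling(struct video_device *dev, int fail)
{
    void *cdev = fake_register(fail);

    dev->cooling_dev = is_err(cdev) ? NULL : cdev;
}
```

An error-encoded pointer is non-NULL, so a teardown path that only tests
for NULL would otherwise try to unregister a device that never existed.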

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-08-29 22:58:19 -04:00
Len Brown
eb0ca84986 ACPI: sleep: another HP DMI entry for init_set_sci_en_on_resume
DMI_MATCH(DMI_PRODUCT_NAME, "HP Pavilion dv3 Notebook PC")

http://bugzilla.kernel.org/show_bug.cgi?id=13745

Signed-off-by: Len Brown <len.brown@intel.com>
2009-08-29 22:39:06 -04:00
Chen Liqin
d27eadc761 Merge branch 'score' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic into for-linus
* 'score' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  score: clean up mm/init.c
  score: make irq.h definitions local
  score: cleanups: dead code, 0 as pointer, shadowed variables
  score: fix function prototypes
  score: add address space annotations
  score: add missing #includes
  score: move save arg5 and arg6 instruction in front of enable_irq
  score: add prototypes for wrapped syscalls
  score: remove init_mm
  score: add generic sys_call_table
  score: remove __{put,get}_user_unknown
  score: unset __ARCH_WANT_IPC_PARSE_VERSION
  score: update files according to review comments
  score: add maintainers for score architecture
  score: Add support for Sunplus S+core architecture
2009-08-30 10:26:23 +08:00
Dan Williams
07a3b417dc md/raid456: distribute raid processing over multiple cores
Now that the resources to handle stripe_head operations are allocated
percpu it is possible for raid5d to distribute stripe handling over
multiple cores.  This conversion also adds a call to cond_resched() in
the non-multicore case to prevent one core from getting monopolized for
raid operations.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:13:13 -07:00
Yuri Tikhonov
b774ef491b md/raid6: remove synchronous infrastructure
These routines have been replaced by their asynchronous counterparts.

Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:13:13 -07:00
Yuri Tikhonov
6c0069c0ae md/raid6: asynchronous handle_stripe6
1/ Use STRIPE_OP_BIOFILL to offload completion of read requests to
   raid_run_ops
2/ Implement a handler for sh->reconstruct_state similar to the raid5 case
   (adds handling of Q parity)
3/ Prevent handle_parity_checks6 from running concurrently with 'compute'
   operations
4/ Hook up raid_run_ops

Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:13:13 -07:00
Dan Williams
d82dfee0ad md/raid6: asynchronous handle_parity_check6
[ Based on an original patch by Yuri Tikhonov ]

Implement the state machine for handling the RAID-6 parities check and
repair functionality.  Note that the raid6 case does not need to check
for new failures, like raid5, as it will always writeback the correct
disks.  The raid5 case can be updated to check zero_sum_result to avoid
getting confused by new failures rather than retrying the entire check
operation.

Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:13:13 -07:00
Yuri Tikhonov
a9b39a741a md/raid6: asynchronous handle_stripe_dirtying6
In the synchronous implementation of stripe dirtying we processed a
degraded stripe with one call to handle_stripe_dirtying6().  I.e.
compute the missing blocks from the other drives, then copy in the new
data and reconstruct the parities.

In the asynchronous case we do not perform stripe operations directly.
Instead, operations are scheduled with flags to be later serviced by
raid_run_ops.  So, for the degraded case the final reconstruction step
can only be carried out after all blocks have been brought up to date by
being read or computed.  As in the raid5 case, schedule_reconstruction()
sets STRIPE_OP_RECONSTRUCT to request a parity generation pass and
through operation chaining can handle compute and reconstruct in a
single raid_run_ops pass.

[dan.j.williams@intel.com: fixup handle_stripe_dirtying6 gating]
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:13:12 -07:00
Yuri Tikhonov
5599becca4 md/raid6: asynchronous handle_stripe_fill6
Modify handle_stripe_fill6 to work asynchronously by introducing
fetch_block6 as the raid6 analog of fetch_block5 (schedule compute
operations for missing/out-of-sync disks).

[dan.j.williams@intel.com: compute D+Q in one pass]
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:13:12 -07:00
Yuri Tikhonov
c0f7bddbe6 md/raid5,6: common schedule_reconstruction for raid5/6
Extend schedule_reconstruction5 for reuse by the raid6 path.  Add
support for generating Q and BUG() if a request is made to perform
'prexor'.

Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:13:12 -07:00
Dan Williams
ac6b53b6e6 md/raid6: asynchronous raid6 operations
[ Based on an original patch by Yuri Tikhonov ]

The raid_run_ops routine uses the asynchronous offload api and
the stripe_operations member of a stripe_head to carry out xor+pq+copy
operations asynchronously, outside the lock.

The operations performed by RAID-6 are the same as in the RAID-5 case
except for no support of STRIPE_OP_PREXOR operations. All the others
are supported:
STRIPE_OP_BIOFILL
 - copy data into request buffers to satisfy a read request
STRIPE_OP_COMPUTE_BLK
 - generate missing blocks (1 or 2) in the cache from the other blocks
STRIPE_OP_BIODRAIN
 - copy data out of request buffers to satisfy a write request
STRIPE_OP_RECONSTRUCT
 - recalculate parity for new data that has entered the cache
STRIPE_OP_CHECK
 - verify that the parity is correct

The flow is the same as in the RAID-5 case, and reuses some routines, namely:
1/ ops_complete_postxor (renamed to ops_complete_reconstruct)
2/ ops_complete_compute (updated to set up to 2 targets uptodate)
3/ ops_run_check (renamed to ops_run_check_p for xor parity checks)

[neilb@suse.de: fixes to get it to pass mdadm regression suite]
Reviewed-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:13:12 -07:00
Dan Williams
4e7d2c0aef md/raid5: factor out mark_uptodate from ops_complete_compute5
ops_complete_compute5 can be reused in the raid6 path if it is updated to
generically handle a second target.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:13:11 -07:00
Dan Williams
f6dbf65161 iop-adma: P+Q self test
Even though the intent is to extend dmatest with P+Q tests there is
still value in having an always-on sanity check to prevent an
unintentionally broken driver from registering.

This depends on raid6_pq.ko for verification, the side effect being that
PQ capable channels will fail to register when raid6 is disabled.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:12:40 -07:00
Dan Williams
7bf649aee8 iop-adma: P+Q support for iop13xx adma engines
iop33x support is not included because that engine is a bit more awkward
to handle in that it can either be in xor mode or pq mode.  The
dmaengine/async_tx layers currently only comprehend static capabilities.

Note iop13xx does not support hardware PQ continuation so the driver
must handle the DMA_PREP_CONTINUE flag for operations across > 16
sources. From the comment for dma_maxpq:

/* When an engine does not support native continuation we need 3 extra
 * source slots to reuse P and Q with the following coefficients:
 * 1/ {00} * P : remove P from Q', but use it as a source for P'
 * 2/ {01} * Q : use Q to continue Q' calculation
 * 3/ {00} * Q : subtract Q from P' to cancel (2)
 */

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:12:39 -07:00
Dan Williams
72be12f0c3 iop-adma: fix lockdep false positive
lockdep correctly identifies a potential recursive locking case for
iop_chan->lock, but in the dependency submission case we expect that the same
class will be acquired for both the parent dependency and the child channel.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:12:39 -07:00
Dan Williams
507fbec4cf iop-adma: cleanup iop_adma_run_tx_complete_actions
Replace 'desc->async_tx.' with 'tx->'

[ Impact: pure cleanup ]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:12:39 -07:00
Dan Williams
cb3c82992f async_tx: raid6 recovery self test
Port drivers/md/raid6test/test.c to use the async raid6 recovery
routines.  This is meant as a unit test for raid6 acceleration drivers.  In
addition to the 16-drive test case this implements tests for the 4-disk and
5-disk special cases (dma devices cannot generically handle fewer than 2
sources), and adds a test for the D+Q case.

Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:09:28 -07:00
Dan Williams
58691d64c4 dmatest: add pq support
Test raid6 p+q operations with a simple "always multiply by 1" q
calculation to fit into dmatest's current destination verification
scheme.

Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:09:27 -07:00
Dan Williams
0a82a6239b async_tx: add support for asynchronous RAID6 recovery operations
async_raid6_2data_recov() recovers two data disk failures.

async_raid6_datap_recov() recovers a data disk and the P disk.

These routines are a port of the synchronous versions found in
drivers/md/raid6recov.c.  The primary difference is breaking out the xor
operations into separate calls to async_xor.  Two helper routines are
introduced to perform scalar multiplication where needed.
async_sum_product() multiplies two sources by scalar coefficients and
then sums (xor) the result.  async_mult() simply multiplies a single
source by a scalar.

This implementation also includes, in contrast to the original
synchronous-only code, special case handling for the 4-disk and 5-disk
array cases.  In these situations the default N-disk algorithm will
present 0-source or 1-source operations to dma devices.  To cover for
dma devices where the minimum source count is 2 we implement 4-disk and
5-disk handling in the recovery code.

[ Impact: asynchronous raid6 recovery routines for 2data and datap cases ]

Cc: Yuri Tikhonov <yur@emcraft.com>
Cc: Ilya Yanok <yanok@emcraft.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:09:27 -07:00
Dan Williams
b2f46fd8ef async_tx: add support for asynchronous GF multiplication
[ Based on an original patch by Yuri Tikhonov ]

This adds support for doing asynchronous GF multiplication by adding
two additional functions to the async_tx API:

 async_gen_syndrome() does simultaneous XOR and Galois field
    multiplication of sources.

 async_syndrome_val() validates the given source buffers against known P
    and Q values.

When a request is made to run async_pq against more than the hardware
maximum number of supported sources we need to reuse the previous
generated P and Q values as sources into the next operation.  Care must
be taken to remove Q from P' and P from Q'.  For example to perform a 5
source pq op with hardware that only supports 4 sources at a time the
following approach is taken:

p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08}))
p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10}))

p' = p + q + q + src4 = p + src4
q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4

Note: 4 is the minimum acceptable maxpq; otherwise we punt to the
synchronous software path.

The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as
sources (in the above manner) and fill the remaining slots up to maxpq
with the new sources/coefficients.
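
The continuation arithmetic above can be checked with a small byte-wise
model in plain C. This is a sketch under stated assumptions: gf_mul() uses
the RAID-6 field polynomial 0x11d, each source is a single byte, and
pq(), pq5_continued() and pq5_direct() are hypothetical helpers, not the
async_tx API.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply in GF(2^8) with polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;

    while (b) {
        if (b & 1)
            r ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
        b >>= 1;
    }
    return r;
}

/* One PQ pass: p is the xor of the sources, q the coefficient-weighted xor. */
static void pq(const uint8_t *src, const uint8_t *coef, int n,
               uint8_t *p, uint8_t *q)
{
    *p = 0;
    *q = 0;
    for (int i = 0; i < n; i++) {
        *p ^= src[i];
        *q ^= gf_mul(coef[i], src[i]);
    }
}

/* 5 sources on "hardware" limited to 4: reuse p and q as sources. */
static void pq5_continued(const uint8_t src[5], uint8_t *p, uint8_t *q)
{
    const uint8_t coef1[4] = { 0x01, 0x02, 0x04, 0x08 };
    const uint8_t coef2[4] = { 0x00, 0x01, 0x00, 0x10 };
    uint8_t p0, q0;

    pq(src, coef1, 4, &p0, &q0);

    /* Sources for the second pass: p, q, q, src4 (q twice cancels in p'). */
    const uint8_t cont[4] = { p0, q0, q0, src[4] };
    pq(cont, coef2, 4, p, q);
}

/* Reference: all 5 sources in a single pass. */
static void pq5_direct(const uint8_t src[5], uint8_t *p, uint8_t *q)
{
    const uint8_t coef[5] = { 0x01, 0x02, 0x04, 0x08, 0x10 };

    pq(src, coef, 5, p, q);
}
```

Both routes yield the same P and Q for any input bytes, because xor-ing q
in twice removes it from p' and the {00} coefficients keep p and the
second q out of q'.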

Note1: Some devices have native support for P+Q continuation and can skip
this extra work.  Devices with this capability can advertise it with
dma_set_maxpq.  It is up to each driver how to handle the
DMA_PREP_CONTINUE flag.

Note2: The api supports disabling the generation of P when generating Q;
this is ignored by the synchronous path but is implemented by some dma
devices to save unnecessary writes.  In this case the continuation
algorithm is simplified to only reuse Q as a source.

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:09:27 -07:00
Dan Williams
95475e5711 async_tx: remove walk of tx->parent chain in dma_wait_for_async_tx
We currently walk the parent chain when waiting for a given tx to
complete; however, this walk may race with the driver cleanup routine.
The routines in async_raid6_recov.c may fall back to the synchronous
path at any point so we need to be prepared to call async_tx_quiesce()
(which calls dma_wait_for_async_tx).  To remove the ->parent walk we
guarantee that every time a dependency is attached ->issue_pending() is
invoked, then we can simply poll the initial descriptor until
completion.

This also allows for a lighter weight 'issue pending' implementation as
there is no longer a requirement to iterate through all the channels'
->issue_pending() routines as long as operations have been submitted in
an ordered chain.  async_tx_issue_pending() is added for this case.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:09:27 -07:00
Dan Williams
af1f951eb6 async_tx: kill needless module_{init|exit}
If module_init and module_exit are nops then neither needs to be defined.

[ Impact: pure cleanup ]

Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29 19:09:26 -07:00