Commit Graph

454969 Commits

Author SHA1 Message Date
Chen Gang
71d932d99d crypto: qce - Let 'DEV_QCE' depend on both HAS_DMA and HAS_IOMEM
'DEV_QCE' needs both HAS_DMA and HAS_IOMEM, so let it depend on them.

The related error (with allmodconfig under score):

    MODPOST 1365 modules
  ERROR: "devm_ioremap_resource" [drivers/crypto/qce/qcrypto.ko] undefined!
  ERROR: "dma_map_sg" [drivers/crypto/qce/qcrypto.ko] undefined!
  ERROR: "dma_set_mask" [drivers/crypto/qce/qcrypto.ko] undefined!
  ERROR: "dma_supported" [drivers/crypto/qce/qcrypto.ko] undefined!
  ERROR: "dma_unmap_sg" [drivers/crypto/qce/qcrypto.ko] undefined!

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:31:37 +08:00
Horia Geanta
a60384dfff crypto: caam - set DK (Decrypt Key) bit only for AES accelerator
AES currently shares descriptor creation functions with DES and 3DES.
DK bit is set in all cases, however it is valid only for
the AES accelerator.

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:31:36 +08:00
Horia Geanta
de0e35ec2b crypto: caam - fix uninitialized state->buf_dma field
state->buf_dma not being initialized can cause try_buf_map_to_sec4_sg
to try to free unallocated DMA memory:

caam_jr ffe301000.jr: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x000000002eb15068] [size=0 bytes]
WARNING: at lib/dma-debug.c:1080
Modules linked in: caamhash(+) [last unloaded: caamhash]
CPU: 0 PID: 1387 Comm: cryptomgr_test Tainted: G        W     3.16.0-rc1 #23
task: eed24e90 ti: eebd0000 task.ti: eebd0000
NIP: c02889fc LR: c02889fc CTR: c02d7020
REGS: eebd1a50 TRAP: 0700   Tainted: G        W      (3.16.0-rc1)
MSR: 00029002 <CE,EE,ME>  CR: 44042082  XER: 00000000

GPR00: c02889fc eebd1b00 eed24e90 0000008d c1de3478 c1de382c 00000000 00029002
GPR08: 00000007 00000000 01660000 00000000 24042082 00000000 c07a1900 eeda2a40
GPR16: 005d62a0 c078ad4c 00000000 eeb15068 c07e1e10 c0da1180 00029002 c0d97408
GPR24: c62497a0 00000014 eebd1b58 00000000 c078ad4c ee130210 00000000 2eb15068
NIP [c02889fc] check_unmap+0x8ac/0xab0
LR [c02889fc] check_unmap+0x8ac/0xab0
Call Trace:
[eebd1b00] [c02889fc] check_unmap+0x8ac/0xab0 (unreliable)
--- Exception: 0 at   (null)
    LR =   (null)
[eebd1b50] [c0288c78] debug_dma_unmap_page+0x78/0x90 (unreliable)
[eebd1bd0] [f956f738] ahash_final_ctx+0x6d8/0x7b0 [caamhash]
[eebd1c30] [c022ff4c] __test_hash+0x2ac/0x6c0
[eebd1de0] [c0230388] test_hash+0x28/0xb0
[eebd1e00] [c02304a4] alg_test_hash+0x94/0xc0
[eebd1e20] [c022fa94] alg_test+0x114/0x2e0
[eebd1ea0] [c022cd1c] cryptomgr_test+0x4c/0x60
[eebd1eb0] [c00497a4] kthread+0xc4/0xe0
[eebd1f40] [c000f2fc] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
41de01c8 80a9002c 2f850000 40fe0008 80a90008 80fa0018 3c60c06d 811a001c
3863f4a4 813a0020 815a0024 4830cd01 <0fe00000> 81340048 2f890000 40feff48

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:31:36 +08:00
Horia Geanta
76b99080cc crypto: caam - fix uninitialized edesc->dst_dma field
dst_dma not being properly initialized causes ahash_done_ctx_dst
to try to free unallocated DMA memory:

caam_jr ffe301000.jr: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x0000000006513340] [size=28 bytes]
WARNING: at lib/dma-debug.c:1080
Modules linked in: caamhash(+) [last unloaded: caamhash]
CPU: 0 PID: 1373 Comm: cryptomgr_test Tainted: G        W     3.16.0-rc1 #23
task: ee23e350 ti: effd2000 task.ti: ee1f6000
NIP: c02889fc LR: c02889fc CTR: c02d7020
REGS: effd3d50 TRAP: 0700   Tainted: G        W      (3.16.0-rc1)
MSR: 00029002 <CE,EE,ME>  CR: 44048082  XER: 00000000

GPR00: c02889fc effd3e00 ee23e350 0000008e c1de3478 c1de382c 00000000 00029002
GPR08: 00000007 00000000 01660000 00000000 24048082 00000000 00000018 c07db080
GPR16: 00000006 00000100 0000002c eeb4a7e0 c07e1e10 c0da1180 00029002 c0d9b3c8
GPR24: eeb4a7c0 00000000 effd3e58 00000000 c078ad4c ee130210 00000000 06513340
NIP [c02889fc] check_unmap+0x8ac/0xab0
LR [c02889fc] check_unmap+0x8ac/0xab0
Call Trace:
[effd3e00] [c02889fc] check_unmap+0x8ac/0xab0 (unreliable)
[effd3e50] [c0288c78] debug_dma_unmap_page+0x78/0x90
[effd3ed0] [f94b89ec] ahash_done_ctx_dst+0x11c/0x200 [caamhash]
[effd3f00] [c0429640] caam_jr_dequeue+0x1c0/0x280
[effd3f50] [c002c94c] tasklet_action+0xcc/0x1a0
[effd3f80] [c002cb30] __do_softirq+0x110/0x220
[effd3fe0] [c002cf34] irq_exit+0xa4/0xe0
[effd3ff0] [c000d834] call_do_irq+0x24/0x3c
[ee1f7ae0] [c000489c] do_IRQ+0x8c/0x110
[ee1f7b00] [c000f86c] ret_from_except+0x0/0x18
--- Exception: 501 at _raw_spin_unlock_irq+0x30/0x50
    LR = _raw_spin_unlock_irq+0x2c/0x50
[ee1f7bd0] [c0590158] wait_for_common+0xb8/0x170
[ee1f7c10] [c059024c] wait_for_completion_interruptible+0x1c/0x40
[ee1f7c20] [c022fc78] do_one_async_hash_op.isra.2.part.3+0x18/0x40
[ee1f7c30] [c022ffb8] __test_hash+0x318/0x6c0
[ee1f7de0] [c0230388] test_hash+0x28/0xb0
[ee1f7e00] [c02304a4] alg_test_hash+0x94/0xc0
[ee1f7e20] [c022fa94] alg_test+0x114/0x2e0
[ee1f7ea0] [c022cd1c] cryptomgr_test+0x4c/0x60
[ee1f7eb0] [c00497a4] kthread+0xc4/0xe0
[ee1f7f40] [c000f2fc] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
41de01c8 80a9002c 2f850000 40fe0008 80a90008 80fa0018 3c60c06d 811a001c
3863f4a4 813a0020 815a0024 4830cd01 <0fe00000> 81340048 2f890000 40feff48

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:31:36 +08:00
Horia Geanta
45e9af78b1 crypto: caam - fix uninitialized S/G table size in ahash_digest
Not initializing edesc->sec4_sg_bytes correctly causes ahash_done
callback to free unallocated DMA memory:

caam_jr ffe301000.jr: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x300900000000b44d] [size=46158 bytes]
WARNING: at lib/dma-debug.c:1080
Modules linked in: caamhash(+) [last unloaded: caamhash]
CPU: 0 PID: 1358 Comm: cryptomgr_test Tainted: G        W     3.16.0-rc1 #23
task: eed04250 ti: effd2000 task.ti: c6046000
NIP: c02889fc LR: c02889fc CTR: c02d7020
REGS: effd3d50 TRAP: 0700   Tainted: G        W      (3.16.0-rc1)
MSR: 00029002 <CE,EE,ME>  CR: 44048082  XER: 00000000

GPR00: c02889fc effd3e00 eed04250 00000091 c1de3478 c1de382c 00000000 00029002
GPR08: 00000007 00000000 01660000 00000000 22048082 00000000 00000018 c07db080
GPR16: 00000006 00000100 0000002c ee2497e0 c07e1e10 c0da1180 00029002 c0d912c8
GPR24: 00000014 ee2497c0 effd3e58 00000000 c078ad4c ee130210 30090000 0000b44d
NIP [c02889fc] check_unmap+0x8ac/0xab0
LR [c02889fc] check_unmap+0x8ac/0xab0
Call Trace:
[effd3e00] [c02889fc] check_unmap+0x8ac/0xab0 (unreliable)
[effd3e50] [c0288c78] debug_dma_unmap_page+0x78/0x90
[effd3ed0] [f9404fec] ahash_done+0x11c/0x190 [caamhash]
[effd3f00] [c0429640] caam_jr_dequeue+0x1c0/0x280
[effd3f50] [c002c94c] tasklet_action+0xcc/0x1a0
[effd3f80] [c002cb30] __do_softirq+0x110/0x220
[effd3fe0] [c002cf34] irq_exit+0xa4/0xe0
[effd3ff0] [c000d834] call_do_irq+0x24/0x3c
[c6047ae0] [c000489c] do_IRQ+0x8c/0x110
[c6047b00] [c000f86c] ret_from_except+0x0/0x18
--- Exception: 501 at _raw_spin_unlock_irq+0x30/0x50
    LR = _raw_spin_unlock_irq+0x2c/0x50
[c6047bd0] [c0590158] wait_for_common+0xb8/0x170
[c6047c10] [c059024c] wait_for_completion_interruptible+0x1c/0x40
[c6047c20] [c022fc78] do_one_async_hash_op.isra.2.part.3+0x18/0x40
[c6047c30] [c022ff98] __test_hash+0x2f8/0x6c0
[c6047de0] [c0230388] test_hash+0x28/0xb0
[c6047e00] [c0230458] alg_test_hash+0x48/0xc0
[c6047e20] [c022fa94] alg_test+0x114/0x2e0
[c6047ea0] [c022cd1c] cryptomgr_test+0x4c/0x60
[c6047eb0] [c00497a4] kthread+0xc4/0xe0
[c6047f40] [c000f2fc] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
41de01c8 80a9002c 2f850000 40fe0008 80a90008 80fa0018 3c60c06d 811a001c
3863f4a4 813a0020 815a0024 4830cd01 <0fe00000> 81340048 2f890000 40feff48

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:31:36 +08:00
Horia Geanta
bc9e05f9e7 crypto: caam - fix DMA direction mismatch in ahash_done_ctx_src
caam_jr ffe301000.jr: DMA-API: device driver frees DMA memory with different direction [device address=0x0000000006271dac] [size=28 bytes] [mapped with DMA_TO_DEVICE] [unmapped with DMA_FROM_DEVICE]
------------[ cut here ]------------
WARNING: at lib/dma-debug.c:1131
Modules linked in: caamhash(+) [last unloaded: caamhash]
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W     3.16.0-rc1 #23
task: c0789380 ti: effd2000 task.ti: c07d6000
NIP: c02885cc LR: c02885cc CTR: c02d7020
REGS: effd3d50 TRAP: 0700   Tainted: G        W      (3.16.0-rc1)
MSR: 00021002 <CE,ME>  CR: 44048082  XER: 00000000

GPR00: c02885cc effd3e00 c0789380 000000c6 c1de3478 c1de382c 00000000 00021002
GPR08: 00000007 00000000 01660000 0000012f 84048082 00000000 00000018 c07db080
GPR16: 00000006 00000100 0000002c c62517a0 c07e1e10 c0da1180 00029002 c0d95f88
GPR24: c07a0000 c07a4acc effd3e58 ee322bc0 0000001c ee130210 00000000 c0d95f80
NIP [c02885cc] check_unmap+0x47c/0xab0
LR [c02885cc] check_unmap+0x47c/0xab0
Call Trace:
[effd3e00] [c02885cc] check_unmap+0x47c/0xab0 (unreliable)
[effd3e50] [c0288c78] debug_dma_unmap_page+0x78/0x90
[effd3ed0] [f9624d84] ahash_done_ctx_src+0xa4/0x200 [caamhash]
[effd3f00] [c0429640] caam_jr_dequeue+0x1c0/0x280
[effd3f50] [c002c94c] tasklet_action+0xcc/0x1a0
[effd3f80] [c002cb30] __do_softirq+0x110/0x220
[effd3fe0] [c002cf34] irq_exit+0xa4/0xe0
[effd3ff0] [c000d834] call_do_irq+0x24/0x3c
[c07d7d50] [c000489c] do_IRQ+0x8c/0x110
[c07d7d70] [c000f86c] ret_from_except+0x0/0x18
--- Exception: 501 at _raw_spin_unlock_irq+0x30/0x50
    LR = _raw_spin_unlock_irq+0x2c/0x50
[c07d7e40] [c0053084] finish_task_switch+0x74/0x130
[c07d7e60] [c058f278] __schedule+0x238/0x620
[c07d7f70] [c058fb50] schedule_preempt_disabled+0x10/0x20
[c07d7f80] [c00686a0] cpu_startup_entry+0x100/0x1b0
[c07d7fb0] [c074793c] start_kernel+0x338/0x34c
[c07d7ff0] [c00003d8] set_ivor+0x140/0x17c
Instruction dump:
7d495214 7d294214 806a0010 80c90010 811a001c 813a0020 815a0024 90610008
3c60c06d 90c1000c 3863f764 4830d131 <0fe00000> 3c60c06d 3863f0f4 4830d121
---[ end trace db1fae088c75c280 ]---
Mapped at:
 [<f96251bc>] ahash_final_ctx+0x14c/0x7b0 [caamhash]
 [<c022ff4c>] __test_hash+0x2ac/0x6c0
 [<c0230388>] test_hash+0x28/0xb0
 [<c02304a4>] alg_test_hash+0x94/0xc0
 [<c022fa94>] alg_test+0x114/0x2e0

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:31:35 +08:00
Horia Geanta
ef62b2310b crypto: caam - fix DMA direction mismatch in ahash_done_ctx_dst
caam_jr ffe301000.jr: DMA-API: device driver frees DMA memory with different direction [device address=0x00000000062ad1ac] [size=28 bytes] [mapped with DMA_FROM_DEVICE] [unmapped with DMA_TO_DEVICE]
------------[ cut here ]------------
WARNING: at lib/dma-debug.c:1131
Modules linked in: caamhash(+) [last unloaded: caamhash]
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W     3.16.0-rc1 #23
task: c0789380 ti: effd2000 task.ti: c07d6000
NIP: c02885cc LR: c02885cc CTR: c02d7020
REGS: effd3d50 TRAP: 0700   Tainted: G        W      (3.16.0-rc1)
MSR: 00021002 <CE,ME>  CR: 44048082  XER: 00000000

GPR00: c02885cc effd3e00 c0789380 000000c6 c1de3478 c1de382c 00000000 00021002
GPR08: 00000007 00000000 01660000 0000012f 84048082 00000000 00000018 c07db080
GPR16: 00000006 00000100 0000002c eee567e0 c07e1e10 c0da1180 00029002 c0d96708
GPR24: c07a0000 c07a4acc effd3e58 ee29b140 0000001c ee130210 00000000 c0d96700
NIP [c02885cc] check_unmap+0x47c/0xab0
LR [c02885cc] check_unmap+0x47c/0xab0
Call Trace:
[effd3e00] [c02885cc] check_unmap+0x47c/0xab0 (unreliable)
[effd3e50] [c0288c78] debug_dma_unmap_page+0x78/0x90
[effd3ed0] [f9350974] ahash_done_ctx_dst+0xa4/0x200 [caamhash]
[effd3f00] [c0429640] caam_jr_dequeue+0x1c0/0x280
[effd3f50] [c002c94c] tasklet_action+0xcc/0x1a0
[effd3f80] [c002cb30] __do_softirq+0x110/0x220
[effd3fe0] [c002cf34] irq_exit+0xa4/0xe0
[effd3ff0] [c000d834] call_do_irq+0x24/0x3c
[c07d7d50] [c000489c] do_IRQ+0x8c/0x110
[c07d7d70] [c000f86c] ret_from_except+0x0/0x18
--- Exception: 501 at _raw_spin_unlock_irq+0x30/0x50
    LR = _raw_spin_unlock_irq+0x2c/0x50
[c07d7e40] [c0053084] finish_task_switch+0x74/0x130
[c07d7e60] [c058f278] __schedule+0x238/0x620
[c07d7f70] [c058fb50] schedule_preempt_disabled+0x10/0x20
[c07d7f80] [c00686a0] cpu_startup_entry+0x100/0x1b0
[c07d7fb0] [c074793c] start_kernel+0x338/0x34c
[c07d7ff0] [c00003d8] set_ivor+0x140/0x17c
Instruction dump:
7d495214 7d294214 806a0010 80c90010 811a001c 813a0020 815a0024 90610008
3c60c06d 90c1000c 3863f764 4830d131 <0fe00000> 3c60c06d 3863f0f4 4830d121
---[ end trace db1fae088c75c270 ]---
Mapped at:
 [<f9352454>] ahash_update_first+0x5b4/0xba0 [caamhash]
 [<c022ff28>] __test_hash+0x288/0x6c0
 [<c0230388>] test_hash+0x28/0xb0
 [<c02304a4>] alg_test_hash+0x94/0xc0
 [<c022fa94>] alg_test+0x114/0x2e0

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:31:35 +08:00
Horia Geanta
e11aa9f135 crypto: caam - fix DMA unmapping error in hash_digest_key
Key being hashed is unmapped using the digest size instead of
initial length:

caam_jr ffe301000.jr: DMA-API: device driver frees DMA memory with different size [device address=0x000000002eeedac0] [map size=80 bytes] [unmap size=20 bytes]
------------[ cut here ]------------
WARNING: at lib/dma-debug.c:1090
Modules linked in: caamhash(+)
CPU: 0 PID: 1327 Comm: cryptomgr_test Not tainted 3.16.0-rc1 #23
task: eebda5d0 ti: ee26a000 task.ti: ee26a000
NIP: c0288790 LR: c0288790 CTR: c02d7020
REGS: ee26ba30 TRAP: 0700   Not tainted  (3.16.0-rc1)
MSR: 00021002 <CE,ME>  CR: 44022082  XER: 00000000

GPR00: c0288790 ee26bae0 eebda5d0 0000009f c1de3478 c1de382c 00000000 00021002
GPR08: 00000007 00000000 01660000 0000012f 82022082 00000000 c07a1900 eeda29c0
GPR16: 00000000 c61deea0 000c49a0 00000260 c07e1e10 c0da1180 00029002 c0d9ef08
GPR24: c07a0000 c07a4acc ee26bb38 ee2765c0 00000014 ee130210 00000000 00000014
NIP [c0288790] check_unmap+0x640/0xab0
LR [c0288790] check_unmap+0x640/0xab0
Call Trace:
[ee26bae0] [c0288790] check_unmap+0x640/0xab0 (unreliable)
[ee26bb30] [c0288c78] debug_dma_unmap_page+0x78/0x90
[ee26bbb0] [f929c3d4] ahash_setkey+0x374/0x720 [caamhash]
[ee26bc30] [c022fec8] __test_hash+0x228/0x6c0
[ee26bde0] [c0230388] test_hash+0x28/0xb0
[ee26be00] [c0230458] alg_test_hash+0x48/0xc0
[ee26be20] [c022fa94] alg_test+0x114/0x2e0
[ee26bea0] [c022cd1c] cryptomgr_test+0x4c/0x60
[ee26beb0] [c00497a4] kthread+0xc4/0xe0
[ee26bf40] [c000f2fc] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
41de03e8 83da0020 3c60c06d 83fa0024 3863f520 813b0020 815b0024 80fa0018
811a001c 93c10008 93e1000c 4830cf6d <0fe00000> 3c60c06d 3863f0f4 4830cf5d
---[ end trace db1fae088c75c26c ]---
Mapped at:
 [<f929c15c>] ahash_setkey+0xfc/0x720 [caamhash]
 [<c022fec8>] __test_hash+0x228/0x6c0
 [<c0230388>] test_hash+0x28/0xb0
 [<c0230458>] alg_test_hash+0x48/0xc0
 [<c022fa94>] alg_test+0x114/0x2e0

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:28:44 +08:00
Horia Geanta
ce57208528 crypto: caam - fix "failed to check map error" DMA warnings
Use dma_mapping_error for every dma_map_single / dma_map_page.

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:28:43 +08:00
Horia Geanta
71c65f7c90 crypto: caam - fix typo in dma_mapping_error
dma_mapping_error checks for an incorrect DMA address:
s/ctx->sh_desc_enc_dma/ctx->sh_desc_dec_dma

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:28:43 +08:00
Horia Geanta
a2ac287e9e crypto: caam - set coherent_dma_mask
Replace dma_set_mask with dma_set_mask_and_coherent, since both
streaming and coherent DMA mappings are being used.

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:28:42 +08:00
Horia Geanta
29b77e5dd8 crypto: testmgr - avoid DMA mapping from text, rodata, stack
With DMA_API_DEBUG set, following warnings are emitted
(tested on CAAM accelerator):
DMA-API: device driver maps memory from kernel text or rodata
DMA-API: device driver maps memory from stack
and the culprits are:
-key in __test_aead and __test_hash
-result in __test_hash

MAX_KEYLEN is changed to accommodate maximum key length from
existing test vectors in crypto/testmgr.h (131 bytes) and rounded.

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:28:40 +08:00
Tom Lendacky
126ae9adc1 crypto: ccp - Base AXI DMA cache settings on device tree
The default cache operations for ARM64 were changed during 3.15.
To use coherent operations a "dma-coherent" device tree property
is required.  If that property is not present in the device tree
node then the non-coherent operations are assigned for the device.

Add support to the ccp driver to assign the AXI DMA cache settings
based on whether the "dma-coherent" property is present in the device
node.  If present, use settings that work with the caches.  If not
present, use settings that do not look at the caches.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-23 21:28:38 +08:00
Fengguang Wu
96956aef2f crypto: drbg - drbg_exit() can be static
CC: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-10 16:52:04 +08:00
Dan Carpenter
ac1a2b49ea crypto: qat - remove an unneeded cast
The cast to (unsigned int *) doesn't hurt anything but it is pointless.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-10 16:50:36 +08:00
Ruchika Gupta
35af640386 crypto: caam - Check for CAAM block presence before registering with crypto layer
The layer which registers with the crypto API should check for the presence of
the CAAM device it is going to use.  If the platform's device tree doesn't have
the required CAAM node, the layer should return an error and not register the
algorithms with crypto API layer.

Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-10 16:50:33 +08:00
Stephan Mueller
5b635e280e crypto: drbg - HMAC-SHA1 DRBG has crypto strength of 128 bits
The patch corrects the security strength of the HMAC-SHA1 DRBG to 128
bits. This strength defines the size of the seed required for the DRBG.
Thus, the patch lowers the seeding requirement from 256 bits to 128 bits
for HMAC-SHA1.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-08 21:18:25 +08:00
Stephan Mueller
27e4de2bd1 crypto: drbg - Mix a time stamp into DRBG state
The current locking approach of the DRBG tries to keep the protected
code paths very minimal. It is therefore possible that two threads query
one DRBG instance at the same time. When thread A requests random
numbers, a shadow copy of the DRBG state is created upon which the
request for A is processed. After finishing the state for A's request is
merged back into the DRBG state. If now thread B requests random numbers
from the same DRBG after the request for thread A is received, but
before A's shadow state is merged back, the random numbers for B will be
identical to the ones for A. Please note that the time window is very
small for this scenario.

To prevent that there is even a theoretical chance for thread A and B
having the same DRBG state, the current time stamp is provided as
additional information string for each new request.

The addition of the time stamp as additional information string implies
that now all generate functions must be capable to process a linked
list with additional information strings instead of a scalar.

CC: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-08 21:18:25 +08:00
Stephan Mueller
4f15071879 crypto: drbg - Select correct DRBG core for stdrng
When the DRBG is initialized, the core is looked up using the DRBG name.
The name that can be used for the lookup is registered in
cra_driver_name. The cra_name value contains stdrng.

Thus, the lookup code must use crypto_tfm_alg_driver_name to obtain the
precise DRBG name and select the correct DRBG.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-08 21:18:24 +08:00
Stephan Mueller
72e7c25aa6 crypto: drbg - Call CTR DRBG DF function only once
The CTR DRBG requires the update function to be called twice when
generating a random number. In both cases, update function must process
the additional information string by using the DF function. As the DF
produces the same result in both cases, we can save one invocation of
the DF function when the first DF function result is reused.

The result of the DF function is stored in the scratchpad storage. The
patch ensures that the scratchpad is not cleared when we want to reuse
the DF result. For achieving this, the CTR DRBG update function must
know by whom and in which scenario it is called. This information is
provided with the reseed parameter to the update function.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-08 21:18:24 +08:00
Stephan Mueller
a9089571f2 crypto: drbg - Fix format string for debugging statements
The initial format strings caused warnings on several architectures. The
updated format strings now match the variable types.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
CC: Joe Perches <joe@perches.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-08 21:18:23 +08:00
Stephan Mueller
e25e47ec3d crypto: drbg - cleanup of preprocessor macros
The structure used to construct the module description line was marked
problematic by the sparse code analysis tool. The module line
description now does not contain any ifdefs to prevent error reports
from sparse.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-08 21:18:23 +08:00
Stanimir Varbanov
44b933b9e1 crypto: qce - add dependancy to Kconfig
Make qce crypto driver depend on ARCH_QCOM and make
possible to test driver compilation.

Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-08 21:18:22 +08:00
Stanimir Varbanov
58a6535f1a crypto: qce - fix sparse warnings
Fix few sparse warnings of type:
- sparse: incorrect type in argument
- sparse: incorrect type in initializer

Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-08 21:18:21 +08:00
Nitesh Narayan Lal
178f827a60 crypto: caam - Enabling multiple caam debug support for C29x platform
In the current setup debug file system enables us to debug the operational
details for only one CAAM. This patch adds the support for debugging multiple
CAAM's.

Signed-off-by: Nitesh Narayan Lal <b44382@freescale.com>
Signed-off-by: Vakul Garg <b16394@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-08 21:06:36 +08:00
Nitesh Narayan Lal
6f39da1cad crypto: dts - Addition of missing SEC compatibile property in c29x device tree
The driver is compatible with SEC version 4.0, which was missing from
device tree resulting that the caam driver doesn't gets probed. Since
SEC is backward compatible with older versions, so this patch adds those
missing versions in c29x device tree.

Signed-off-by: Nitesh Narayan Lal <b44382@freescale.com>
Signed-off-by: Vakul Garg <b16394@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-08 21:06:32 +08:00
Herbert Xu
f2c89a10de crypto: drbg - Use Kconfig to ensure at least one RNG option is set
This patch removes the build-time test that ensures at least one RNG
is set.  Instead we will simply not build drbg if no options are set
through Kconfig.

This also fixes a typo in the name of the Kconfig option CRYTPO_DRBG
(should be CRYPTO_DRBG).

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-04 22:15:08 +08:00
Stephan Mueller
8c98716601 crypto: drbg - use of kernel linked list
The DRBG-style linked list to manage input data that is fed into the
cipher invocations is replaced with the kernel linked list
implementation.

The change is transparent to users of the interfaces offered by the
DRBG. Therefore, no changes to the testmgr code is needed.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-04 21:09:20 +08:00
Stephan Mueller
8fecaad77f crypto: drbg - fix memory corruption for AES192
For the CTR DRBG, the drbg_state->scratchpad temp buffer (i.e. the
memory location immediately before the drbg_state->tfm variable
is the buffer that the BCC function operates on. BCC operates
blockwise. Making the temp buffer drbg_statelen(drbg) in size is
sufficient when the DRBG state length is a multiple of the block
size. For AES192 this is not the case and the length for temp is
insufficient (yes, that also means for such ciphers, the final
output of all BCC rounds are truncated before used to update the
state of the DRBG!!).

The patch enlarges the temp buffer from drbg_statelen to
drbg_statelen + drbg_blocklen to have sufficient space.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-04 11:04:53 +08:00
Arnd Bergmann
e1f8859ee2 crypto: ux500 - make interrupt mode plausible
The interrupt handler in the ux500 crypto driver has an obviously
incorrect way to access the data buffer, which for a while has
caused this build warning:

../ux500/cryp/cryp_core.c: In function 'cryp_interrupt_handler':
../ux500/cryp/cryp_core.c:234:5: warning: passing argument 1 of '__fswab32' makes integer from pointer without a cast [enabled by default]
     writel_relaxed(ctx->indata,
     ^
In file included from ../include/linux/swab.h:4:0,
                 from ../include/uapi/linux/byteorder/big_endian.h:12,
                 from ../include/linux/byteorder/big_endian.h:4,
                 from ../arch/arm/include/uapi/asm/byteorder.h:19,
                 from ../include/asm-generic/bitops/le.h:5,
                 from ../arch/arm/include/asm/bitops.h:340,
                 from ../include/linux/bitops.h:33,
                 from ../include/linux/kernel.h:10,
                 from ../include/linux/clk.h:16,
                 from ../drivers/crypto/ux500/cryp/cryp_core.c:12:
../include/uapi/linux/swab.h:57:119: note: expected '__u32' but argument is of type 'const u8 *'
 static inline __attribute_const__ __u32 __fswab32(__u32 val)

There are at least two, possibly three problems here:
a) when writing into the FIFO, we copy the pointer rather than the
   actual data we want to give to the hardware
b) the data pointer is an array of 8-bit values, while the FIFO
   is 32-bit wide, so both the read and write access fail to do
   a proper type conversion
c) This seems incorrect for big-endian kernels, on which we need to
   byte-swap any register access, but not normally FIFO accesses,
   at least the DMA case doesn't do it either.

This converts the bogus loop to use the same readsl/writesl pair
that we use for the two other modes (DMA and polling). This is
more efficient and consistent, and probably correct for endianess.

The bug has existed since the driver was first merged, and was
probably never detected because nobody tried to use interrupt mode.
It might make sense to backport this fix to stable kernels, depending
on how the crypto maintainers feel about that.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: linux-crypto@vger.kernel.org
Cc: Fabio Baltieri <fabio.baltieri@linaro.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03 21:42:10 +08:00
Luca Clementi
263a8df0d3 crypto: tcrypt - print cra driver name in tcrypt tests output
Print the driver name that is being tested. The driver name can be
inferred parsing /proc/crypto but having it in the output is
clearer

Signed-off-by: Luca Clementi <luca.clementi@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03 21:42:09 +08:00
Stanimir Varbanov
14748d7c56 ARM: DT: qcom: Add Qualcomm crypto driver binding document
Here is Qualcomm crypto driver device tree binding documentation
to used as a reference example.

Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03 21:42:07 +08:00
Stanimir Varbanov
c672752d9c crypto: qce - Build Qualcomm crypto driver
Modify crypto Kconfig and Makefile in order to build the qce
driver and adds qce Makefile as well.

Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03 21:42:03 +08:00
Stanimir Varbanov
ec8f5d8f6f crypto: qce - Qualcomm crypto engine driver
The driver is separated by functional parts. The core part
implements a platform driver probe and remove callbaks.
The probe enables clocks, checks crypto version, initialize
and request dma channels, create done tasklet and init
crypto queue and finally register the algorithms into crypto
core subsystem.

- DMA and SG helper functions
 implement dmaengine and sg-list helper functions used by
 other parts of the crypto driver.

- ablkcipher algorithms
 implementation of AES, DES and 3DES crypto API callbacks,
 the crypto register alg function, the async request handler
 and its dma done callback function.

- SHA and HMAC transforms
 implementation and registration of ahash crypto type.
 It includes sha1, sha256, hmac(sha1) and hmac(sha256).

- infrastructure to setup the crypto hw
 contains functions used to setup/prepare hardware registers for
 all algorithms supported by the crypto block. It also exports
 few helper functions needed by algorithms:
	- to check hardware status
	- to start crypto hardware
	- to translate data stream to big endian form

 Adds register addresses and bit/masks used by the driver
 as well.

Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03 21:40:27 +08:00
Jarod Wilson
002c77a48b crypto: fips - only panic on bad/missing crypto mod signatures
Per further discussion with NIST, the requirements for FIPS state that
we only need to panic the system on failed kernel module signature checks
for crypto subsystem modules. This moves the fips-mode-only module
signature check out of the generic module loading code, into the crypto
subsystem, at points where we can catch both algorithm module loads and
mode module loads. At the same time, make CONFIG_CRYPTO_FIPS dependent on
CONFIG_MODULE_SIG, as this is entirely necessary for FIPS mode.

v2: remove extraneous blank line, perform checks in static inline
function, drop no longer necessary fips.h include.

CC: "David S. Miller" <davem@davemloft.net>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: Stephan Mueller <stephan.mueller@atsec.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-07-03 21:38:32 +08:00
Tadeusz Struk
8f312d64b5 crypto: qat - Fix error path crash when no firmware is present
Firmware loader crashes when no firmware file is present.

Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-26 14:49:46 +08:00
Tadeusz Struk
d65071ecde crypto: qat - Fixed new checkpatch warnings
After updates to checkpatch new warnings pops up this patch fixes them.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Acked-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-26 14:49:43 +08:00
Tadeusz Struk
83530d818f crypto: qat - Updated Firmware Info Metadata
Updated Firmware Info Metadata

Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-26 14:49:35 +08:00
Tadeusz Struk
bce3cc61d3 crypto: qat - Fix random config build warnings
Fix random config build warnings:

Implicit-function-declaration ‘__raw_writel’
Cast to pointer from integer of different size [-Wint-to-pointer-cast]

Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-26 14:49:29 +08:00
Stephan Mueller
c0eedf8034 crypto: drbg - simplify ordering of linked list in drbg_ctr_df
As reported by a static code analyzer, the code for the ordering of
the linked list can be simplified.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-26 14:46:31 +08:00
Eric Dumazet
de18cd4b71 crypto: lzo - use kvfree() helper
kvfree() helper is now available, use it instead of open code it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25 21:51:53 +08:00
Jussi Kivilinna
5e50d43d65 crypto: des3_ede-x86_64 - fix parse warning
Patch fixes following sparse warning:

  CHECK   arch/x86/crypto/des3_ede_glue.c
arch/x86/crypto/des3_ede_glue.c:308:52: warning: restricted __be64 degrades to integer
arch/x86/crypto/des3_ede_glue.c:309:52: warning: restricted __be64 degrades to integer
arch/x86/crypto/des3_ede_glue.c:310:52: warning: restricted __be64 degrades to integer
arch/x86/crypto/des3_ede_glue.c:326:44: warning: restricted __be64 degrades to integer

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25 21:38:43 +08:00
Ruchika Gupta
1da2be33ad crypto: caam - Correct the dma mapping for sg table
At few places in caamhash and caamalg, after allocating a dmable
buffer for sg table , the buffer was being modified.  As per
definition of DMA_FROM_DEVICE ,afer allocation the memory should
be treated as read-only by the driver. This patch shifts the
allocation of dmable buffer for sg table after it is populated
by the  driver, making it read-only as per the DMA API's requirement.

Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25 21:38:41 +08:00
Ruchika Gupta
ef94b1d834 crypto: caam - Add definition of rd/wr_reg64 for little endian platform
CAAM IP has certain 64 bit registers . 32 bit architectures cannot force
atomic-64 operations.  This patch adds definition of these atomic-64
operations for little endian platforms. The definitions which existed
previously were for big endian platforms.

Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25 21:38:40 +08:00
Ruchika Gupta
17157c90a8 crypto: caam - Configuration for platforms with virtualization enabled in CAAM
For platforms with virtualization enabled

    1. The job ring registers can be written to only is the job ring has been
       started i.e STARTR bit in JRSTART register is 1

    2. For DECO's under direct software control, with virtualization enabled
       PL, BMT, ICID and SDID values need to be provided. These are provided by
       selecting a Job ring in start mode whose parameters would be used for the
       DECO access programming.

Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25 21:38:39 +08:00
Ruchika Gupta
eb1139cd43 crypto: caam - Correct definition of registers in memory map
Some registers like SECVID, CHAVID, CHA Revision Number,
CTPR were defined as 64 bit resgisters.  The IP provides
a DWT bit(Double word Transpose) to transpose the two words when
a double word register is accessed. However setting this bit
would also affect the operation of job descriptors as well as
other registers which are truly double word in nature.
So, for the IP to work correctly on big-endian as well as
little-endian SoC's, change is required to access all 32 bit
registers as 32 bit quantities.

Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-25 21:38:39 +08:00
Herbert Xu
e60b244281 crypto: qat - Fix build problem with O=
qat adds -I to the ccflags.  Unfortunately it uses CURDIR which
breaks when make is invoked with O=.  This patch replaces CURDIR
with $(src) which should work with/without O=.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-23 21:37:53 +08:00
Ard Biesheuvel
6c9e3dcd36 crypto: testmgr - add 4 more test vectors for GHASH
This adds 4 test vectors for GHASH (of which one for chunked mode), making
a total of 5.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-21 01:59:29 +08:00
chandramouli narayanan
22cddcc7df crypto: aes - AES CTR x86_64 "by8" AVX optimization
This patch introduces "by8" AES CTR mode AVX optimization inspired by
Intel Optimized IPSEC Cryptograhpic library. For additional information,
please see:
http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=22972

The functions aes_ctr_enc_128_avx_by8(), aes_ctr_enc_192_avx_by8() and
aes_ctr_enc_256_avx_by8() are adapted from
Intel Optimized IPSEC Cryptographic library. When both AES and AVX features
are enabled in a platform, the glue code in AESNI module overrieds the
existing "by4" CTR mode en/decryption with the "by8"
AES CTR mode en/decryption.

On a Haswell desktop, with turbo disabled and all cpus running
at maximum frequency, the "by8" CTR mode optimization
shows better performance results across data & key sizes
as measured by tcrypt.

The average performance improvement of the "by8" version over the "by4"
version is as follows:

For 128 bit key and data sizes >= 256 bytes, there is a 10-16% improvement.
For 192 bit key and data sizes >= 256 bytes, there is a 20-22% improvement.
For 256 bit key and data sizes >= 256 bytes, there is a 20-25% improvement.

A typical run of tcrypt with AES CTR mode encryption of the "by4" and "by8"
optimization shows the following results:

tcrypt with "by4" AES CTR mode encryption optimization on a Haswell Desktop:
---------------------------------------------------------------------------

testing speed of __ctr-aes-aesni encryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 343 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 336 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 491 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1130 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 7309 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 346 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 361 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 543 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 1321 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 9649 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 369 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 366 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 595 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1531 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 10522 cycles (8192 bytes)

testing speed of __ctr-aes-aesni decryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 336 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 350 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 487 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1129 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 7287 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 350 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 359 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 635 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 1324 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 9595 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 364 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 377 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 604 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1527 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 10549 cycles (8192 bytes)

tcrypt with "by8" AES CTR mode encryption optimization on a Haswell Desktop:
---------------------------------------------------------------------------

testing speed of __ctr-aes-aesni encryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 340 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 330 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 450 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1043 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 6597 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 339 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 352 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 539 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 1153 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 8458 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 353 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 360 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 512 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1277 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 8745 cycles (8192 bytes)

testing speed of __ctr-aes-aesni decryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 348 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 335 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 451 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1030 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 6611 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 354 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 346 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 488 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 1154 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 8390 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 357 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 362 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 515 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1284 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 8681 cycles (8192 bytes)

crypto: Incorporate feed back to AES CTR mode optimization patch

Specifically, the following:
a) alignment around main loop in aes_ctrby8_avx_x86_64.S
b) .rodata around data constants used in the assembely code.
c) the use of CONFIG_AVX in the glue code.
d) fix up white space.
e) informational message for "by8" AES CTR mode optimization
f) "by8" AES CTR mode optimization can be simply enabled
if the platform supports both AES and AVX features. The
optimization works superbly on Sandybridge as well.

Testing on Haswell shows no performance change since the last.

Testing on Sandybridge shows that the "by8" AES CTR mode optimization
greatly improves performance.

tcrypt log with "by4" AES CTR mode optimization on Sandybridge
--------------------------------------------------------------

testing speed of __ctr-aes-aesni encryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 383 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 408 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 707 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1864 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 12813 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 395 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 432 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 780 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 2132 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 15765 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 416 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 438 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 842 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 2383 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 16945 cycles (8192 bytes)

testing speed of __ctr-aes-aesni decryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 389 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 409 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 704 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1865 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 12783 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 409 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 434 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 792 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 2151 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 15804 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 421 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 444 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 840 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 2394 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 16928 cycles (8192 bytes)

tcrypt log with "by8" AES CTR mode optimization on Sandybridge
--------------------------------------------------------------

testing speed of __ctr-aes-aesni encryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 383 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 401 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 522 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1136 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 7046 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 394 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 418 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 559 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 1263 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 9072 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 408 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 428 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 595 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1385 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 9224 cycles (8192 bytes)

testing speed of __ctr-aes-aesni decryption
test 0 (128 bit key, 16 byte blocks): 1 operation in 390 cycles (16 bytes)
test 1 (128 bit key, 64 byte blocks): 1 operation in 402 cycles (64 bytes)
test 2 (128 bit key, 256 byte blocks): 1 operation in 530 cycles (256 bytes)
test 3 (128 bit key, 1024 byte blocks): 1 operation in 1135 cycles (1024 bytes)
test 4 (128 bit key, 8192 byte blocks): 1 operation in 7079 cycles (8192 bytes)
test 5 (192 bit key, 16 byte blocks): 1 operation in 414 cycles (16 bytes)
test 6 (192 bit key, 64 byte blocks): 1 operation in 417 cycles (64 bytes)
test 7 (192 bit key, 256 byte blocks): 1 operation in 572 cycles (256 bytes)
test 8 (192 bit key, 1024 byte blocks): 1 operation in 1312 cycles (1024 bytes)
test 9 (192 bit key, 8192 byte blocks): 1 operation in 9073 cycles (8192 bytes)
test 10 (256 bit key, 16 byte blocks): 1 operation in 415 cycles (16 bytes)
test 11 (256 bit key, 64 byte blocks): 1 operation in 454 cycles (64 bytes)
test 12 (256 bit key, 256 byte blocks): 1 operation in 598 cycles (256 bytes)
test 13 (256 bit key, 1024 byte blocks): 1 operation in 1407 cycles (1024 bytes)
test 14 (256 bit key, 8192 byte blocks): 1 operation in 9288 cycles (8192 bytes)

crypto: Fix redundant checks

a) Fix the redundant check for cpu_has_aes
b) Fix the key length check when invoking the CTR mode "by8"
encryptor/decryptor.

crypto: fix typo in AES ctr mode transform

Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com>
Reviewed-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20 21:27:58 +08:00
Jussi Kivilinna
6574e6c64e crypto: des_3des - add x86-64 assembly implementation
Patch adds x86_64 assembly implementation of Triple DES EDE cipher algorithm.
Two assembly implementations are provided. First is regular 'one-block at
time' encrypt/decrypt function. Second is 'three-blocks at time' function that
gains performance increase on out-of-order CPUs.

tcrypt test results:

Intel Core i5-4570:

des3_ede-asm vs des3_ede-generic:
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B     1.21x   1.22x   1.27x   1.36x   1.25x   1.25x
64B     1.98x   1.96x   1.23x   2.04x   2.01x   2.00x
256B    2.34x   2.37x   1.21x   2.40x   2.38x   2.39x
1024B   2.50x   2.47x   1.22x   2.51x   2.52x   2.51x
8192B   2.51x   2.53x   1.21x   2.56x   2.54x   2.55x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-06-20 21:27:58 +08:00