Continuing to debug the broken build, here are the first few instructions of sha256_block_data_order:
0x004160c0 <+0>: subw r3, pc, #3
0x004160c4 <+4>: ldr.w r12, [pc, #-36] ; 0x4160a4
0x004160c8 <+8>: ldr.w r12, [r3, r12]
This looks similar to before. The intention is that %r3 is loaded with the address of the start of the function, %r12 is loaded with the offset of OPENSSL_armcap_P from the start of the function, and then %r12 is loaded with the value of OPENSSL_armcap_P.
However, the first instruction is broken: the #3 should be a #4, because in Thumb state pc reads as the address of the current instruction plus 4. Stepping past it produces this state:
(gdb) info registers
r0 0x4b7558 4945240
r1 0x4b7580 4945280
r2 0x1 1
r3 0x4160c1 4284609
r4 0x4b7558 4945240
r5 0xfffef44c 4294898764
r6 0x0 0
r7 0x4b7580 4945280
r8 0x3 3
r9 0xfffef44c 4294898764
r10 0x0 0
r11 0x0 0
r12 0x0 0
sp 0xfffef3b8 0xfffef3b8
lr 0x414387 4277127
pc 0x4160c4 0x4160c4 <sha256_block_data_order+4>
cpsr 0x40080030 1074266160
(gdb) p sha256_block_data_order
$1 = {<text variable, no debug info>} 0x4160c0 <sha256_block_data_order>
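The address arithmetic behind that register dump can be checked by hand. The following is a minimal sketch, assuming the usual Thumb rule that pc reads as the instruction's address plus 4 (word-aligned for pc-relative forms); the helper names thumb_pc and align4 are made up for illustration and are not part of any tool:

```python
def thumb_pc(instr_addr):
    """Value of pc as seen by a Thumb instruction: its own address plus 4."""
    return instr_addr + 4

def align4(value):
    """Round down to a word boundary, as pc-relative Thumb forms do."""
    return value & ~3

FUNC_START = 0x004160C0  # address of sha256_block_data_order from the gdb output

# subw r3, pc, #imm at <+0>
r3_broken = align4(thumb_pc(FUNC_START)) - 3  # what the broken build computes
r3_fixed = align4(thumb_pc(FUNC_START)) - 4   # what it should compute

# ldr.w r12, [pc, #-36] at <+4>
literal_addr = align4(thumb_pc(FUNC_START + 4)) - 36

print(hex(r3_broken))     # 0x4160c1, matches r3 in the dump
print(hex(r3_fixed))      # 0x4160c0, the real start of the function
print(hex(literal_addr))  # 0x4160a4, matches the disassembler's annotation
```

With #3 the result lands one byte past the function start, which is exactly the 0x4160c1 that gdb shows in %r3.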
Now %r3 holds the address of the start of the function plus one (0x4160c1 instead of 0x4160c0). Continuing to step through shows that we load the wrong value into %r12, and this causes the non-NEON code path to be executed, which somehow doesn't work.
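To illustrate how an address that is off by one byte can turn the expected capability value into something with the NEON bit clear, here is a small sketch. It is only one plausible explanation: the memory layout (a word with low byte 0x01 directly after OPENSSL_armcap_P), the address 0x1000 and the ldr helper are all assumptions made up for the example, and it assumes the CPU performs the unaligned little-endian load rather than faulting. ARMV7_NEON being bit 0 follows OpenSSL's arm_arch.h.

```python
import struct

ARMV7_NEON = 1 << 0  # NEON capability bit, as defined in OpenSSL's arm_arch.h

def ldr(memory, base, addr):
    """Little-endian 32-bit load from a bytes buffer that starts at `base`."""
    return struct.unpack_from("<I", memory, addr - base)[0]

# Hypothetical layout: OPENSSL_armcap_P holding 3, then a neighbouring word
# whose low byte is 0x01. Both the address and the neighbour are assumptions.
ARMCAP_ADDR = 0x1000
memory = struct.pack("<II", 3, 0x00000001)

good = ldr(memory, ARMCAP_ADDR, ARMCAP_ADDR)      # aligned load
bad = ldr(memory, ARMCAP_ADDR, ARMCAP_ADDR + 1)   # base address off by one

print(hex(good), bool(good & ARMV7_NEON))  # 0x3 True
print(hex(bad), bool(bad & ARMV7_NEON))    # 0x1000000 False
```

Under these assumptions the misaligned load yields 0x1000000, the same shape as the bad %r12 value seen in the debugger, and the NEON test on bit 0 fails.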
If I step past these first three instructions and then write the expected value of OPENSSL_armcap_P into %r12, the test completes successfully:
(gdb) info registers
r0 0x4b7558 4945240
r1 0x4b7580 4945280
r2 0x1 1
r3 0x4160c1 4284609
r4 0x4b7558 4945240
r5 0xfffef44c 4294898764
r6 0x0 0
r7 0x4b7580 4945280
r8 0x3 3
r9 0xfffef44c 4294898764
r10 0x0 0
r11 0x0 0
r12 0x1000000 16777216
sp 0xfffef3b8 0xfffef3b8
lr 0x414387 4277127
pc 0x4160cc 0x4160cc <sha256_block_data_order+12>
cpsr 0x40080030 1074266160
(gdb) set $r12 = 3
(gdb) info registers
r0 0x4b7558 4945240
r1 0x4b7580 4945280
r2 0x1 1
r3 0x4160c1 4284609
r4 0x4b7558 4945240
r5 0xfffef44c 4294898764
r6 0x0 0
r7 0x4b7580 4945280
r8 0x3 3
r9 0xfffef44c 4294898764
r10 0x0 0
r11 0x0 0
r12 0x3 3
sp 0xfffef3b8 0xfffef3b8
lr 0x414387 4277127
pc 0x4160cc 0x4160cc <sha256_block_data_order+12>
cpsr 0x40080030 1074266160
(gdb) cont
Continuing.
Testing SHA-256 .
Breakpoint 1, EVP_Digest (data=0x47f38c, count=56, md=0xfffef44c "\272x\026\277\217\001\317\352AA@\336]\256\"#\260\003a\243\226\027z\234\264\020\377a\362", size=0x0, type=0x4aff00 <sha256_md>, impl=0x0)
at digest.c:353
353 digest.c: No such file or directory.
(gdb)
Yay!