Inner loop can be optimized better in autcor00
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Linaro GCC | Triaged | Medium | Unassigned |
Bug Description
In the inner loop of autcor00:
.L4:
adds r2, r3, #2
ldrh ip, [r0, r3]
ldrh r7, [r5, r3]
adds r3, r3, #4
ldrh r6, [r0, r2]
ldrh r2, [r5, r2]
smulbb r7, ip, r7
smulbb r2, r6, r2
asrs r7, r7, r4
adds r7, r1, r7
asrs r6, r2, r4
cmp r3, r8
add r1, r7, r6
bne .L4
r3 is used as the loop induction variable and is incremented by 4 on each iteration; r8 holds its upper bound. However, if the loop variable were transformed to count down instead, a flag-setting subs could replace the adds/cmp pair.
That is, change
adds r3, r3, #4
.....
cmp r3, r8
bne .L4
to
subs XX, XX, #1
bne .L4
This is also related to the induction variable choices made by IVOPTS, since r3 also serves as the address offset for the loads.
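For illustration, here is a rough C sketch of the two loop shapes. It is not the autcor00 source: the function names, parameter names, and types are assumptions inferred from the assembly above (16-bit loads, a signed 16x16 multiply, an arithmetic shift, and accumulation). The first function counts up and compares against a bound, which corresponds to the adds/cmp/bne form; the second counts a remaining-iterations counter down to zero, which is the shape that lets the decrement set the flags for the branch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical count-up form: the index is compared against the bound
 * on every iteration (adds + cmp + bne in the generated code). */
int32_t autocorr_up(const int16_t *x, const int16_t *y, size_t n, int shift)
{
    int32_t acc = 0;
    for (size_t i = 0; i != n; i++) {
        /* Right-shifting a negative value is implementation-defined in C;
         * GCC performs an arithmetic shift, matching asrs. */
        acc += ((int32_t)x[i] * (int32_t)y[i]) >> shift;
    }
    return acc;
}

/* Hypothetical count-down form: the counter runs to zero, so the
 * decrement itself can set the flags (subs + bne). The data is walked
 * with pointers, so the counter is no longer needed for addressing. */
int32_t autocorr_down(const int16_t *x, const int16_t *y, size_t n, int shift)
{
    int32_t acc = 0;
    for (size_t remaining = n; remaining != 0; remaining--) {
        acc += ((int32_t)*x * (int32_t)*y) >> shift;
        x++;
        y++;
    }
    return acc;
}
```

Both functions compute the same sum; whether the compiler actually emits subs/bne for the second form still depends on which induction variables IVOPTS selects.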