[size] Replace multiple vldr by vldm
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Linaro GCC | Confirmed | Undecided | Yao Qi |
Bug Description
Test case is extracted from eembc/office/
$ arm-none-
And gcc gives code:
00000000 <interpolatePoi
0: b5f0 push {r4, r5, r6, r7, lr}
2: 460f mov r7, r1
4: ed2d 8b10 vpush {d8-d15}
8: f100 0438 add.w r4, r0, #56 ; 0x38 // <--- [1]
c: b085 sub sp, #20
e: 2600 movs r6, #0
10: e03d b.n 8e <interpolatePoi
12: e954 2302 ldrd r2, r3, [r4, #-8]
16: 2500 movs r5, #0
18: ed14 ab0e vldr d10, [r4, #-56] ; 0xffffffc8 // <-- [2]
1c: ed14 bb0c vldr d11, [r4, #-48] ; 0xffffffd0 //
20: ed14 cb0a vldr d12, [r4, #-40] ; 0xffffffd8 //
24: ed14 db08 vldr d13, [r4, #-32] ; 0xffffffe0 //
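The multiple vldr instructions use negative offsets because r4 is set to r0 + 56 by instruction [1]. If instruction [1] is replaced by 'mov r4, r0', the offsets in the vldr instructions [2] become positive (0, 8, 16, 24). In a further step, these vldr instructions can then be replaced by a single vldm.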
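The offset arithmetic above can be sketched in C. This is only an illustrative model, not GCC code: the `vldr_op` struct and the `rebase` and `vldm_candidate` helpers are hypothetical names. The sketch assumes that a run of vldr instructions can be merged when the first offset is 0, the D registers are consecutive, and the offsets advance by 8 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one vldr: destination D register number and
   byte offset from the base register. */
struct vldr_op {
    int dreg;
    int offset;
};

/* Adjust the offsets as if the base register were lowered by `delta`
   bytes, e.g. delta = 56 when `add.w r4, r0, #56` becomes `mov r4, r0`. */
static void rebase(struct vldr_op *ops, size_t n, int delta)
{
    assert(ops != NULL);
    for (size_t i = 0; i < n; i++)
        ops[i].offset += delta;
}

/* Return true when the loads could be written as a single
   `vldmia base, {dN-dM}`: the first offset is 0, registers ascend by 1,
   and offsets ascend by 8 (the size of a D register). A vldm can
   transfer at most 16 D registers. */
static bool vldm_candidate(const struct vldr_op *ops, size_t n)
{
    assert(ops != NULL);
    if (n < 2 || n > 16 || ops[0].offset != 0)
        return false;
    for (size_t i = 1; i < n; i++) {
        if (ops[i].dreg != ops[i - 1].dreg + 1)
            return false;
        if (ops[i].offset != ops[i - 1].offset + 8)
            return false;
    }
    return true;
}
```

With the four loads from the listing (d10 to d13 at offsets -56, -48, -40, -32), `vldm_candidate` rejects the sequence as is and accepts it after `rebase(ops, 4, 56)`.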
tags: | added: size task |
Changed in gcc-linaro: | |
assignee: | nobody → Yao Qi (yao-codesourcery) |
Changed in gcc-linaro: | |
status: | New → Confirmed |
My plan for this optimization is:
1) enhance arm.c:load_multiple_sequence to support float registers,
2) modify arm-ldmstm.ml to generate peephole2 patterns for float registers that call gen_ldm/stm_seq.