ABI compliance with multi-register NEON intrinsics
Bug #952565 reported by Michael Hope
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Linaro GCC | Fix Released | High | Ramana Radhakrishnan |
Bug Description
When a NEON intrinsic type that spans two or more registers is passed as an argument under the hard-float calling convention, GCC puts the argument in core registers instead of NEON registers.
For example, this test:
#include <arm_neon.h>

void foo(uint32x4x2_t v);

void bar()
{
    uint32x4x2_t v = { 0, };
    foo(v);
}
calls foo with v in r0-r3 and the stack. v should be in q0-q1.
This is rare, as it applies only to hard-float NEON installations where multi-register NEON intrinsic types are passed across an API.
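The register counts in the description above can be checked with a small portable model. This is only a sketch, not GCC's implementation: the `struct hva` type and the three helper functions are hypothetical names introduced for illustration. It assumes the aggregate is the first argument, so that no VFP or core registers are already allocated, and it models only the rule that a homogeneous aggregate of one to four 64-bit or 128-bit vectors is a candidate for VFP/NEON registers under the hard-float procedure call standard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a homogeneous aggregate of NEON vectors.
 * uint32x4x2_t would be { 16, 2 }: two 128-bit (16-byte) vectors. */
struct hva {
    size_t member_bytes; /* 8 for a D-sized vector, 16 for a Q-sized vector */
    size_t members;      /* number of identical vectors in the aggregate */
};

/* An aggregate of one to four identical 64-bit or 128-bit vectors is a
 * candidate for VFP/NEON registers (assumes enough registers are free). */
bool hva_is_vfp_candidate(struct hva a)
{
    return (a.member_bytes == 8 || a.member_bytes == 16)
        && a.members >= 1 && a.members <= 4;
}

/* Number of 64-bit D registers the aggregate occupies; each Q register
 * aliases two consecutive D registers. */
size_t hva_d_regs(struct hva a)
{
    assert(hva_is_vfp_candidate(a));
    return (a.member_bytes / 8) * a.members;
}

/* Bytes left for the stack if the aggregate is passed in core registers
 * instead: r0-r3 hold 4 * 4 = 16 bytes in total. */
size_t core_spill_bytes(struct hva a)
{
    size_t total = a.member_bytes * a.members;
    return total > 16 ? total - 16 : 0;
}
```

For `{ 16, 2 }` the model gives four D registers, that is d0-d3 or equivalently q0-q1, whereas passing the same 32 bytes in core registers fills r0-r3 and leaves 16 bytes for the stack, which is the behaviour reported here.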
Related branches
lp:~ramana/gcc-linaro/46-abi-fix-backport
- Ulrich Weigand (community): Approve
Changed in gcc-linaro: status: Triaged → Fix Committed
Changed in gcc-linaro: status: Fix Committed → Fix Released
Confirmed fixed. tip generates:
push {r3, lr}
movw r3, #:lower16:.LC0
movt r3, #:upper16:.LC0
vldmia r3, {d0-d3}
bl foo
pop {r3, pc}
Note that r3 is pushed only to keep the stack eight-byte aligned when saving lr.
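The alignment remark comes down to simple arithmetic, sketched below. The helper names are hypothetical; the only assumptions are that each core register is 4 bytes wide and that the stack pointer must stay a multiple of 8 bytes at the call to foo.

```c
#include <stdbool.h>
#include <stddef.h>

enum {
    CORE_REG_BYTES = 4,    /* width of one ARM core register */
    STACK_ALIGN_BYTES = 8  /* required stack alignment at a call */
};

/* Bytes by which a push of reg_count core registers moves the stack pointer. */
size_t push_bytes(size_t reg_count)
{
    return reg_count * CORE_REG_BYTES;
}

/* True if the push leaves an already aligned stack pointer aligned. */
bool push_keeps_alignment(size_t reg_count)
{
    return push_bytes(reg_count) % STACK_ALIGN_BYTES == 0;
}
```

Pushing lr alone moves the stack pointer by 4 bytes and breaks the alignment, while pushing r3 together with lr moves it by 8 bytes and preserves it.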