Optimize load 0.0 for NEON seems not to work

Bug #667490 reported by Khem Raj
This bug affects 1 person
Affects    | Status     | Importance | Assigned to | Milestone
Linaro GCC | Incomplete | Medium     | Unassigned  |

Bug Description

With the latest GCC, the patch here

http://bazaar.launchpad.net/~linaro-toolchain-dev/gcc-linaro/4.5/revision/99350

is causing cairo to render incorrectly. I have not dug into cairo, but this patch seems to be responsible:
if I revert it, everything works. I also see that the test case this patch adds,

gcc/testsuite/gcc.target/arm/neon-load-df0.c

does not pass, so the optimization is either not triggering or has regressed.

This is the output I get when compiling with
-mfpu=neon -mfloat-abi=softfp -march=armv7-a -S -O3
It stays the same with -mfloat-abi=hard too.

 .arch armv7-a
 .eabi_attribute 27, 3
 .fpu neon
 .eabi_attribute 20, 1
 .eabi_attribute 21, 1
 .eabi_attribute 23, 3
 .eabi_attribute 24, 1
 .eabi_attribute 25, 1
 .eabi_attribute 26, 2
 .eabi_attribute 30, 2
 .eabi_attribute 18, 4
 .file "test.c"
 .text
 .align 2
 .p2align 4,,15
 .global bar
 .type bar, %function
bar:
 @ args = 0, pretend = 0, frame = 0
 @ frame_needed = 0, uses_anonymous_args = 0
 @ link register save eliminated.
 movw r1, #:lower16:x
 mov r2, #0
 movt r1, #:upper16:x
 mov r3, #0
 strd r2, [r1]
 bx lr

As can be seen, it does not generate vmov.i32 d16, #0 as expected.

Michael Hope (michaelh1)
Changed in gcc-linaro:
status: New → Confirmed
status: Confirmed → New
importance: Undecided → Medium
Michael Hope (michaelh1) wrote :

Hi there. Could you tell me a bit more about the problem? What version of cairo? How is the rendering broken? How can I reproduce it?

Julian Brown (julian-codesourcery) wrote :

I can't find a bug here, apart from perhaps that the test case (gcc/testsuite/gcc.target/arm/neon-load-df0.c) is not written very robustly. The compiler is making a sane choice when compiling that test: there's no particular need to use a NEON register for the operation in question, whether or not the hard-float ABI is in use. An alternative test case, e.g. simply:

double bar ()
{
  return 0.0;
}

compiled with -mfloat-abi=hard, reveals that the load-double-zero patch does work correctly (in a case where using a NEON register is definitely beneficial).
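
To compare the two test shapes side by side, here is a minimal sketch. The global x and the function name store_zero are assumptions inferred from the assembly in the bug description (a store of zero to the symbol x); the actual contents of neon-load-df0.c are not quoted in this report. return_zero is the alternative test case above under a different, illustrative name.

```c
/* Sketch only: compile with -S and the flags from the report to compare
   the assembly generated for each function. */

double x;  /* assumed global, named after the symbol in the posted assembly */

/* Store shape: the zero only has to reach memory, so core registers
   (mov/strd) are a reasonable choice and no NEON register is needed. */
void store_zero(void)
{
  x = 0.0;
}

/* Return shape: under -mfloat-abi=hard a double is returned in d0,
   so the zero has to be materialized in a VFP/NEON register. */
double return_zero(void)
{
  return 0.0;
}
```

Whether the second function actually yields a vmov.i32 depends on the compiler revision and flags, so check the -S output rather than assuming it.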

Julian Brown (julian-codesourcery) wrote :

The Cairo problem must be something more subtle: we'd need a proper test case to figure that one out, as Michael hints at.

Khem Raj (khem-raj) wrote : Re: [Bug 667490] Re: Optimize load 0.0 for NEON seems not to work

On Tue, Nov 9, 2010 at 11:08 AM, Julian Brown <email address hidden> wrote:
> I can't find a bug here, apart from perhaps that the test case
> (gcc/testsuite/gcc.target/arm/neon-load-df0.c) is not written very
> robustly. The compiler is making a sane choice when compiling that test:
> there's no particular need to use a NEON register for the operation in
> question, whether or not the hard-float ABI is in use. An alternative
> test case, e.g. simply:
>
> double bar ()
> {
>  return 0.0;
> }
>
> compiled with -mfloat-abi=hard, reveals that the load-double-zero patch
> does work correctly (in a case where using a NEON register is definitely
> beneficial).
>

How about -mfloat-abi=softfp? That's where we see the problem.
It will take some time to pin down the problem in cairo. Let me try to
gather details and see if we can put something together.

Julian Brown (julian-codesourcery) wrote :

I didn't mean to imply the test written in my previous comment was _broken_ with -mfloat-abi=softfp, but it does compile to (in Thumb mode):

        movs r0, #0
        movs r1, #0
        bx lr

which again is entirely sensible: for the soft-float ABI, the return value must be in core registers, so it'd be silly to load the 0.0 value into an FP register only to immediately transfer it to core registers.
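
Two core-register moves are enough here because the IEEE 754 encoding of +0.0 is 64 zero bits, so each 32-bit half of the return value is simply integer zero. The sketch below checks that on the host. The helper names are illustrative and not taken from GCC or the patch, and the code assumes a 64-bit IEEE 754 double.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a double (assumes 64-bit IEEE 754 double). */
uint64_t double_bits(double d)
{
  uint64_t bits;
  memcpy(&bits, &d, sizeof bits);  /* well-defined way to reinterpret the bytes */
  return bits;
}

/* Lower and upper 32 bits of the encoding, i.e. the two words that a
   soft-float return passes in a pair of core registers. */
uint32_t low_word(double d)  { return (uint32_t)double_bits(d); }
uint32_t high_word(double d) { return (uint32_t)(double_bits(d) >> 32); }
```

For 0.0 both words are zero, matching the two movs instructions above. For -0.0 the high word has the sign bit set, so that constant could not be built from two zero moves.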

Michael Hope (michaelh1) wrote :

Hi Khem. Any luck on reproducing the problem? Could you tell me what version of Cairo this is against, and if there's a test case that shows it?

Changed in gcc-linaro:
status: New → Incomplete
Ramana Radhakrishnan (ramana) wrote :