[thumb2,size] Push/pop low register rather than high register when keeping stack alignment

Bug #633233 reported by Yao Qi
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Linaro GCC
Won't Fix
Undecided
Unassigned
4.5
Won't Fix
Undecided
Unassigned
gcc
Fix Released
Medium

Bug Description

As required by ARM EABI, stack should be 8byte aligned. In order to keep such alignment, GCC will push/pop register on stack.

In bash-3.2/bashline.o:history_expand_line_internal
Dump of assembler code for function history_expand_line_internal:
   0x00001c1c <+0>: stmdb sp!, {r4, r5, r6, r7, r8, lr} // <-- [1]
....
   0x00001c4a <+46>: ldmia.w sp!, {r4, r5, r6, r7, r8, lr} // <-- [2]
   0x00001c4e <+50>: b.w 0x1c4e <history_expand_line_internal+50>
   0x00001c52 <+54>: ldmia.w sp!, {r4, r5, r6, r7, r8, pc} // <-- [3]

Possible improvement is,
[1] can be changed to push {r3, r4, r5, r6, r7, lr}
[2] can be changed to ldmia.w sp!, {r3, r4, r5, r6, r7, lr}
[3] can be changed to pop {r3, r4, r5, r6, r7, pc}

Tags: size task
Revision history for this message
Richard Earnshaw (richard-earnshaw) wrote :

This was implemented (supposedly) in gcc-4.5 with

2009-06-02 Richard Earnshaw <email address hidden>

        * arm.c (arm_get_frame_offsets): Prefer using r3 for padding a
        push/pop multiple to 8-byte alignment.

Revision history for this message
Yao Qi (yao-codesourcery) wrote :

Richard,
Thanks for letting us know this patch. However, the problems exists in GCC FSF trunk ( 4.6.0 20100902). Test case is extracted from bash-3.2 and attached here.

$ ./fsf-mainline/install/bin/arm-none-linux-gnueabi-gcc --version
arm-none-linux-gnueabi-gcc (GCC) 4.6.0 20100902 (experimental)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ ./fsf-mainline/install/bin/arm-none-linux-gnueabi-gcc -Os -mthumb -march=armv7-a bashline.c -S

We can see that r8 is still used,

history_expand_line_internal:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        push {r4, r5, r6, r7, r8, lr}
        movs r1, #0
        ldr r4, .L3
        mov r2, r1
        mov r5, r0
        ldr r6, [r4, #0]
        str r1, [r4, #0]
        bl pre_process_line
        str r6, [r4, #0]
        cmp r0, r5
        mov r7, r0
        bne .L2
        bl strlen
        adds r0, r0, #1
        bl xmalloc
        mov r1, r7
        pop {r4, r5, r6, r7, r8, lr}
        b strcpy
.L2:
        pop {r4, r5, r6, r7, r8, pc}
.L4:
        .align 2

Revision history for this message
Yao Qi (yao-codesourcery) wrote :

Compile the test case with GCC 4.5.0 and Linaro GCC 4.5 respectively, r3 is used indeed in a correct way. It should be a regression here.

--- bashline.trunk.s 2010-09-16 16:10:41.000000000 +0800
+++ bashline.4.5.s 2010-09-16 16:09:19.000000000 +0800
@@ -20,11 +20,11 @@
 history_expand_line_internal:
  @ args = 0, pretend = 0, frame = 0
  @ frame_needed = 0, uses_anonymous_args = 0
- push {r4, r5, r6, r7, r8, lr}
- movs r1, #0
+ push {r3, r4, r5, r6, r7, lr}
  ldr r4, .L3
- mov r2, r1
+ movs r1, #0
  mov r5, r0
+ mov r2, r1
  ldr r6, [r4, #0]
  str r1, [r4, #0]
  bl pre_process_line

Revision history for this message
In , Qiyao (qiyao) wrote :

This was implemented in gcc-4.5, still works in gcc-4.5 branch.
2009-06-02 Richard Earnshaw <email address hidden>

        * arm.c (arm_get_frame_offsets): Prefer using r3 for padding a
        push/pop multiple to 8-byte alignment.
However, it doesn't work any longer on gcc-4.6 trunk. Reproduced on the following steps,

$ arm-none-linux-gnueabi-gcc -Os -mthumb -march=armv7-a bashline.c -S

On gcc-4.5 branch, r3 is used to keep stack alignment, which is expected.
history_expand_line_internal:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        push {r3, r4, r5, r6, r7, lr}

However, on gcc-4.6 trunk, we can see that, r8 is used to keep stack alignment, which is *not* expected.
history_expand_line_internal:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        push {r4, r5, r6, r7, r8, lr}

Revision history for this message
In , Qiyao (qiyao) wrote :

Created attachment 21816
Test case

Revision history for this message
In , Qiyao (qiyao) wrote :

In arm.c:arm_get_frame_offsets (void)

   if (!crtl->tail_call_emit
       && arm_size_return_regs () <= 12
       && (offsets->saved_regs_mask & (1 << 3)) == 0)
     {
       reg = 3;
     }

In gcc-4.5, crtl->tail_call_emit is always false, so the condition is true. However, in gcc-4.6, crtl->tail_call_emit can be true. My stupid question is 'why can't we use r3 when we do tail call optimization?'. I read one sentence in comment, 'We try to avoid using the arg registers (r0 -r3) as they might be used to pass values in a tail call.'. Is it the answer to my question?

If that is the answer, in oder to fix this bug, we may refine the condition to 'tail call insns are emitted, and r3 is used to pass values to tail call', like this,

  if (!(crtl->tail_call_emit && used_to_pass_values (3))
       && ...
       && ...)

Is this a correct fix?

Revision history for this message
In , Qiyao (qiyao) wrote :

Patch is submitted to gcc-patches for review
http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01741.html

Michael Hope (michaelh1)
tags: added: size task
Changed in gcc-linaro:
status: New → In Progress
assignee: nobody → Yao Qi (yao-codesourcery)
Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Created attachment 22835
gcc46-pr45701.patch

Remembering pointers to tail call insns is both wasteful (allocates stuff that is never used anywhere but on arm), error-prone (nothing adjusts the list if
tail calls are deleted, or duplicated, etc.) and unnecessary, because
the list is trivially available from the IL.

Here is completely untested patch (tested just on the 3 testcases with a cross).
Could anyone interested in ARM target please bootstrap/regtest this?
We have way too many open P1 bugs.

Revision history for this message
In , Ramana-gcc (ramana-gcc) wrote :

Ok I'll test this. (In reply to comment #4)
> Created attachment 22835 [details]
> gcc46-pr45701.patch
>
> Remembering pointers to tail call insns is both wasteful (allocates stuff that
> is never used anywhere but on arm), error-prone (nothing adjusts the list if
> tail calls are deleted, or duplicated, etc.) and unnecessary, because
> the list is trivially available from the IL.
>
> Here is completely untested patch (tested just on the 3 testcases with a
> cross).
> Could anyone interested in ARM target please bootstrap/regtest this?
> We have way too many open P1 bugs.

I'll test this.

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

(In reply to comment #5)
> I'll test this.

Have you managed to test it already? Thanks.

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

On IRC you've mentioned some TLS test issues, were they caused by this patch or unrelated?

Revision history for this message
In , Ramana Radhakrishnan (ramana) wrote :

On Sat, Jan 15, 2011 at 1:36 PM, jakub at gcc dot gnu.org
<email address hidden> wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45701
>
> --- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-01-15 13:35:46 UTC ---
> On IRC you've mentioned some TLS test issues, were they caused by this patch or
> unrelated?

Been travelling this week. Should be able to get answers once I can
access my board next week.

cheers
Ramana

>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug.
>

Revision history for this message
In , Ramana-gcc (ramana-gcc) wrote :

(In reply to comment #7)
> On IRC you've mentioned some TLS test issues, were they caused by this patch or
> unrelated?

Sorry about the delay in responding. With trunk I see no new regressions with my testing configurations . The tls failures were unrelated to this patch.

Ok to submit

Ramana

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Author: jakub
Date: Tue Jan 25 16:22:34 2011
New Revision: 169240

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169240
Log:
 PR target/45701
 * config/arm/arm.c (any_sibcall_uses_r3): New function.
 (arm_get_frame_offsets): Use it.

2011-01-25 Yao Qi <email address hidden>

        PR target/45701
        * gcc.target/arm/pr45701-1.c: New test.
        * gcc.target/arm/pr45701-2.c: New test.
 * gcc.target/arm/pr45701-3.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/arm/pr45701-1.c
    trunk/gcc/testsuite/gcc.target/arm/pr45701-2.c
    trunk/gcc/testsuite/gcc.target/arm/pr45701-3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/arm/arm.c
    trunk/gcc/testsuite/ChangeLog

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Fixed.

Changed in gcc:
importance: Unknown → Medium
status: Unknown → Fix Released
Revision history for this message
In , Dnovillo (dnovillo) wrote :

Author: dnovillo
Date: Wed Feb 2 17:46:50 2011
New Revision: 169583

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169583
Log:
 PR target/45701
 * config/arm/arm.c (any_sibcall_uses_r3): New function.
 (arm_get_frame_offsets): Use it.

2011-01-25 Yao Qi <email address hidden>

        PR target/45701
        * gcc.target/arm/pr45701-1.c: New test.
        * gcc.target/arm/pr45701-2.c: New test.
 * gcc.target/arm/pr45701-3.c: New test.

Added:
    branches/google/integration/gcc/testsuite/gcc.target/arm/pr45701-1.c
    branches/google/integration/gcc/testsuite/gcc.target/arm/pr45701-2.c
    branches/google/integration/gcc/testsuite/gcc.target/arm/pr45701-3.c
Modified:
    branches/google/integration/gcc/ChangeLog
    branches/google/integration/gcc/config/arm/arm.c
    branches/google/integration/gcc/testsuite/ChangeLog

Revision history for this message
Ramana Radhakrishnan (ramana) wrote :

Does this need a backport to the linaro 4.5 branch ? I think this should appear in the linaro 4.6 branch and in 4.6.0 upstream.

Ramana

Revision history for this message
Michael Hope (michaelh1) wrote :

If practical, backport to 4.5.

Changed in gcc-linaro:
assignee: Yao Qi (yao-codesourcery) → nobody
status: In Progress → Won't Fix
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.