Missed optimizations (loop and otherwise)

Reported by Yao Qi on 2010-10-14
This bug affects 1 person
Affects: Linaro GCC
Importance: Medium
Assigned to: Unassigned

Bug Description

Missed iterator promotion opportunity

// test.c
void
foo (signed short n, unsigned char *a)
{
  signed short i;
  for (i = n; i > 0; i--)
    a[i] = a[i - 1];
}

Compile the test case with FSF mainline:
$ arm-none-linux-gnueabi-gcc --version
arm-none-linux-gnueabi-gcc (GCC) 4.6.0 20101013 (experimental)
$ arm-none-linux-gnueabi-gcc -mcpu=cortex-a8 -mthumb -O2 -funroll-loops --param max-unroll-times=2 -c test.c

Disassembly shows:

00000000 <foo>:
   0: 2800 cmp r0, #0
   2: b410 push {r4}
   4: dd1e ble.n 44 <foo+0x44> // <---[1]
   6: 1809 adds r1, r1, r0
   8: 3801 subs r0, #1
   a: 460b mov r3, r1
   c: f000 0401 and.w r4, r0, #1
  10: f811 2c01 ldrb.w r2, [r1, #-1]
  14: b280 uxth r0, r0 // <-- [2]
  16: f803 2901 strb.w r2, [r3], #-1
  1a: b198 cbz r0, 44 <foo+0x44>
  1c: b134 cbz r4, 2c <foo+0x2c>
  1e: f813 1c01 ldrb.w r1, [r3, #-1]
  22: 1e42 subs r2, r0, #1
  24: b290 uxth r0, r2
  26: f803 1901 strb.w r1, [r3], #-1
  2a: b158 cbz r0, 44 <foo+0x44>
  2c: f813 2c01 ldrb.w r2, [r3, #-1]
  30: 3802 subs r0, #2
  32: b280 uxth r0, r0
  34: f803 2901 strb.w r2, [r3], #-1
  38: f813 cc01 ldrb.w ip, [r3, #-1]
  3c: f803 c901 strb.w ip, [r3], #-1
  40: 2800 cmp r0, #0
  42: d1f3 bne.n 2c <foo+0x2c>
  44: bc10 pop {r4}
  46: 4770 bx lr

There are two possible optimizations:
1. Move instruction [1] before "push {r4}" and change its target address to foo+0x46, so that the early-exit path skips the push/pop pair.
2. Instruction [2] is redundant. After the branch at [1], r0 is known to be >= 1 (and at most 32767, since n is a signed short), so after "subs r0, #1" it is >= 0 and still fits in 16 bits. The uxth is effectively a nop.
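
The range argument in item 2 can be checked exhaustively. The sketch below is not part of the original report: it models uxth as a mask with 0xFFFF and assumes r0 holds n sign-extended to 32 bits on entry, which the generated "cmp r0, #0" appears to rely on. The names uxth and count_uxth_changes are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Models `uxth r0, r0`: keep the low 16 bits and zero the upper 16. */
static uint32_t uxth(uint32_t r)
{
    return r & 0xFFFFu;
}

/* Counts the values of n that pass the entry guard (n > 0) for which
 * uxth would change r0 after `subs r0, #1`.
 * Assumption: r0 holds n sign-extended to 32 bits on entry. */
int count_uxth_changes(void)
{
    int changed = 0;
    for (int32_t n = 1; n <= INT16_MAX; n++) {
        uint32_t r0 = (uint32_t)(n - 1);   /* subs r0, #1 */
        if (uxth(r0) != r0)                /* instruction [2] */
            changed++;
    }
    return changed;
}
```

Calling count_uxth_changes() from a small main should report that no value is changed, which is what makes the first uxth removable. It says nothing about the later uxth instructions inside the unrolled loop.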

[Codesourcery #6168]

Ramana Radhakrishnan (ramana) wrote :

Opportunity 1 above is shrink wrapping, which has been implemented in the Linaro 4.5 tree thanks to Bernd.

Ramana

Ulrich Weigand (uweigand) wrote :

Can be fixed by IVOPTs making better choices. Currently IVOPTs is unaware that performing a comparison in HImode is (in general) less efficient than performing a comparison in SImode.
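
For illustration only, this is the source-level equivalent of the promotion being discussed: the same loop with the counter widened from signed short (HImode) to int (SImode). The names foo_promoted and promoted_matches are hypothetical and not from the report, and no claim is made here about what code GCC actually emits for the second form. The widening is safe because i only ever takes values between 0 and n, so it cannot wrap in either type.

```c
#include <string.h>

/* Original test case from the report: 16-bit loop counter. */
void foo(signed short n, unsigned char *a)
{
    signed short i;
    for (i = n; i > 0; i--)
        a[i] = a[i - 1];
}

/* Hypothetical hand-promoted variant: the counter is a full-width int,
 * so the loop-exit comparison can be done in SImode without needing
 * a 16-bit extension on each iteration. */
void foo_promoted(signed short n, unsigned char *a)
{
    int i;
    for (i = n; i > 0; i--)
        a[i] = a[i - 1];
}

/* Runs both variants on identical buffers and returns 1 if the results
 * match. Only valid for n < 64 (the size of the scratch buffers). */
int promoted_matches(signed short n)
{
    unsigned char x[64], y[64];
    if (n >= 64)
        return 0;
    for (int k = 0; k < 64; k++)
        x[k] = y[k] = (unsigned char)(k * 7 + 1);
    foo(n, x);
    foo_promoted(n, y);
    return memcmp(x, y, sizeof x) == 0;
}
```

Comparing the -O2 disassembly of foo and foo_promoted with the command line from the report would be one way to see whether the extra uxth instructions come from the HImode counter.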

Changed in gcc-linaro:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Ulrich Weigand (uweigand)
Changed in gcc-linaro:
assignee: Ulrich Weigand (uweigand) → nobody