Comment 13 for bug 1401316

Terry Guo (terry.guo) wrote :

Hi Gary,

Merry Christmas. What you shared is really valuable to us. Please allow me to wrap up and share some of my conclusions here:
1). I recommend accessing contiguous memory addresses through a structure, which is friendlier to compiler optimization, as shown in GccConstProb1.c.
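
As a rough illustration of point 1 (this is not the contents of GccConstProb1.c; the type name periph_regs_t, its field names, and the function periph_init are made up for this sketch), adjacent registers can be described by one structure and written through a single base pointer, which gives the compiler a chance to keep the base in one register and use [base, #offset] addressing:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block: four adjacent 32-bit registers.
   The names are placeholders, not taken from any real device header. */
typedef struct {
    volatile uint32_t reg0;   /* byte offset 0  */
    volatile uint32_t reg4;   /* byte offset 4  */
    volatile uint32_t reg8;   /* byte offset 8  */
    volatile uint32_t regc;   /* byte offset 12 */
} periph_regs_t;

/* Check the layout at compile time so the offsets match the comments. */
_Static_assert(offsetof(periph_regs_t, reg8) == 8, "reg8 must be at offset 8");

/* All stores go through the same base pointer p. */
void periph_init(periph_regs_t *p)
{
    p->reg0 = 0;
    p->reg8 = 0xc000000u;
    p->regc = 1;
}
```

On the target this would be called as periph_init((periph_regs_t *)0x48000000), assuming the block really starts at that address. Whether the generated code improves depends on the compiler version and options.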

2). Sometimes better performance comes at the cost of code size, for example with function inlining and loop unrolling.

3). The compiler generates less than optimal code for the case below:
terguo01@terry-pc01:GccConstProb$ cat h.c
void __attribute__((noreturn)) Reset_Handler()
{
     *(unsigned int *)(&((unsigned char *)0x48000000)[8]) = 0xc000000;
}
terguo01@terry-pc01:GccConstProb$ arm-none-eabi-gcc -mthumb -mcpu=cortex-m4 -O2 h.c -S
terguo01@terry-pc01:GccConstProb$ cat h.s

Reset_Handler:
 @ Volatile: function does not return.
 @ args = 0, pretend = 0, frame = 0
 @ frame_needed = 0, uses_anonymous_args = 0
 @ link register save eliminated.
 ldr r3, .L2
 mov r2, #201326592
 str r2, [r3]
 bx lr
.L3:
 .align 2
.L2:
 .word 1207959560
 .size Reset_Handler, .-Reset_Handler
 .ident "GCC: (GNU Tools for ARM Embedded Processors) 4.9.3 20141119 (release) [ARM/embedded-4_9-branch revision 218278]"

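We could take advantage of the immediate offset in the str instruction to avoid using the literal pool: 0x48000000 can be encoded as a mov immediate, so the store could be written as str r2, [r3, #8] instead of loading the full address 0x48000008 from the literal pool.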

4). With -Os, the code below could be improved to get a smaller code size:
arm-none-eabi-gcc -mthumb -mcpu=cortex-m4 -Os GccConstProb1.c -c
arm-none-eabi-objdump -d GccConstProb1.o

  a4: f8c2 340c str.w r3, [r2, #1036] ; 0x40c
  a8: f8c2 3420 str.w r3, [r2, #1056] ; 0x420
  ac: f8c2 3424 str.w r3, [r2, #1060] ; 0x424
  b0: f8c2 1418 str.w r1, [r2, #1048] ; 0x418

All four instructions are 32 bits long, so the total code size is 16 bytes. It would be smaller if a base register were set up once so that each store fits the 16-bit str encoding, like below:

mov.w r2, #1000
str r3, [r2, #36]
str r3, [r2, #56]
str r3, [r2, #60]
str r3, [r2, #48]

The total code size will be 12 bytes.
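
The arithmetic behind those two totals can be checked with a small sketch. It assumes the usual Thumb rule that a 16-bit "str Rt, [Rn, #imm]" needs a word-aligned offset from 0 to 124 (and low registers r0-r7, which is not checked here), and that every other store takes a 32-bit encoding. The helper names fits_str16 and stores_size are made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if a word store with this byte offset fits the 16-bit Thumb
   encoding: the offset must be a multiple of 4 in the range 0..124. */
bool fits_str16(unsigned offset)
{
    return (offset % 4u == 0u) && (offset <= 124u);
}

/* Bytes taken by the stores alone after subtracting `rebase` from each
   offset. Assumes every offset is >= rebase. A store that does not fit
   the 16-bit form is counted as a 32-bit instruction. */
unsigned stores_size(const unsigned *offsets, size_t n, unsigned rebase)
{
    unsigned bytes = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned off = offsets[i] - rebase;
        bytes += fits_str16(off) ? 2u : 4u;
    }
    return bytes;
}
```

With the offsets 1036, 1056, 1060 and 1048 from the objdump output, stores_size(offsets, 4, 0) gives 16. With a rebase of 1000 the offsets become 36, 56, 60 and 48, the stores take 8 bytes, and adding one 32-bit instruction to set up the base gives 12.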

These cases are easy to understand, and it would be easy to come up with a hot fix for each of them. But when we consider the overall picture of the compiler, such hot fixes will have side effects. Some of my colleagues are working on another compiler optimization task which is supposed to cover cases like the ones above. Let us wait and see.