Comment 9 for bug 1401316

Terry Guo (terry.guo) wrote :

For the original code mentioned in comment #8, the smaller code size comes from GccConstProb0.c, which uses inline assembly. The code generated at -O2 is as follows:

  18: f44f 41ee mov.w r1, #30464 ; 0x7700
  1c: 6211 str r1, [r2, #32]
  1e: 4928 ldr r1, [pc, #160] ; (c0 <Reset_Handler+0xc0>)
  20: 6251 str r1, [r2, #36] ; 0x24

Because you are using inline assembly, the compiler cannot schedule those instructions. You can see a data dependence on register r1: each str has to wait for the mov or ldr immediately before it, and r1 is then reused for the next constant. When you actually run this code pattern, the performance is not optimal. However, since only low registers are used, most of the instructions get 16-bit encodings, so the code size is indeed smaller.

The larger code size comes from GccConstProb1.c, which uses C code. At -O2, the compiler performs instruction scheduling to avoid such read/write data dependences as far as possible. The generated code looks like this:

   2: 4b14 ldr r3, [pc, #80] ; (54 <Reset_Handler+0x54>)
   4: f8df 905c ldr.w r9, [pc, #92] ; 64 <Reset_Handler+0x64>
   8: f8df e05c ldr.w lr, [pc, #92] ; 68 <Reset_Handler+0x68>
   c: 4e12 ldr r6, [pc, #72] ; (58 <Reset_Handler+0x58>)
   e: 4d13 ldr r5, [pc, #76] ; (5c <Reset_Handler+0x5c>)
  10: 4813 ldr r0, [pc, #76] ; (60 <Reset_Handler+0x60>)
  ......................................................
  2c: f8c2 a008 str.w sl, [r2, #8]
  30: 6051 str r1, [r2, #4]
  32: f8c2 9000 str.w r9, [r2]
  36: f8c2 800c str.w r8, [r2, #12]
  3a: f8c2 c020 str.w ip, [r2, #32]
  3e: f8c2 e024 str.w lr, [r2, #36] ; 0x24

You can see that the instructions are reordered to avoid data dependences, so the performance will be better. The side effect is that register live ranges are extended and more registers are consumed, including the high registers. Instructions that use high registers need the 32-bit encodings (ldr.w, str.w), so the code size increases.
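
For illustration only, C code of the kind that leads to this pattern might look like the sketch below. The real GccConstProb1.c is not shown in this comment, so the struct, the field names, the function name and most of the constants are hypothetical. Only the store offsets (0, 4, 8, 12, 32, 36) and the 0x7700 value are taken from the listings above.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical register block. Only the offsets (0, 4, 8, 12, 32, 36)
   come from the listings; all names are made up for this sketch. */
typedef struct {
    volatile uint32_t r00;          /* offset 0  */
    volatile uint32_t r04;          /* offset 4  */
    volatile uint32_t r08;          /* offset 8  */
    volatile uint32_t r0c;          /* offset 12 */
    uint32_t          reserved[4];  /* offsets 16..28, unused here */
    volatile uint32_t r20;          /* offset 32 */
    volatile uint32_t r24;          /* offset 36 */
} periph_regs_t;

static_assert(offsetof(periph_regs_t, r20) == 32, "r20 must sit at offset 32");
static_assert(offsetof(periph_regs_t, r24) == 36, "r24 must sit at offset 36");

/* Every store needs its own 32-bit constant. Written in C, the compiler
   is free to load several constants into different registers first and
   issue the stores afterwards, which is what costs extra registers. */
void init_regs(periph_regs_t *p)
{
    p->r00 = 0x01020304u;   /* placeholder value */
    p->r04 = 0x05060708u;   /* placeholder value */
    p->r08 = 0x090a0b0cu;   /* placeholder value */
    p->r0c = 0x0d0e0f10u;   /* placeholder value */
    p->r20 = 0x00007700u;   /* value seen in the first listing */
    p->r24 = 0x11223344u;   /* placeholder value */
}
```

Compiling a function like this once with -O2 and once with -Og, then comparing the disassembly, should make the trade-off between register usage and code size visible.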

If your code is not performance-critical, I suggest compiling it with the -Og option. You should then get the code size you expect.