[size, thumb2] Function is not inlined properly with -Os

Bug #634696 reported by Yao Qi
This bug affects 1 person
Affects: Linaro GCC
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Compile consumer/cjpeg/jmemmgr.c from EEMBC with FSF GCC trunk using the option -Os. The function out_of_memory() is inlined, but this increases code size.

1. Compile jmemmgr.c,
 arm-none-linux-gnueabi-gcc -DNDEBUG -DEEMBC_PROCESSOR="arm-none-linux-gnueabi_" -I. -I../th_lite/any/al -I../th_lite/src -Icjpeg -Icjpeg/datasets -DITERATIONS=10 -I. -I../th_lite/any/al -I../th_lite/src -c -mthumb -mcpu=cortex-a9 -mfpu=neon -Os -fno-common -mfloat-abi=hard -o gcc/obj_lite/cjpeg/jmemmgr.o cjpeg/jmemmgr.c

2. readelf shows that the size of the .text section is 0x90c:
$ readelf -S gcc/obj_lite/cjpeg/jmemmgr.o
There are 12 section headers, starting at offset 0xa18:

Section Headers:
  [Nr] Name       Type      Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]            NULL      00000000 000000 000000 00      0   0  0
  [ 1] .text      PROGBITS  00000000 000034 00090c 00  AX  0   0  4
  [ 2] .rel.text  REL       00000000 0010bc 0000f8 08     10   1  4
  [ 3] .data      PROGBITS  00000000 000940 000000 00  WA  0   0  1
  [ 4] .bss       NOBITS    00000000 000940 000000 00  WA  0   0  1

3. Add "-fno-inline" and compile again; the size of the .text section is now 0x8f0:
Section Headers:
  [Nr] Name       Type      Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]            NULL      00000000 000000 000000 00      0   0  0
  [ 1] .text      PROGBITS  00000000 000034 0008f0 00  AX  0   0  4
  [ 2] .rel.text  REL       00000000 0010bc 0000f8 08     10   1  4
  [ 3] .data      PROGBITS  00000000 000924 000000 00  WA  0   0  1
  [ 4] .bss       NOBITS    00000000 000924 000000 00  WA  0   0  1

Since -Os is turned on, GCC should inline functions more carefully: here inlining costs 0x1c (28) bytes of .text.

Tags: size task
Michael Hope (michaelh1)
tags: added: size task
Andrew Stubbs (ams-codesourcery) wrote :

This function is inlined by the early tree inliner. The compiler calculates the inlining costs as follows:

 Analyzing function body size: out_of_memory
  freq: 1000 size: 1 time: 1 D.4792 = cinfo->err;
    Likely eliminated
  freq: 1000 size: 1 time: 1 D.4792->msg_code = 54;
  freq: 1000 size: 1 time: 1 D.4792 = cinfo->err;
    Likely eliminated
  freq: 1000 size: 1 time: 1 D.4792->msg_parm.i[0] = which;
  freq: 1000 size: 1 time: 1 D.4792 = cinfo->err;
    Likely eliminated
  freq: 1000 size: 1 time: 1 D.4793 = D.4792->error_exit;
  freq: 1000 size: 2 time: 11 D.4793 (cinfo);
  freq: 1000 size: 0 time: 0 return;
    Likely eliminated
Overall function body time: 17-3 size: 8-3
With function call overhead time: 17-15 size: 8-6

In other words, it estimates the function size at 8 "units", 3 of which would likely be eliminated by inlining, and reckons a further 3 units would be eliminated from the calling function. Thus, it estimates inlining this function will cause the caller to grow by two units.
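To make the dump easier to follow, here is a rough C sketch of what out_of_memory() appears to do, reconstructed only from the statements listed above. The struct names (sketch_cinfo, sketch_error_mgr) and the count_error_exit stub are hypothetical, cut-down stand-ins rather than the real libjpeg declarations; the message code 54 is taken from the dump.

```c
/* Hypothetical, cut-down stand-ins for the libjpeg types; only the
   fields that appear in the dump are modelled. */
struct sketch_cinfo;

struct sketch_error_mgr {
    void (*error_exit)(struct sketch_cinfo *cinfo);
    int msg_code;
    union { int i[8]; } msg_parm;
};

struct sketch_cinfo {
    struct sketch_error_mgr *err;
};

/* Shape of out_of_memory() as implied by the dump: repeated loads of
   cinfo->err, two stores, and one indirect call. */
static void out_of_memory(struct sketch_cinfo *cinfo, int which)
{
    cinfo->err->msg_code = 54;          /* D.4792->msg_code = 54;          */
    cinfo->err->msg_parm.i[0] = which;  /* D.4792->msg_parm.i[0] = which;  */
    (*cinfo->err->error_exit)(cinfo);   /* D.4793 (cinfo);                 */
}

/* Hypothetical stub handler so the sketch can be exercised; it only
   counts how often it was called. */
static int exit_calls;

static void count_error_exit(struct sketch_cinfo *cinfo)
{
    (void)cinfo;
    exit_calls++;
}
```

The three "D.4792 = cinfo->err" loads marked "Likely eliminated" correspond to the repeated cinfo->err expressions; the estimator assumes they fold away once the body sits inside a caller.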

At -Os, the compiler will permit a function to grow, but only if it can eliminate the callee function entirely, and doing so will not cost any more space overall. In the test case, the estimates say we save 8 units by eliminating "out_of_memory", and it costs 2 units per caller to inline the function, so if there are 4 callers or fewer, it will do the inline.

The example code has exactly 4 callers, so the function is judged OK to inline. The problem is that the estimates are wrong: it has badly overestimated how many statements can be eliminated. In fact, the body of out_of_memory appears almost unchanged inside each calling function.
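The break-even arithmetic described above can be sketched in a few lines of C. This is a simplified model of the reasoning in this comment, not GCC's actual inliner code; the function names are hypothetical, and the numbers (size 8, 6 units eliminated including call overhead) come from the dump.

```c
#include <stdbool.h>

/* Estimated growth of one caller when the callee is inlined into it:
   callee size minus the units expected to disappear (callee statements
   plus call overhead). From the dump: 8 - 6 = 2. */
static int growth_per_call_site(int callee_size, int eliminated)
{
    return callee_size - eliminated;
}

/* Simplified -Os rule as described in the comment: inline into every
   caller only if the total growth does not exceed the size of the
   out-of-line copy that can then be dropped. */
static bool inlining_all_is_size_neutral(int callee_size, int eliminated,
                                         int n_callers)
{
    int total_growth = n_callers * growth_per_call_site(callee_size, eliminated);
    return total_growth <= callee_size;
}
```

With the estimates from the dump, inlining_all_is_size_neutral(8, 6, 4) holds (8 <= 8), while a fifth caller would tip it over (10 > 8). The model is only as good as the "eliminated" estimate; if fewer than 6 units really disappear, each caller grows by more than 2 and the same 4 callers produce a net size increase, which is what the readelf numbers show.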

Charles Baylis (cbaylis) wrote :