Using -flto can cause stack corruption

Bug #1698217 reported by John Martin on 2017-06-15


This bug affects 1 person
Affects Status Importance Assigned to Milestone
GNU ARM Embedded Toolchain

Bug Description

See attached test case.

Link-time optimization is turned on (-flto), which is causing FunctionA to inline FunctionB. In my application FunctionC was not inlined but called by FunctionA. FunctionC is built without link-time optimization so I didn't have to include the lengthy code that was in the function (I was concerned the example might inline it as well and perform different optimizations).

When looking at the assembly there are two versions of FunctionB. The original takes argument c as the variable length of the array e. The second version replaces this variable length with the known length of 1 byte. The first example does not run correctly and causes stack corruption. The second example runs as expected. In the assembly, the sub instruction at the beginning of the function is off by 4 (the size of the array on the stack); however, the same stack offset is used for the pointer passed as argument 1 and the same size is passed as argument 2. Also, the optimization is not ideal, as it uses an intermediate register r7 for no reason.
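For readers without the attachment, the shape of the code described above might look roughly like the sketch below. This is not the attached test case: the function bodies, the fill value 0xA5, the length of 4 and the canary check are all assumptions made for illustration. Only the names FunctionA, FunctionB, FunctionC, c and e come from the report. In the report FunctionC lives in a separate file built without -flto; here a noinline attribute stands in for that.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for FunctionC. In the report it is compiled
 * without -flto in its own translation unit; noinline approximates
 * that by keeping it a real call. The body is an assumption. */
__attribute__((noinline))
void FunctionC(unsigned char *buf, size_t len)
{
    memset(buf, 0xA5, len);   /* write exactly len bytes into the buffer */
}

/* FunctionB as described: array e is a variable-length array whose
 * size is the argument c. static makes it a likely inlining candidate. */
static int FunctionB(size_t c)
{
    unsigned char e[c];       /* VLA placed on the caller's stack once inlined */
    FunctionC(e, c);          /* argument 1: pointer to e, argument 2: its size */
    return e[0];
}

/* FunctionA calls FunctionB with a constant length (4 is an assumption),
 * so after inlining the compiler knows the VLA size. The volatile canary
 * is a crude, illustrative way to notice a clobbered stack slot. */
int FunctionA(void)
{
    volatile int canary = 0x1234;
    int r = FunctionB(4);
    return (canary == 0x1234) ? r : -1;
}
```

With a correct compiler FunctionA() should return 0xA5. A wrong stack adjustment in the prologue could show up as a different return value or a crash, although a canary like this is not guaranteed to catch every case.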

John Martin (jmartin-emmicro) wrote :

Tested with (most recent download):
arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 5.4.1 20160919 (release) [ARM/embedded-5-branch revision 240496]

Hi John,

Can you test with our 2016Q1 toolchain to see if this has already been fixed?

Best regards.

Changed in gcc-arm-embedded:
status: New → Incomplete
Leo Havmøller (leh-p) wrote :

> Can you test with our 2016Q1 toolchain
You mean 2017Q1, right?

err yes, 2017Q1 sorry.

John Martin (jmartin-emmicro) wrote :

The one I tested was the latest one on the launchpad page:
I assume you mean this one:

Which doesn't run on my work Linux system because of the glibc version. It does work on Windows and the stack seems to be set right. It still has the quirky "optimization" with r7:

Dump of assembler code for function FunctionA:
   0x00008130 <+0>: push {r4, r7, lr}
   0x00008132 <+2>: sub sp, #12
   0x00008134 <+4>: add r7, sp, #0
   0x00008136 <+6>: movs r1, #4
   0x00008138 <+8>: adds r0, r7, #4
   0x0000813a <+10>: mov r4, sp
   0x0000813c <+12>: bl 0x8170 <FunctionC>

Try -fomit-frame-pointer and see if this helps with the quirky optimization. So it seems to have been fixed in the recent toolchain. It might have been fixed on the latest gcc-5 branch as well, which would contain more recent fixes than the 5-2016Q3 release.

Best regards.
