arm-none-eabi-gcc v10.3.1 generates wrong code with -O2 and higher

Bug #1968584 reported by Pablo
This bug affects 1 person
Affects: GNU Arm Embedded Toolchain
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

- release version
arm-none-eabi-gcc (GNU Arm Embedded Toolchain 10.3-2021.10) 10.3.1 20210824 (release)

- whether the toolchain was rebuilt or you are using a binary package
package downloaded with STM32CubeProgrammer
Size: 2126664 byte (2076 KiB)
SHA256: 76aef3a90269c5dd0b2940f5cdf9b387e8ba114a3213c335ff4535665d4b29ac

- host machine
Intel i7, windows 10

- preprocessed testcase for us to reproduce
#include <stddef.h>
#include <stdint.h>
void * memset (void * _dst, int v, size_t size)
{
 uint8_t* dst = _dst;
 uint8_t* end = dst+size;
 while(dst<end)
  *dst++ = v;
 return _dst;
}

compile with:
arm-none-eabi-gcc "test.c" -c -O2 -mthumb -o "test.elf"
arm-none-eabi-objdump.exe -d test.elf

output:

test.elf: file format elf32-littlearm

Disassembly of section .text:

00000000 <memset>:
   0: 1883 adds r3, r0, r2
   2: b510 push {r4, lr}
   4: 0004 movs r4, r0
   6: 4298 cmp r0, r3
   8: d203 bcs.n 12 <memset+0x12>
   a: 23ff movs r3, #255 ; 0xff
   c: 4019 ands r1, r3
   e: f7ff fffe bl 0 <memset>
  12: 0020 movs r0, r4
  14: bc10 pop {r4}
  16: bc02 pop {r1}
  18: 4708 bx r1
  1a: 46c0 nop ; (mov r8, r8)

- symptom(s)
the generated code is broken and causes a stack overflow... yeah, not the website, a crash

- error/warning message encountered
compiler is silent, on runtime you see smoke coming out of the cpu

Liviu Ionescu (ilg) wrote:

Initially I thought that `dst+size` (adding an integer to a pointer to void) might be undefined behaviour, but I ran a test and even after rewriting it as `uint8_t *end = dst; end += size;`, the result is the same, so it might be a bug.

Liviu Ionescu (ilg) wrote:

I did some further tests and this is definitely a bug in GCC, but it is not as severe as I initially feared, since it does not affect usual code.

With -O2, the compiler optimises loops that set a memory area by internally calling memset(), instead of generating code for the loop.

Unfortunately it does not check that the current function name is... memset() :-(

The result is that memset() calls memset(). Since the function prologue pushes registers on the stack, every recursive call consumes more stack and you get a stack overflow; without that push you would probably get an infinite loop instead.

To verify this, add a memset2() function with exactly the same content as memset(), and you'll see that the generated code is identical, except that memset2() calls memset(), while memset() calls itself.
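The check described above can be sketched as follows. The name memset2() comes from the comment; the body is copied from the testcase, with only whitespace and an explicit cast changed. What the disassembly shows is as described in this thread and was not re-verified for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Same body as the memset() in the testcase, under a different name.
   At -O2 the compiler may replace the store loop with a call to the
   library memset(), which is harmless here because the callee is a
   different function. */
void *memset2(void *_dst, int v, size_t size)
{
    uint8_t *dst = _dst;
    uint8_t *end = dst + size;

    while (dst < end)
        *dst++ = (uint8_t)v;   /* only the low 8 bits of v are stored */
    return _dst;
}
```

Placing this next to memset() in test.c and disassembling with the same objdump command (adding -r to print relocations) should make the two call targets easy to compare.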

To conclude, the bug is a missing check: the compiler should not replace a loop with a call to memset() when the function being compiled is memset() itself.

I would expect that a similar problem occurs for other such optimisations, like memcpy() and possibly for memmove().

As a workaround, if you really want to redefine the memset() function, add some pragmas and temporarily disable this optimisation.
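A minimal sketch of such a pragma-based workaround, assuming the optimisation involved is the one controlled by -ftree-loop-distribute-patterns (the option named in the newlib attribute quoted in a later comment). The pragmas are GCC's documented push_options, optimize and pop_options; the function body is the one from the testcase.

```c
#include <stddef.h>
#include <stdint.h>

/* Save the current optimisation options, then switch off the pass that
   turns store loops into memset()/memcpy() library calls.
   Assumption: this is the pass responsible for the recursive call. */
#pragma GCC push_options
#pragma GCC optimize ("no-tree-loop-distribute-patterns")

void *memset(void *_dst, int v, size_t size)
{
    uint8_t *dst = _dst;
    uint8_t *end = dst + size;

    while (dst < end)
        *dst++ = (uint8_t)v;
    return _dst;
}

/* Restore the saved options for the rest of the translation unit. */
#pragma GCC pop_options
```

Keeping the pragmas around the single function leaves the rest of the file compiled at the requested optimisation level.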

Although redefining these functions occurs rarely, it would still be nice for Arm to fix this bug in the next release, or at least to list it as a known issue.

Liviu Ionescu (ilg) wrote:

In newlib the problem is avoided with a macro that defines an attribute to disable the optimisation; the function definitions look like:

void
__attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
*memset(void *_dst, int v, size_t size) { ... }
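For a project that redefines memset() itself, the same idea could be wrapped in a local macro. This is only a sketch: NO_LOOP_TO_LIBCALL is a hypothetical name chosen for illustration (it is not newlib's macro), and the guard assumes that only GCC proper understands the optimize attribute.

```c
#include <stddef.h>

/* Hypothetical helper macro: expands to the attribute on GCC and to
   nothing on other compilers, which would not recognise it. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_TO_LIBCALL \
    __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_TO_LIBCALL
#endif

/* The attribute applies to this one function only, so the loop below
   should stay a loop instead of becoming a call to memset(). */
NO_LOOP_TO_LIBCALL
void *memset(void *_dst, int v, size_t size)
{
    unsigned char *dst = _dst;

    while (size--)
        *dst++ = (unsigned char)v;
    return _dst;
}
```

Compared with pragmas, the attribute travels with the function definition, so moving the function to another file does not silently drop the protection.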
