__restrict generates wrong code with -O2 and higher
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
GNU Arm Embedded Toolchain | New | Undecided | Unassigned |
Bug Description
- release version
arm-none-eabi-gcc (GNU Arm Embedded Toolchain 10.3-2021.10) 10.3.1 20210824 (release)
- whether the toolchain was rebuilt or you are using a binary package
binary package from STM32CubeIDE
Size: 2126664 byte (2076 KiB)
SHA256: 76aef3a90269c5d
verified to happen on other versions too
- host machine
Intel i7, Windows 10
- preprocessed testcase for us to reproduce
#include <stddef.h>
#include <stdint.h>

void * memcpy (void *__restrict _dst, const void *__restrict _src, size_t size)
{
    uint8_t* dst = _dst;
    uint8_t* end = dst+size;
    const uint8_t* src = _src;
    while(dst<end)
        *dst++ = *src++;
    return _dst;
}
compile with:
arm-none-eabi-gcc "test.c" -c -O2 -mthumb -o "test.elf"
arm-none-
output:
test.elf: file format elf32-littlearm
Disassembly of section .text:
00000000 <memcpy>:
0: 1883 adds r3, r0, r2
2: b510 push {r4, lr}
4: 0004 movs r4, r0
6: 4298 cmp r0, r3
8: d201 bcs.n e <memcpy+0xe>
a: f7ff fffe bl 0 <memcpy>
e: 0020 movs r0, r4
10: bc10 pop {r4}
12: bc02 pop {r1}
14: 4708 bx r1
16: 46c0 nop ; (mov r8, r8)
- symptom(s)
The prologue of the function ends up being part of the loop: the bl at offset 0xa branches back to the start of memcpy, so this line is executed again on every pass:
2: b510 push {r4, lr}
- error/warning message encountered
At runtime you may get a stack overflow (the crash, not the website).
No compiler error or warning is emitted.
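Read as C, the -O2 disassembly above amounts to roughly the sketch below: the copy loop is gone and what is left is a single call back into the same function with unchanged arguments. The names copy_as_compiled, call_depth and DEPTH_LIMIT are hypothetical and only there so the sketch terminates when run on a host; the real generated code has no such guard and keeps pushing {r4, lr} until the stack is exhausted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guard, for illustration only: the real code has no limit. */
#define DEPTH_LIMIT 1000u
unsigned call_depth = 0;

/* Rough C rendering of the -O2 output: no byte is ever copied,
   the function just calls itself with the same dst, src and size. */
void *copy_as_compiled(void *dst, const void *src, size_t size)
{
    if (++call_depth >= DEPTH_LIMIT)    /* not present in the disassembly */
        return dst;

    /* adds r3, r0, r2 ; cmp r0, r3 ; bcs.n skips the call when size is 0 */
    if ((uint8_t *)dst < (uint8_t *)dst + size)
        copy_as_compiled(dst, src, size);   /* bl 0 <memcpy> */

    return dst;                             /* movs r0, r4 */
}
```

Calling this with any non-zero size recurses until the guard trips and leaves the destination buffer untouched.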
comments:
The __restrict qualifier is the cause of this. Without __restrict:
00000000 <memcpy>:
0: 1883 adds r3, r0, r2
2: b510 push {r4, lr}
4: 4298 cmp r0, r3
6: d205 bcs.n 14 <memcpy+0x14>
8: 2300 movs r3, #0
a: 5ccc ldrb r4, [r1, r3]
c: 54c4 strb r4, [r0, r3]
e: 3301 adds r3, #1
10: 4293 cmp r3, r2
12: d1fa bne.n a <memcpy+0xa>
14: bc10 pop {r4}
16: bc02 pop {r1}
18: 4708 bx r1
1a: 46c0 nop ; (mov r8, r8)
I wonder whether the test case is hitting undefined behaviour. However, by using restrict I am assuming that the contract I am providing is that src and dst do not point to the same location, so writing to dst will not have a side effect on src.
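A possible explanation, not confirmed in this report: the loop itself looks well defined, and the call in the -O2 output is what you would expect if GCC's loop-distribution pattern matching recognised the copy loop and replaced it with a memcpy call. With __restrict the compiler may assume the buffers do not overlap, which is what makes that replacement valid, and since the function is itself named memcpy the call recurses. Below is a minimal sketch of a workaround under that assumption, using GCC's optimize function attribute to turn off -ftree-loop-distribute-patterns for this one function. NO_LOOP_TO_LIBCALL and copy_bytes are hypothetical names; in the report the function would keep the name memcpy.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the loop-to-memcpy replacement comes from
   -ftree-loop-distribute-patterns. The attribute is GCC-specific,
   so other compilers get an empty macro. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_TO_LIBCALL \
    __attribute__((optimize("no-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_TO_LIBCALL
#endif

/* copy_bytes is a hypothetical stand-in for the memcpy in the report. */
NO_LOOP_TO_LIBCALL
void *copy_bytes(void *__restrict dst_, const void *__restrict src_, size_t size)
{
    uint8_t *dst = dst_;
    const uint8_t *end = dst + size;
    const uint8_t *src = src_;

    /* Plain byte-by-byte copy; valid only for non-overlapping buffers. */
    while (dst < end)
        *dst++ = *src++;

    return dst_;
}
```

The same effect could be tried for the whole file by adding -fno-tree-loop-distribute-patterns to the compile command, at the cost of losing the transformation everywhere in that translation unit.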