Consecutive memory barriers even with optimization
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
GNU Arm Embedded Toolchain | New | Undecided | Unassigned | | |
Bug Description
-------
#include <stdatomic.h>
void test_atomic(void) {
_Atomic int a;
atomic_
atomic_
}
^^^^^^^
$ arm-none-eabi-gcc --version
arm-none-eabi-gcc (GNU Tools for Arm Embedded Processors 7-2017-q4-major) 7.2.1 20170904 (release) [ARM/embedded-
...
$ arm-none-eabi-gcc -O3 -c test_atomic.c
$ arm-none-
test_atomic.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <test_atomic>:
0: e52de004 push {lr} ; (str lr, [sp, #-4]!)
4: e24dd00c sub sp, sp, #12
8: ebfffffe bl 0 <__sync_
c: e3a03003 mov r3, #3
10: e58d3004 str r3, [sp, #4]
14: ebfffffe bl 0 <__sync_
18: ebfffffe bl 0 <__sync_
1c: e3a03004 mov r3, #4
20: e58d3004 str r3, [sp, #4]
24: ebfffffe bl 0 <__sync_
28: e28dd00c add sp, sp, #12
2c: e49de004 pop {lr} ; (ldr lr, [sp], #4)
30: e12fff1e bx lr
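Each store is bracketed by a barrier before and after it, so the barrier that ends the first store (offset 14) is immediately followed by the one that begins the second (offset 18). These two consecutive calls to __sync_synchronize can be collapsed into a single call.
The result is similar when a target CPU is given (-mcpu=cortex-m4 -mthumb):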
test_atomic.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <test_atomic>:
0: b082 sub sp, #8
2: 2203 movs r2, #3
4: f3bf 8f5b dmb ish
8: 2304 movs r3, #4
a: 9201 str r2, [sp, #4]
c: f3bf 8f5b dmb ish
10: f3bf 8f5b dmb ish
14: 9301 str r3, [sp, #4]
16: f3bf 8f5b dmb ish
1a: b002 add sp, #8
1c: 4770 bx lr
1e: bf00 nop
The two consecutive dmb ish instructions (offsets c and 10) can be collapsed into one.
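
To make the requested optimization concrete, here is a minimal sketch in C. It assumes the truncated atomic_ calls in the test case above were sequentially consistent atomic_store calls writing 3 and then 4, which matches the stores visible in the disassembly but is not confirmed by the report. The function names (stores_seq_cst, stores_merged_fences, self_check) are hypothetical. The second function spells out the barrier placement by hand with the adjacent pair merged into one fence. It mirrors the instruction sequence shown for ARM and is not claimed to be formally equivalent to seq_cst stores under the C11 memory model.

```c
#include <assert.h>
#include <stdatomic.h>

/* Assumed reconstruction of the test case: two back-to-back seq_cst stores.
 * Returns the final value so the result can be checked. */
int stores_seq_cst(void) {
    _Atomic int a;
    atomic_init(&a, 0);
    atomic_store(&a, 3);   /* compiled on ARM as: barrier, str, barrier */
    atomic_store(&a, 4);   /* compiled on ARM as: barrier, str, barrier */
    return atomic_load(&a);
}

/* Hand-written barrier placement with the adjacent pair merged:
 * three fences in total instead of the four seen in the disassembly. */
int stores_merged_fences(void) {
    _Atomic int a;
    atomic_init(&a, 0);
    atomic_thread_fence(memory_order_seq_cst);
    atomic_store_explicit(&a, 3, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* one fence where two were emitted */
    atomic_store_explicit(&a, 4, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&a, memory_order_relaxed);
}

/* Both variants must leave the same final value in the variable. */
void self_check(void) {
    assert(stores_seq_cst() == 4);
    assert(stores_merged_fences() == 4);
}
```

Compiling both functions with the same arm-none-eabi-gcc options as above and comparing the disassembly should show whether the second variant ends up with three barriers where the first has four.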