gcc emits malformed ldr instruction when calling an incorrectly `extern`-ed function

Bug #1782834 reported by mexlez on 2018-07-20
This bug affects 1 person
Affects: GNU Arm Embedded Toolchain
Importance: Undecided
Assigned to: Unassigned

Bug Description

I've found what appears to be a bug that causes GCC to emit a malformed ldr instruction with no compiler/linker warnings or errors given. If the resulting binary is executed on a processor (ST nucleo_f0 https://www.st.com/en/evaluation-tools/nucleo-f091rc.html), the CPU will hardfault when it tries to execute the malformed instruction.

If a function defined in another file is incorrectly forward-declared as an `extern` function pointer:
`extern void(*Main)(void);`
and the project is then compiled and linked, GCC will emit a malformed ldr instruction as part of the setup to call the Main() function. (The declaration is incorrect because "Main" is the label of a function, not the label of a pointer to a function.)
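
For reference, here is a minimal host-side sketch of what the two kinds of symbol mean in C. It is not taken from the attached project: `MainPtr`, `main_calls`, `call_function`, `call_through_pointer` and `check_both_calls` are hypothetical names introduced only for illustration.

```c
#include <assert.h>

/* Hypothetical counter, only so the sketch can be checked on a host. */
static int main_calls = 0;

/* A function: the symbol Main labels executable code. */
void Main(void) { main_calls++; }

/* A function-pointer variable: the symbol MainPtr labels DATA that
 * holds the address of some function. */
void (*MainPtr)(void) = Main;

/* Direct call: branch straight to the code labelled Main. */
void call_function(void) { Main(); }

/* Indirect call: first read the address stored in MainPtr,
 * then branch to whatever address was read. */
void call_through_pointer(void) { MainPtr(); }

/* Both forms reach Main when each symbol is declared as what it really is. */
void check_both_calls(void) {
    call_function();
    call_through_pointer();
    assert(main_calls == 2);
}
```

Declaring `Main` itself as `extern void(*Main)(void);` asks for the second form (read a stored address, then branch) even though the symbol labels code rather than a stored address.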

GCC also emits a suspicious-looking blx instruction instead of a bl; this is odd because the cortex-m0 can only execute Thumb instructions, and the blx instruction is used to branch and change execution mode between ARM and Thumb modes. Note that the address being branched to by the blx is correct; it's the address of Main() with the LSB set (switches to or stays in Thumb mode: http://www.keil.com/support/man/docs/armasm/armasm_dom1361289866046.htm).

Here's the relevant disassembly when the incorrect `extern` is used:
#######################################################################
void Startup(void) {
  Main();
 80000c0: 4b02 ldr r3, [pc, #8] ; (80000cc <Startup+0xc>)
void Startup(void) {
 80000c2: b510 push {r4, lr}
  Main();
 80000c4: 681b ldr r3, [r3, #0] ; <=== ##### Malformed instruction
 80000c6: 4798 blx r3 ; <=== ##### Suspicious instruction; cortex-m0 is all-thumb, so why use bl variant that can change execution mode?
 80000c8: e7fe b.n 80000c8 <Startup+0x8>
 80000ca: 46c0 nop ; (mov r8, r8)
 80000cc: 080000d5 .word 0x080000d5

080000d4 <Main>:
#include "system.h"

void Main(void) {
  return;
}
 80000d4: 4770 bx lr
 80000d6: 46c0 nop ; (mov r8, r8)
#######################################################################

According to the docs on ldr (http://www.keil.com/support/man/docs/armasm/armasm_dom1361289873425.htm, "Register Restrictions" section), the two registers provided to the ldr instruction must be different in the pre-index and post-index forms; a condition which is violated by loading r3 with an offset of 0 into r3.

Compare with the assembly emitted when either `#include`-ing the function's header file, or using a correct declaration:
`extern void Main(void);`

#######################################################################
080000c4 <Startup>:
void Startup(void) {
 80000c4: b510 push {r4, lr}
  Main();
 80000c6: f000 f801 bl 80000cc <Main> ; <=== ##### A regular branch directly to Main's address
 80000ca: e7fe b.n 80000ca <Startup+0x6>

080000cc <Main>:
#include "system.h"

void Main(void) {
  return;
}
 80000cc: 4770 bx lr
 80000ce: 46c0 nop ; (mov r8, r8)
#######################################################################

Although the `extern` statement is incorrect (Main is a function label, not a function pointer label), GCC should throw an error before it emits malformed instructions.
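
A sketch of how a shared header lets the compiler catch this mismatch itself, written as a single host-side file for brevity. The header contents are an assumption (the attached system.h is not reproduced in this report), `main_calls` is a hypothetical counter, and this `Startup` returns instead of looping forever so that it can be run on a host.

```c
#include <assert.h>

/* --- system.h (assumed contents) --- */
void Main(void);

/* --- main.c --- */
static int main_calls = 0;   /* hypothetical, only for checking the sketch */
void Main(void) { main_calls++; }

/* --- system.c, with the prototype from system.h in scope --- */

/* Uncommenting the next line redeclares Main as a different kind of
 * symbol than the prototype above, so the compiler rejects the file
 * instead of silently treating Main as data. */
/* extern void (*Main)(void); */

void Startup(void) {
    Main();   /* direct call to the function */
}
```

With the prototype visible in every file that uses `Main`, the conflicting declaration becomes a compile-time diagnostic. Without it, each translation unit is compiled on its own and nothing compares the two declarations.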

Find attached a small example project with Makefile that demonstrates this problem. The offending `extern` is located in system.c.

System info:
Compiler invocation:
<toolchain_root>/bin/arm-none-eabi-gcc -c -mcpu=cortex-m0 -mthumb -g -O0 -DSTM32 -I . system.c -o obj/system.o
<toolchain_root>/bin/arm-none-eabi-gcc -c -mcpu=cortex-m0 -mthumb -g -O0 -DSTM32 -I . main.c -o obj/main.o
<toolchain_root>/bin/arm-none-eabi-gcc obj/system.o obj/main.o -mcpu=cortex-m0 -mthumb -T linker_script.ld -nostartfiles -o main.elf

Host OS: Ubuntu 18.04
Optimization levels checked: -O3 and -O0

Binary Linux releases checked:
gcc-arm-none-eabi-7-2017-q4-major
gcc-arm-none-eabi-7-2018-q2-update

mexlez (mexlez) wrote :

The ldr is not malformed. The Thumb blx instruction is for indirect branches (branch to the address contained in a register). Everything the compiler is doing is correct based on the input you are giving it. You are effectively telling the compiler that Main is the address of a function-pointer-sized variable and that you want it to call the function whose address is contained in that variable.

It does that by first loading the address of the variable from a literal pool. Since Main is really a Thumb function, the value in the literal pool is odd. This odd value is loaded into r3. The compiler then needs to get the contents at that address (remember, you are asking the compiler to call the function pointed at by the variable; you are not asking it to call Main, because as declared Main is not a function but a variable holding a pointer to a function) and does so with the ldr. This causes an unaligned access because the address being accessed is odd.

This is not the compiler's fault, and in fact the compiler doesn't know it will happen, because the actual address of Main is not resolved until the linker has run. Even if the processor supported unaligned accesses, it would still fail, because it would go on to load the actual bytes of code at the start of Main into r3 and then branch to that value as if it were the address of a function that another part of the program had stored in that variable. In summary, the compiler is doing exactly what you are telling it to do, and there is no way it can tell that it is wrong.
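
The address arithmetic in this explanation can be checked with a small host-side C sketch. The constants are copied from the first disassembly in the report. The helper names (`thumb_code_address`, `is_word_aligned`, `check_report_values`) are hypothetical, and the sketch only models the addresses; it does not execute any ARM code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Literal-pool word at 0x080000cc in the first listing: the value the
 * linker resolved for the symbol Main. */
#define MAIN_SYMBOL_VALUE UINT32_C(0x080000d5)

/* Address of that literal-pool word itself. */
#define LITERAL_POOL_ADDRESS UINT32_C(0x080000cc)

/* Thumb function symbols carry bit 0 set; the code starts at the even
 * address obtained by clearing that bit. */
uint32_t thumb_code_address(uint32_t symbol_value) {
    return symbol_value & ~UINT32_C(1);
}

/* A 32-bit load needs a 4-byte-aligned address. */
bool is_word_aligned(uint32_t address) {
    return (address & UINT32_C(3)) == 0;
}

void check_report_values(void) {
    /* Main's code is at 0x080000d4, matching the <Main> label. */
    assert(thumb_code_address(MAIN_SYMBOL_VALUE) == UINT32_C(0x080000d4));
    /* First ldr (from the literal pool) is aligned and succeeds. */
    assert(is_word_aligned(LITERAL_POOL_ADDRESS));
    /* Second ldr uses the odd symbol value as a data address: unaligned. */
    assert(!is_word_aligned(MAIN_SYMBOL_VALUE));
}
```

The first load reads from an aligned literal-pool slot, while the second uses the odd value 0x080000d5 as a data address, which is the unaligned access described above.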

Thanks Thomas for the great explanation.

Changed in gcc-arm-embedded:
status: New → Invalid
mexlez (mexlez) wrote :

Thanks Thomas! Everything is clear now.
