Invalid code generation at high optimization level

Bug #1632494 reported by Nickolay Kandaratskov
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
GNU Arm Embedded Toolchain
Invalid
Undecided
Unassigned

Bug Description

Invalid code generated at optimization level >= -O2.
Generated code may be reviewed from assembly output:
---------------
   ldr r3, [r3, #36] @ dhcp_rx_options_val, dhcp_rx_options_val
   mov r1, r4 @, tmp1173
   rev r3, r3 @ D.6523, dhcp_rx_options_val
   movs r0, #1 @,
   str r3, [sp, #36] @ D.6523, dns_addr.addr
   bl dns_setserver @
.L334:
   ldr r1, [r3, #48] @ D.6532, dhcp_37->offered_t0_lease !!!! <- BUG !!!!!
   b .L335 @
.L332:
---------------
At label .L334 'r3' register does not points to 'dhcp_37' anymore ('dhcp' variable in C code). This causes crash on Cortex-M3 (MPU enabled to track NULL accesses (first 4KB region)).
Host: Mac OS
gcc version: arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 5.4.1 20160919 (release) [ARM/embedded-5-branch revision 240496]
Invocation: arm-none-eabi-gcc -mthumb -mcpu=cortex-m3 -I. -S -fverbose-asm -O3 dhcp.c

Revision history for this message
Nickolay Kandaratskov (concorde) wrote :
Revision history for this message
Nickolay Kandaratskov (concorde) wrote :

This place may be found by searching for 2nd occurrence of 'dns_setserver' referencing (assembly output).

summary: - Invalid core generation at high optimization level
+ Invalid code generation at high optimization level
Revision history for this message
Nickolay Kandaratskov (concorde) wrote :

Yet another bug. Looks like gcc 5 optimization is broken at high level.
Generated code (see aes.s):
------------------
.L3:
   cmp r2, #1 @ keyLen,
   strb r1, [r0, #248] @ aesMode, aes_7(D)->cbc
   beq .L6 @,
   bcc .L7 @,
   cmp r2, #2 @ keyLen,
   bne .L13 @,
   movs r1, #8 @ tmp190,
   movs r2, #14 @ tmp191,
   str r1, [r0, #240] @ tmp190, aes_7(D)->nk
   str r2, [r0, #244] @ tmp191, aes_7(D)->nr
.L9:
   subs r3, r3, #1 @ ivtmp.44, key,
   subs r0, r0, #1 @ ivtmp.49, aes,
.L10: !!!!! BUG LOOP !!!!!!!
   ldrb r2, [r3, #1]! @ zero_extendqisi2 @ D.5198, MEM[base: _99, offset: 0B]
   strb r2, [r0, #1]! @ D.5198, MEM[base: _100, offset: 0B]
   b .L10 @
.L13:
   movs r0, #0 @,
   bx lr @
.L6:
   movs r1, #6 @ tmp188,
   movs r2, #12 @ tmp189,
   str r1, [r0, #240] @ tmp188, aes_7(D)->nk
   str r2, [r0, #244] @ tmp189, aes_7(D)->nr
   b .L9 @
.L7:
   movs r1, #4 @ tmp186,
   movs r2, #10 @ tmp187,
   str r1, [r0, #240] @ tmp186, aes_7(D)->nk
   str r2, [r0, #244] @ tmp187, aes_7(D)->nr
------------------
As you can see at label '.L10' there's infinite loop (loading r2 from r3 ptr, storing into r0 ptr) without any condition check (until mcu bus fault).
This bug appears only at -O2 level, it disappears at -O3 (but code is still not working due to some other problem - not found yet).
Can anyone check this on another host os ? May be this is host os problem ? (unlikely)

Host: Mac OS
gcc version: arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 5.4.1 20160919 (release) [ARM/embedded-5-branch revision 240496]
Invocation: arm-none-eabi-gcc -mthumb -mcpu=cortex-m3 -I. -S -fverbose-asm -O2 aes.c

Revision history for this message
Andre Vieira (andre-simoesdiasvieira) wrote :

Hi Nicolay,

I have had a look at your first code generation issue. I was able to reproduce it and reduce the problem. However, it highlighted an issue with your code.

You have the following code:
...
#define DHCP_OPTION_IDX_DNS_SERVER 8
#define DHCP_OPTION_IDX_MAX (DHCP_OPTION_IDX_DNS_SERVER + DNS_MAX_SERVERS)
...
u8_t dhcp_rx_options_given[DHCP_OPTION_IDX_MAX];

#define dhcp_option_given(dhcp, idx) (dhcp_rx_options_given[idx] != 0)
...
while(dhcp_option_given(dhcp, DHCP_OPTION_IDX_DNS_SERVER + n) && (n < DNS_MAX_SERVERS))

Your first check 'dhcp_option_given' has an array access that will go out of bounds for 'n + DHCP_OPTION_IDX_DNS_SERVER > DHCP_OPTION_IDX_MAX', you check this condition afterwards, but by then the array access has already been done and GCC sees this as undefined behavior at which point it is free to do as its heart desires.

If you swap around the two conditions, this is no longer the case and GCC will produce correct code.

Let me know if this also solves your second issue.

Best Regards,
Andre

Revision history for this message
Nickolay Kandaratskov (concorde) wrote :

Yes, this is a bug (actually not mine, but lwip author).
However, this is nothing to do with actual bug that I posted. The problem is in 'r3' register which contains not the pointer but previous 'dhcp_rx_options_val' calculation. You can see 'rev r3, r3' opcode (which is actually 'lwip_ntohl' function return value. Nobody does this with a pointer ('r3' used later as pointer to access the 'dhcp' structure).
And second bug 'gcc_bug2' has completely different nature (compiler generates infinite loop):
----------------
.L10: !!!!! BUG LOOP !!!!!!!
   ldrb r2, [r3, #1]! @ zero_extendqisi2 @ D.5198, MEM[base: _99, offset: 0B]
   strb r2, [r0, #1]! @ D.5198, MEM[base: _100, offset: 0B]
   b .L10 @
----------------

Revision history for this message
Andre Vieira (andre-simoesdiasvieira) wrote :

Hi Nickolay,

I will have a look at the second bug separately, haven't had the time to look at that yet. But as for the first bug, I don't really follow you.

If I swap around the conditions, getting rid of the undefined behavior, the generated code looks alright:

        ldrb r2, [r5, #9] @ zero_extendqisi2 @ dhcp_rx_options_given, dhcp_rx_options_given
        ldr r3, .L373+4 @ tmp1197,
        cbz r2, .L277 @ dhcp_rx_options_given,
        ldr r3, [r3, #36] @ dhcp_rx_options_val, dhcp_rx_options_val
        mov r1, r4 @, tmp1199
        rev r3, r3 @ D.6421, dhcp_rx_options_val
        movs r0, #1 @,
        str r3, [sp, #36] @ D.6421, dns_addr.addr
        bl dns_setserver @
.L277:
        ldr r3, [sp, #24] @ arg, %sfp
        ldr r4, [r3, #32] @ dhcp, MEM[(struct netif *)arg_3(D)].dhcp
        ldrb r3, [r4, #12] @ zero_extendqisi2 @ dhcp_28->state, dhcp_28->state
        cmp r3, #8 @ dhcp_28->state,
        beq .L279 @,

 The rev is there because of the define of lwip_htonl(a) to __builtin_bswap32 which inlines the use of the rev instructions and as you can see, without the undefined behavior, dhcp seems to be reloaded. Are you still seeing some erroneous behavior here?

Regards,
Andre

Revision history for this message
Nickolay Kandaratskov (concorde) wrote :

Yes, you're right. This fixed the problem (second case too). Thank you very much.

Changed in gcc-arm-embedded:
status: New → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.