parallel builds fail on Windows due to bug in MinGW-w64 used to build binutils

Bug #1848002 reported by Ievgenii Meshcheriakov
This bug affects 6 people
Affects: GNU Arm Embedded Toolchain
Status: Fix Released
Importance: High
Assigned to: Przemyslaw Wirkus

Bug Description

Parallel builds that use GNU ar, objcopy, or strip may fail or generate corrupted files on Windows due to unsafe temporary file handling by the version of MinGW-w64 used to build binutils. The usual symptoms are messages like "unable to rename: <file name>; reason: No such file or directory" or "unable to rename: <file name>; reason: Permission denied", or silent corruption of the output file. This applies to the latest currently available version of the toolchain (8-2019-q3-update) and at least some previous versions.

The root cause of the issue is that the toolchain is built with a MinGW-w64 version that lacks this fix for mkstemp(): https://github.com/mirror/mingw-w64/commit/76119a8e8938dd23cdb4fe72843723fe4d4cc121#diff-cc1bd3d845f9a0ff2d38778e12b84a17

The presence of the _O_TEMPORARY flag can be verified by disassembling arm-none-eabi-ar:

  loc_48f97b:
  0048f97b mov dword [esp+0x3c+var_30], 0x180 ; CODE XREF=sub_48f8c0+137
  0048f983 mov dword [esp+0x3c+var_34], 0x10
  0048f98b mov dword [esp+0x3c+var_38], 0x8542
  0048f993 mov dword [esp+0x3c+var_3C], edi
  0048f996 call j__sopen ; _sopen
  0048f99b cmp eax, 0xffffffff

0x8542 is the flags argument passed to _sopen. It has the 0x40 bit set, and 0x40 is _O_TEMPORARY.
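
As a cross-check, the flag word can be decoded with a short Python sketch. The constants below are assumed to match the values in the Microsoft C runtime's fcntl.h; they are not taken from this report and should be verified against your own headers. The function name is made up for illustration.

```python
# Open-flag bits, ASSUMED to match the Microsoft C runtime's <fcntl.h>.
O_FLAGS = {
    0x0001: "_O_WRONLY",
    0x0002: "_O_RDWR",
    0x0008: "_O_APPEND",
    0x0040: "_O_TEMPORARY",
    0x0080: "_O_NOINHERIT",
    0x0100: "_O_CREAT",
    0x0200: "_O_TRUNC",
    0x0400: "_O_EXCL",
    0x4000: "_O_TEXT",
    0x8000: "_O_BINARY",
}


def decode_oflag(value):
    """Return the names of all known flag bits set in value, lowest bit first."""
    return [name for bit, name in sorted(O_FLAGS.items()) if value & bit]


print(decode_oflag(0x8542))
```

If those values are right, 0x8542 combines _O_TEMPORARY with _O_RDWR, _O_CREAT, _O_EXCL and _O_BINARY.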

binutils closes the returned file descriptor immediately after the file is created (https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=binutils/bucomm.c;h=06fbc462e242467fe6f15f46e8515d8ddf6b1456;hb=7e27a9d5f22f9f7ead11738b1546d0b5c737266b#l523). With _O_TEMPORARY set, closing the descriptor deletes the file, so the name becomes available again. This creates a race condition that allows multiple processes to use the same temporary file. The use of rand() to generate candidate file names in mkstemp() makes this bug relatively easy to reproduce.
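
The interaction can be illustrated with a toy model in Python. This is not the MinGW-w64 or binutils code: the directory is a plain set, each process draws candidate names from its own generator, and both generators start from the same seed on the assumption that rand() is never seeded. All names here (Process, make_temp_name, the "st" prefix) are hypothetical.

```python
import random
import string

LETTERS = string.ascii_letters + string.digits  # 62 candidate characters


class Process:
    """Toy process whose name generator always starts from the same seed."""

    def __init__(self, seed=1):
        self.rng = random.Random(seed)

    def candidate(self):
        return "st" + "".join(self.rng.choice(LETTERS) for _ in range(6))


def make_temp_name(proc, directory, delete_on_close):
    """Model mkstemp() followed by an immediate close() of the descriptor."""
    while True:
        name = proc.candidate()
        if name in directory:
            continue  # exclusive create failed: name exists, try the next one
        directory.add(name)  # file created
        if delete_on_close:
            directory.discard(name)  # close() removes a delete-on-close file
        return name


def run(delete_on_close):
    directory = set()
    first, second = Process(), Process()
    return (
        make_temp_name(first, directory, delete_on_close),
        make_temp_name(second, directory, delete_on_close),
    )


buggy = run(delete_on_close=True)
fixed = run(delete_on_close=False)
print("delete-on-close, same name:", buggy[0] == buggy[1])
print("file kept, same name:", fixed[0] == fixed[1])
```

In the model, the delete-on-close variant hands both processes the same name, while the variant that leaves the file in place forces the second process on to a different candidate.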

To reproduce the bug, try creating multiple archives in the _same_ _directory_ in parallel using arm-none-eabi-ar. This may require multiple iterations. I'm attaching a script that generates a build.ninja file that attempts to create multiple archives in parallel. The script must be modified to specify the input archives (any should work).
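
The attached script is not reproduced here. A minimal Python sketch of a comparable generator follows. The rule name, archive count, output naming, and input file names are assumptions for illustration; the command line mirrors the ar and ranlib invocation that appears in the ninja output later in this thread.

```python
def generate_ninja(count=1000, inputs=("input1.o", "input2.o"),
                   prefix="arm-none-eabi-"):
    """Return build.ninja text that creates `count` archives in one directory."""
    lines = [
        "rule archive",
        # cmd.exe is needed so that '&&' chains ar and ranlib on Windows.
        f'  command = cmd.exe /C "{prefix}ar qc $out $in && {prefix}ranlib $out"',
        "  description = Creating archive $out",
        "",
    ]
    for i in range(count):
        lines.append(f"build out-{i}.a: archive {' '.join(inputs)}")
    return "\n".join(lines) + "\n"


# Show a small sample; write generate_ninja() to build.ninja for a real run.
print(generate_ninja(count=3))
```

Writing the full output to build.ninja next to the input objects and running ninja there should start many archive jobs at once, which is the condition the race needs.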

Fixing this bug should be as easy as rebuilding the toolchain using a newer version of MinGW-w64 (or MinGW).
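
One rough way to check a rebuilt binary without a disassembler is to search it for the instruction bytes behind the listing above. This Python sketch assumes the store of the flag word is encoded as `mov dword [esp+disp8], imm32` (bytes C7 44 24, a one-byte displacement, then the little-endian immediate). That encoding is an assumption about how the compiler emitted the call, and the function name is hypothetical. A match suggests 0x8542 is still being passed somewhere; the absence of a match proves nothing on its own.

```python
import re

# ASSUMED encoding: C7 44 24 <disp8> 42 85 00 00
#   = mov dword [esp+disp8], 0x00008542
PATTERN = re.compile(rb"\xC7\x44\x24(.)\x42\x85\x00\x00", re.DOTALL)


def find_temporary_flag_stores(data):
    """Return byte offsets where the assumed store of 0x8542 occurs."""
    return [match.start() for match in PATTERN.finditer(data)]


# Synthetic example: four padding bytes, then the assumed instruction.
sample = b"\x90" * 4 + b"\xC7\x44\x24\x04\x42\x85\x00\x00" + b"\x90"
print(find_temporary_flag_stores(sample))
```

For a real check, read the executable with open(path, "rb").read() and pass the bytes to the function.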

Ievgenii Meshcheriakov (ievgenii-meshcheriakov) wrote :
description: updated
Joey Ye (jinyun-ye) wrote :

Ievgenii,

Thanks for reporting. I can reproduce the error with your script after a little additional work.

Joey

Changed in gcc-arm-embedded:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Joey Ye (jinyun-ye)
milestone: none → 8-2018-q4-major
milestone: 8-2018-q4-major → none
Julien (atsju2) wrote :

Hello,
I am seeing a failure that may be caused by this bug, but I have not been able to reproduce it on my side with your script. Joey, could you share your modified script?

If my understanding of the cause is correct, the bug does not come from the gcc code, and the fix would be to build the current sources with MinGW-w64 version 5.0 or later (2016 or newer).

Is this fix planned for the next maintenance release? When will the next maintenance release be made?

Joey Ye (jinyun-ye) wrote :

Julien,

The gen.sh script didn't provide input1.o or input2.o. I just modified it to create two very simple files, input1.c and input2.c, and compiled them to input1.o and input2.o. Then I reproduced the failure with a ninja command.

You can just create your own input1/2.o, to see if the issue is reproducible. If you already have your .o but still cannot reproduce, then my modification is not going to help you make more progress.

Thanks,
Joey

Joey Ye (jinyun-ye) wrote :

Upgrading the build machine to Ubuntu 18.04 resolves this problem.

Changed in gcc-arm-embedded:
milestone: none → 9-2020-q2
status: Confirmed → Fix Committed
Joey Ye (jinyun-ye) wrote :

Please visit https://github.com/ARM-software/toolchain-gnu-bare-metal to build the binary that has this issue fixed.

The Docker image in the GitHub project uses Ubuntu 18.04, so this issue should be resolved there.

Przemyslaw Wirkus (wirkus) wrote :

Julien,
to reproduce, follow the steps below on Windows:

$ echo "int foo() {return 0;}" > input1.c
$ echo "int bar() {return 3;}" > input2.c

$ ./gcc-arm-none-eabi-9-2019-q4-major-win32/bin/arm-none-eabi-gcc.exe -c input1.c
$ ./gcc-arm-none-eabi-9-2019-q4-major-win32/bin/arm-none-eabi-gcc.exe -c input2.c

$ ./gen.sh > build.ninja
$ ./ninja.exe

$ PATH=./gcc-arm-none-eabi-9-2019-q4-major-win32/bin:$PATH ./ninja.exe
[1/1000] Creating archive out-61.a
FAILED: out-61.a
'cmd.exe /C "arm-none-eabi-ar qc out-61.a input1.o input2.o && arm-none-eabi-ranlib out-61.a'
CreateProcess failed: The system cannot find the file specified.
[2/1000] Creating archive out-68.a
FAILED: out-68.a
'cmd.exe /C "arm-none-eabi-ar qc out-68.a input1.o input2.o && arm-none-eabi-ranlib out-68.a'
CreateProcess failed: The system cannot find the file specified.
[3/1000] Creating archive out-77.a
FAILED: out-77.a
'cmd.exe /C "arm-none-eabi-ar qc out-77.a input1.o input2.o && arm-none-eabi-ranlib out-77.a'
CreateProcess failed: The system cannot find the file specified.
[4/1000] Creating archive out-27.a
FAILED: out-27.a
'cmd.exe /C "arm-none-eabi-ar qc out-27.a input1.o input2.o && arm-none-eabi-ranlib out-27.a'
CreateProcess failed: The system cannot find the file specified.
[5/1000] Creating archive out-31.a
FAILED: out-31.a
'cmd.exe /C "arm-none-eabi-ar qc out-31.a input1.o input2.o && arm-none-eabi-ranlib out-31.a'
CreateProcess failed: The system cannot find the file specified.
[6/1000] Creating archive out-37.a
FAILED: out-37.a
'cmd.exe /C "arm-none-eabi-ar qc out-37.a input1.o input2.o && arm-none-eabi-ranlib out-37.a'
CreateProcess failed: The system cannot find the file specified.
ninja: build stopped: subcommand failed.

Changed in gcc-arm-embedded:
assignee: Joey Ye (jinyun-ye) → Przemyslaw Wirkus (wirkus)
Ievgenii Meshcheriakov (ievgenii-meshcheriakov) wrote : Re: [Bug 1848002] Re: parallel builds fail on Windows due to bug in MinGW-w64 used to build binutils

On Tuesday, March 10, 2020 12:57 PM, Przemyslaw Wirkus wrote:

> $ PATH=./gcc-arm-none-eabi-9-2019-q4-major-win32/bin:$PATH ./ninja.exe
> [1/1000] Creating archive out-61.a
> FAILED: out-61.a
> 'cmd.exe /C "arm-none-eabi-ar qc out-61.a input1.o input2.o && arm-none-eabi-ranlib out-61.a'
> CreateProcess failed: The system cannot find the file specified.

Just a clarification: I don't think the error here is caused by this bug. Most likely something is wrong with the PATH specification above, and either cmd.exe or arm-none-eabi-ar could not be found.
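
The two kinds of failure can be told apart mechanically from the ninja output. The Python sketch below is only a rough filter: the function name and labels are made up for illustration, and the matched substrings are taken from the messages quoted in this report.

```python
def classify(line):
    """Label a line of ninja output as one of the two failure kinds, or None."""
    if "unable to rename" in line:
        return "temp-file race"  # the symptom described in this bug
    if "CreateProcess failed" in line:
        return "launch failure"  # the tool could not be started; check PATH
    return None


log = [
    "[1/1000] Creating archive out-61.a",
    "CreateProcess failed: The system cannot find the file specified.",
    "arm-none-eabi-ranlib: unable to rename 'out-58.a'; reason: Permission denied",
]
for entry in log:
    print(classify(entry), "|", entry)
```

Only lines labelled as the temp-file race are evidence for this bug; launch failures point to the environment instead.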

Przemyslaw Wirkus (wirkus) wrote :

Ah, let me double check that.

Przemyslaw Wirkus (wirkus) wrote :

OK, I had a typo because I hand-crafted build.ninja. This should be better:

$ PATH=./gcc-arm-none-eabi-9-2020-q1-update-win32/bin:$PATH ./ninja.exe
[1/1000] Creating archive out-72.a
[2/1000] Creating archive out-48.a
[3/1000] Creating archive out-62.a
[4/1000] Creating archive out-66.a
[5/1000] Creating archive out-40.a
[6/1000] Creating archive out-61.a
[7/1000] Creating archive out-58.a
FAILED: out-58.a
cmd.exe /C "gcc-arm-none-eabi-9-2019-q4-major-win32\bin\arm-none-eabi-ar qc out-58.a input1.o input2.o && gcc-arm-none-eabi-9-2019-q4-major-win32\bin\arm-none-eabi-ranlib out-58.a"
gcc-arm-none-eabi-9-2019-q4-major-win32\bin\arm-none-eabi-ranlib: unable to rename 'out-58.a'; reason: Permission denied
[8/1000] Creating archive out-74.a
[9/1000] Creating archive out-60.a
[10/1000] Creating archive out-69.a
[11/1000] Creating archive out-27.a
[12/1000] Creating archive out-39.a
ninja: build stopped: subcommand failed.

Przemyslaw Wirkus (wirkus) wrote :

This will be fixed in the "GNU Arm Embedded Toolchain 9-2020-q2" release.

Changed in gcc-arm-embedded:
status: Fix Committed → Fix Released
Przemyslaw Wirkus (wirkus) wrote :

Hi,

This should be fixed in the "GNU Arm Embedded Toolchain 9-2020-q2" release:
https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-rm/downloads

Przemyslaw

> On 22 June 2020 10:42, Bhargav Shah wrote:
>
> I am facing this issue with GNU Tools Arm Embedded\9 2019-q4-major.
>
> C:\PROGRA~2\GNUTOO~1\92019-~1\bin\AR17F9~1.EXE: unable to rename
> 'debug\libtest.a'; reason: Permission denied
>
> I hope fix is present in "9 2019-q4-major" version.
>
> Is there any way to verify the fix is present or not?

Przemyslaw Wirkus (wirkus) wrote :

[snip...]

> Please ignore my previous comment. I overlooked the fixed version. I don't see
> this issue after moving to "GNU Arm Embedded Toolchain 9-2020-q2".

I'm very pleased to hear that!

Kind regards,
Przemyslaw