FTBFS in Eoan - Error: operand type mismatch for `push' - gcc 9.2.1 / binutils 2.32.51.20190905-0ubuntu1

Bug #1843394 reported by Christian Ehrhardt 
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
binutils
Fix Released
Medium
binutils (Ubuntu)
Fix Released
Critical
Matthias Klose
ipxe (Ubuntu)
Invalid
High
Unassigned

Bug Description

This might be due to new gcc-9 being more strict, but the build that worked before now fails with:

arch/x86_64/core/gdbidt.S: Assembler messages:
arch/x86_64/core/gdbidt.S:109: Error: operand type mismatch for `push'
arch/x86_64/core/gdbidt.S:110: Error: operand type mismatch for `push'
arch/x86_64/core/gdbidt.S:161: Error: operand type mismatch for `pop'
arch/x86_64/core/gdbidt.S:162: Error: operand type mismatch for `pop'
make[2]: *** [Makefile.housekeeping:937: bin-x86_64-efi/gdbidt.o] Error 1

Full log at: https://launchpadlibrarian.net/441262285/buildlog_ubuntu-eoan-amd64.ipxe_1.0.0+git-20190109.133f4c4-0ubuntu2_BUILDING.txt.gz

Now all of this is about push/pop of %fs and %gs.
That needs to match the size of the registers which depend on the current running mode.

In this particular case in ./src/arch/x86_64/core/gdbidt.S
The failing file is in ".code64" mode.
In that I'd expect %gs/%fs to be 64 bit.

Usually we see push/pop "w" in .code16 (word), l in .code32 (long) but in that sense here q (quad word) seems right at first (should be what correctly matches the .code64).
That matches what I see throughout the ipxe code but also throughout the archive https://codesearch.debian.net/search?q=pop%5Ba-z%5D.*%25fs&literal=0&page=2

Maybe I misread the mode it is in, or it is actually a false positives.
Or the sizes of FS/GS do not change - haven't touched them in a loooong time.
Was it that segment registers didn't change size?
I'll need to do a few checks to first see what the compiler would expect there and from there need to understand this.

The command used also points to AS being in 64 bit mode when this happens:
gcc -E -DARCH=x86_64 -DPLATFORM=efi -DSECUREBOOT=0 -fstrength-reduce -fomit-frame-pointer -falign-jumps=1 -falign-loops=1 -falign-functions=1 -m64 -mno-mmx -mno-sse -fshort-wchar -Ui386 -Ulinux -DNVALGRIND -fpie -mno-red-zone -Iinclude -I. -Iarch/x86/include -Iarch/x86_64/include -Iarch/x86_64/include/efi -Os -g -ffreestanding -Wall -W -Wformat-nonliteral -fno-stack-protector -fno-dwarf2-cfi-asm -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -Wno-address -Wno-stringop-truncation -ffunction-sections -fdata-sections -include include/compiler.h -DASM_TCHAR='@' -DASM_TCHAR_OPS='@' -DASSEMBLY -DOBJECT=gdbidt arch/x86_64/core/gdbidt.S | as --64 -o bin-x86_64-efi/gdbidt.o

Related branches

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I have an experimental fix, but I sort of do not trust it yet as I struggle to recreate the same issue with a build from git. But the failing code still is the same.
So I want to check if there as a (very different) way committed upstream to fix this.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Proper build (with that I can recreate the problem on a build from upstream git) for the x86_64 code is:

cd src
make veryclean
export CONFIG=qemu
cat <<EOF >config/local/general.h
#define ROM_BANNER_TIMEOUT 0
#define NET_PROTO_IPV6
#define DOWNLOAD_PROTO_NFS
EOF
make V=1 bin-x86_64-efi/ipxe.efi

summary: - FTBFS in Eoan - Error: operand type mismatch for `push'
+ FTBFS in Eoan - Error: operand type mismatch for `push' - gcc 9.2.1
Revision history for this message
Christian Ehrhardt  (paelzer) wrote : Re: FTBFS in Eoan - Error: operand type mismatch for `push' - gcc 9.2.1

I wanted to test that gdb stub, to ensure the fix doesn't break things.

But all our usual ipxe components don't have the gdbstub.
The only place this object file leads to is in the last component of:
  debian/util/geniso src/bin-efi/ipxe.iso src/bin/ipxe.lkrn src/bin-x86_64-efi/ipxe.efi

None of the other roms gets linked with the code that failed to build.

Booting the iso with an efi loader (to get the right one loaded):
  qemu-system-x86_64 -enable-kvm -pflash /usr/share/OVMF/OVMF_CODE.fd -hda /usr/lib/ipxe/ipxe.iso -nographic

Still has no gdbstub feature, which matches the config as we have nothing of [1] set.
We can check we booted the right thing seeing:
  Features: DNS HTTP HTTPS iSCSI NFS TFTP SRP AoE EFI Menu
If I boot via seabios instead of ovms I get the bios version on the ISO which is different.
Features: DNS HTTP HTTPS iSCSI NFS TFTP SRP AoE ELF MBOOT PXE bzImage Menu PXEXT

Also none of the objects created by ipxe has any "gdb" string or symbol in it.

So it seems that this code is compiled, but eventually unused.
Which if true means that it isn't too important how good/bad the fix is as long as it gets the build going.
I still hope for upstream feedback on it ...

[1]: http://ipxe.org/buildcfg/gdbudp

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I have had another dev try this on fedora which also has gcc 9.2.1 and it works there.

But since it fails in "as" which is actually binutils lets add that as a bug task.
Maybe binutils developers know something about this?

summary: - FTBFS in Eoan - Error: operand type mismatch for `push' - gcc 9.2.1
+ FTBFS in Eoan - Error: operand type mismatch for `push' - gcc 9.2.1 /
+ binutils 2.32.51.20190905-0ubuntu1
Matthias Klose (doko)
tags: added: ftbfs
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

FYI: The comparison to Fedora (there it still works) uses binutils 2.31.1-29.fc30.x86_64 still.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Quoting a post of a ipxe related developer that did a nice check realizing that e.g. disassembly always lists pushq (and opcode stays the same) - thanks to Valentine for these checks!:

AFAIU, segment registers
are 16-bit long, however, "pushl" and "pushq" should also be valid and
produce the same opcode.

Quite interesting that obdump -S always shows "pushq" even when
"push" or "pushl" is used in the original source.

Please, take a look at the following log:

<email address hidden>:/tmp$ cat push.S
 push %gs
 push %fs
<email address hidden>:/tmp$ as --64 push.S -o push.out
<email address hidden>:/tmp$ objdump -S push.out

push.out: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0: 0f a8 pushq %gs
   2: 0f a0 pushq %fs
<email address hidden>:/tmp$ cat pushl.S
 pushl %gs
 pushl %fs
<email address hidden>:/tmp$ as --64 pushl.S -o pushl.out
<email address hidden>:/tmp$ objdump -S pushl.out

pushl.out: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0: 0f a8 pushq %gs
   2: 0f a0 pushq %fs
<email address hidden>:/tmp$ cat pushq.S
 pushq %gs
 pushq %fs
<email address hidden>:/tmp$ as --64 pushq.S -o pushq.out
pushq.S: Assembler messages:
pushq.S:1: Error: operand type mismatch for `push'
pushq.S:2: Error: operand type mismatch for `push'

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Slightly extending on that test by Valentine I really think this is a change in binutils behavior.
Attaching a script to test which I then will add logs from Disco (older 2.32) and Eoan (2.32.51).

The script does:
- test pop/push with suffixes "" w l q
- test this in code 16/32/64 blocks

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Result Disco:
### Testing suffix '' ###

.code16
push %gs
push %fs
pop %gs
pop %fs
.code32
push %gs
push %fs
pop %gs
pop %fs
.code64
push %gs
push %fs
pop %gs
pop %fs

push.out: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0: 0f a8 pushq %gs
   2: 0f a0 pushq %fs
   4: 0f a9 popq %gs
   6: 0f a1 popq %fs
   8: 0f a8 pushq %gs
   a: 0f a0 pushq %fs
   c: 0f a9 popq %gs
   e: 0f a1 popq %fs
  10: 0f a8 pushq %gs
  12: 0f a0 pushq %fs
  14: 0f a9 popq %gs
  16: 0f a1 popq %fs

### Testing suffix 'w' ###

.code16
pushw %gs
pushw %fs
popw %gs
popw %fs
.code32
pushw %gs
pushw %fs
popw %gs
popw %fs
.code64
pushw %gs
pushw %fs
popw %gs
popw %fs

push.out: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0: 0f a8 pushq %gs
   2: 0f a0 pushq %fs
   4: 0f a9 popq %gs
   6: 0f a1 popq %fs
   8: 66 0f a8 pushw %gs
   b: 66 0f a0 pushw %fs
   e: 66 0f a9 popw %gs
  11: 66 0f a1 popw %fs
  14: 66 0f a8 pushw %gs
  17: 66 0f a0 pushw %fs
  1a: 66 0f a9 popw %gs
  1d: 66 0f a1 popw %fs

### Testing suffix 'l' ###

.code16
pushl %gs
pushl %fs
popl %gs
popl %fs
.code32
pushl %gs
pushl %fs
popl %gs
popl %fs
.code64
pushl %gs
pushl %fs
popl %gs
popl %fs
push.S: Assembler messages:
push.S:12: Error: invalid instruction suffix for `push'
push.S:13: Error: invalid instruction suffix for `push'
push.S:14: Error: invalid instruction suffix for `pop'
push.S:15: Error: invalid instruction suffix for `pop'

### Testing suffix 'q' ###

.code16
pushq %gs
pushq %fs
popq %gs
popq %fs
.code32
pushq %gs
pushq %fs
popq %gs
popq %fs
.code64
pushq %gs
pushq %fs
popq %gs
popq %fs
push.S: Assembler messages:
push.S:2: Error: unsupported instruction `push'
push.S:3: Error: unsupported instruction `push'
push.S:4: Error: unsupported instruction `pop'
push.S:5: Error: unsupported instruction `pop'
push.S:7: Error: unsupported instruction `push'
push.S:8: Error: unsupported instruction `push'
push.S:9: Error: unsupported instruction `pop'
push.S:10: Error: unsupported instruction `pop'

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Result Eoan:
### Testing suffix '' ###

.code16
push %gs
push %fs
pop %gs
pop %fs
.code32
push %gs
push %fs
pop %gs
pop %fs
.code64
push %gs
push %fs
pop %gs
pop %fs

push.out: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0: 0f a8 pushq %gs
   2: 0f a0 pushq %fs
   4: 0f a9 popq %gs
   6: 0f a1 popq %fs
   8: 0f a8 pushq %gs
   a: 0f a0 pushq %fs
   c: 0f a9 popq %gs
   e: 0f a1 popq %fs
  10: 0f a8 pushq %gs
  12: 0f a0 pushq %fs
  14: 0f a9 popq %gs
  16: 0f a1 popq %fs

### Testing suffix 'w' ###

.code16
pushw %gs
pushw %fs
popw %gs
popw %fs
.code32
pushw %gs
pushw %fs
popw %gs
popw %fs
.code64
pushw %gs
pushw %fs
popw %gs
popw %fs

push.out: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0: 0f a8 pushq %gs
   2: 0f a0 pushq %fs
   4: 0f a9 popq %gs
   6: 0f a1 popq %fs
   8: 66 0f a8 pushw %gs
   b: 66 0f a0 pushw %fs
   e: 66 0f a9 popw %gs
  11: 66 0f a1 popw %fs
  14: 66 0f a8 pushw %gs
  17: 66 0f a0 pushw %fs
  1a: 66 0f a9 popw %gs
  1d: 66 0f a1 popw %fs

### Testing suffix 'l' ###

.code16
pushl %gs
pushl %fs
popl %gs
popl %fs
.code32
pushl %gs
pushl %fs
popl %gs
popl %fs
.code64
pushl %gs
pushl %fs
popl %gs
popl %fs

push.out: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0: 66 0f a8 pushw %gs
   3: 66 0f a0 pushw %fs
   6: 66 0f a9 popw %gs
   9: 66 0f a1 popw %fs
   c: 0f a8 pushq %gs
   e: 0f a0 pushq %fs
  10: 0f a9 popq %gs
  12: 0f a1 popq %fs
  14: 0f a8 pushq %gs
  16: 0f a0 pushq %fs
  18: 0f a9 popq %gs
  1a: 0f a1 popq %fs

### Testing suffix 'q' ###

.code16
pushq %gs
pushq %fs
popq %gs
popq %fs
.code32
pushq %gs
pushq %fs
popq %gs
popq %fs
.code64
pushq %gs
pushq %fs
popq %gs
popq %fs
push.S: Assembler messages:
push.S:2: Error: unsupported instruction `push'
push.S:3: Error: unsupported instruction `push'
push.S:4: Error: unsupported instruction `pop'
push.S:5: Error: unsupported instruction `pop'
push.S:7: Error: unsupported instruction `push'
push.S:8: Error: unsupported instruction `push'
push.S:9: Error: unsupported instruction `pop'
push.S:10: Error: unsupported instruction `pop'
push.S:12: Error: operand type mismatch for `push'
push.S:13: Error: operand type mismatch for `push'
push.S:14: Error: operand type mismatch for `pop'
push.S:15: Error: operand type mismatch for `pop'

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Summary D vs E:
- no suffix
  => works equally in both releases
  => same opcodes in all .code segments
- suffix "w"
  => works equally in both releases
  => opcodes in .code32/.code64 differ from .code16 (660f..)
  => .code16 matches the non-suffix opcodes (0f..)
- suffix "l"
  => failures in Disco, works in Eoan
  => .code16 opcodes match the non-.code16 of the "w" suffix (660f..)
  => .code32/.code64 are back to the base opcode (0f..)
  => If I remove the failing .code64 from disco then .code16/.code32 is the same as in Eoan
- suffix "q"
  => different failures in Disco and Eoan
  => in Disco .code16/.code32 fails
  => in Disco .code64 generates the basic opcode (0f..)
  => in Eoan all three .code segments fail

Therefore it seems this part had major changes.
Not sure what to do, is this a bug in binutils that needs to be fixed?
Or was it a bug in IPXE that now is exposed?

I'd appreciate help by binutils-people.
@Doko when you read that you might ask some of your contacts maybe?

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Note: given that this went into the archive in early July and that nothing but this section in IPXE broke I lean toward assuming that the ipxe code is broken or at least (while it worked in the past) not implemented the way it should be for the assembly to work.
Yet I'm not feeling experienced enough to really feel confident about my suggested fix without further upstream feedback or other people backing me up.

Revision history for this message
In , Christian Ehrhardt  (paelzer) wrote :

Hi,
I have found that with recent binutils ipxe was no more compiling.
It broke on a section doing essentially (simplified) doing:

.code64
pushq %gs
pushq %fs
popq %gs
popq %fs

The fail I got was like:
push.S:2: Error: unsupported instruction `push'
push.S:3: Error: unsupported instruction `push'
push.S:4: Error: unsupported instruction `pop'
push.S:5: Error: unsupported instruction `pop'

There is more about that in Ubuntu bug https://bugs.launchpad.net/ubuntu/+source/binutils/+bug/1843394

But I'm reaching out to you here as I was wondering if it is a bug in gas (unlikely, but it is a change in behavior for sure).

I found that even older 2.32 are ok.
2.32-7ubuntu4 - ok
2.32.51.20190905-0ubuntu1 - fails.

I was wondering and hoping you could help me on three things:
1. was the change to no more build that assembly intentional and could one maybe link to a commit for that?
2. if the change was intentional is there a best practice how the code should be modified to work again?
3. I was checking the behavior on push/pop %fs/gs and it confused me so I wonder if the current state is correct (see attachment and below).

For #3 some details:
Summary D vs E:
- no suffix
  => works equally in both releases
  => same opcodes in all .code segments
- suffix "w"
  => works equally in both releases
  => opcodes in .code32/.code64 differ from .code16 (660f..)
  => .code16 matches the non-suffix opcodes (0f..)
- suffix "l"
  => failures in Disco, works in Eoan
  => .code16 opcodes match the non-.code16 of the "w" suffix (660f..)
  => .code32/.code64 are back to the base opcode (0f..)
  => If I remove the failing .code64 from disco then .code16/.code32 is the same as in Eoan
- suffix "q"
  => different failures in Disco and Eoan
  => in Disco .code16/.code32 fails
  => in Disco .code64 generates the basic opcode (0f..)
  => in Eoan all three .code segments fail
I wonder if all of that is correct?

P.S. I can break this up into multiple bugs if you tell me which section you want to discuss separately.
P.P.S. I have asked upstream IPXE for feedback for (a pure gut feeling) change, but there was no response yet (http://lists.ipxe.org/pipermail/ipxe-devel/2019-September/006763.html)

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

FYI reported upstream: https://sourceware.org/bugzilla/show_bug.cgi?id=25012 and linked in this bug.

Changed in binutils:
importance: Unknown → Medium
status: Unknown → New
Revision history for this message
In , Hjl-tools (hjl-tools) wrote :

It is caused by

commit 21df382b918888de64749e977f185c4e10a5b838
Author: Jan Beulich <email address hidden>
Date: Tue Jul 16 09:30:29 2019 +0200

    x86: fold SReg{2,3}

    They're the only exception to there generally being no mix of register
    kinds possible in an insn operand template, and there being two bits per
    operand for their representation is also quite wasteful, considering the
    low number of uses. Fold both bits and deal with the little bit of
    fallout.

    Also take the liberty and drop dead code trying to set REX_B: No segment
    register has RegRex set on it.

    Additionally I was quite surprised that PUSH/POP with the permitted
    segment registers is not covered by the test cases. Add the missing
    pieces.

Assembler disallows "popq %fs", but disassembler still display "popq %fs".

Revision history for this message
In , Christian Ehrhardt  (paelzer) wrote :

Thanks H.J. Lu,
so the change was intentional just as I thought. I'll add this to the thread [1] I have with the ipxe people then.

If there is anyone here that wants to help guiding [1] please feel free to chime in there or let me know here and I can pass the info.

[1]: http://lists.ipxe.org/pipermail/ipxe-devel/2019-September/006763.html

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

FYI: feedback indicates this is intentional and we'd need to change IPXE.
My experiments showed that the initial idea isn't good, I'll submit a V2 to IPXE on the thread already linked here.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Revision history for this message
In , Jbeulich (jbeulich) wrote :

Hmm, yes - No_qSuf gets in the way here. Simply removing the attribute won't work though, I'm afraid. I'll try to find time soon to look into this. (And I'm surprised there's nothing in the testsuite that would have allowed me to notice this before even submitting the patch.)

Changed in binutils:
status: New → Confirmed
Changed in ipxe (Ubuntu):
status: New → Triaged
importance: Undecided → High
tags: added: server-next
Changed in ipxe (Ubuntu):
assignee: nobody → Christian Ehrhardt  (paelzer)
Revision history for this message
In , Hjl-tools (hjl-tools) wrote :

Fixed for 2.34 by

commit 3f9aad111cea2f25877d0a6b404956769c14faee
Author: Jan Beulich <email address hidden>
Date: Fri Sep 20 10:18:15 2019 +0200

    x86-64: fix handling of PUSH/POP of segment register

    Commit 21df382b91 ("x86: fold SReg{2,3}") went too far: Folding 64-bit
    PUSH/POP templates into non-64-bit ones isn't correct, due to the
    different operand widths, and hence suffixes permitted. Restore the
    separate templates.

    Add tests of PUSH/POP with q suffix and %fs/%gs operands to the
    testsuite. While doing so also add PUSHF/POPF ones _without_ suffix.

and on 2.33 branch by

commit 20057c8c5e67ffdfb1a7b6a4ef3d337ea27663d1
Author: Jan Beulich <email address hidden>
Date: Fri Sep 20 10:18:15 2019 +0200

    x86-64: fix handling of PUSH/POP of segment register

    Commit 21df382b91 ("x86: fold SReg{2,3}") went too far: Folding 64-bit
    PUSH/POP templates into non-64-bit ones isn't correct, due to the
    different operand widths, and hence suffixes permitted. Restore the
    separate templates.

    Add tests of PUSH/POP with q suffix and %fs/%gs operands to the
    testsuite. While doing so also add PUSHF/POPF ones _without_ suffix.

    (cherry picked from commit 3f9aad111cea2f25877d0a6b404956769c14faee)

Changed in binutils:
status: Confirmed → Fix Released
Revision history for this message
In , Christian Ehrhardt  (paelzer) wrote :

Thank you Jan!

Changed in ipxe (Ubuntu):
assignee: Christian Ehrhardt  (paelzer) → nobody
status: Triaged → Invalid
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Now after we have understood it we know the fix is actually in binutils.
Furthermore any other third party code (or even a few other FTBFS in Eoan) could be due to that.
IMHO that is a high prio to fix before Eoan as it would not be a new bug bur a regression.
I'll set it as critical, but it is up to you having more experience in such issues.

@Doko - can we get this fixed in Eoan?

Changed in binutils (Ubuntu):
importance: Undecided → Critical
assignee: nobody → Matthias Klose (doko)
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package binutils - 2.33-1ubuntu1

---------------
binutils (2.33-1ubuntu1) eoan; urgency=medium

  * Don't generate control file entries for any native mips* packages.

binutils (2.33-1) unstable; urgency=medium

  * Binutils 2.33 release (taken from the binutils-2_33 tag).
  * Update from the binutils 2.33 branch:
    - Fix PR 24942, change objcopy's --set-section-alignment option so that it
      takes a byte alignment value rather than a power of two alignment value.
    - x86-64: fix handling of PUSH/POP of segment register. LP: #1843394.
  * Merge changes from binutils-mipsen.
  * Bump standards version.

 -- Matthias Klose <email address hidden> Tue, 08 Oct 2019 12:41:06 +0200

Changed in binutils (Ubuntu):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.