qemu TCG in s390x mode issue with calculating HASH

Bug #1847232 reported by Ivan Warren
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Fix Released
Undecided
David Hildenbrand

Bug Description

When using go on s390x on Debian x64 (buster) (host) and debian s390x (sid) (guest) I run into the following problem :

The following occurs while trying to build a custom project :

go: <email address hidden>: Get https://proxy.golang.org/github.com/%21factom%21project/basen/@v/v0.0.0-20150613233007-fe3947df716e.mod: local error: tls: bad record MAC

Doing a git bisect I find that this problem only occurs on and after commit 08ef92d556c584c7faf594ff3af46df456276e1b

Before that commit, all works fine. Past this commit, build always fails.

Without any proof, It looks like a hash calculation bug related to using z/Arch vector facilities...

Tags: s390x
Revision history for this message
David Hildenbrand (davidhildenbrand) wrote : Re: [Bug 1847232] [NEW] qemu TCG in s390x mode issue with calculating HASH

On 08.10.19 14:11, Cornelia Huck wrote:
> On Tue, 08 Oct 2019 11:19:25 -0000
> Ivan Warren via <email address hidden> wrote:
>
>> Public bug reported:
>>
>> When using go on s390x on Debian x64 (buster) (host) and debian s390x
>> (sid) (guest) I run into the following problem :
>>
>> The following occurs while trying to build a custom project :
>>
>> go: <email address hidden>:
>> Get
>> https://proxy.golang.org/github.com/%21factom%21project/basen/@v/v0.0.0-20150613233007-fe3947df716e.mod:
>> local error: tls: bad record MAC
>>
>> Doing a git bisect I find that this problem only occurs on and after
>> commit 08ef92d556c584c7faf594ff3af46df456276e1b
>>
>> Before that commit, all works fine. Past this commit, build always
>> fails.
>
> What version are you using? Current master?
>
> Can you please share your command line?
>
>>
>> Without any proof, It looks like a hash calculation bug related to using
>> z/Arch vector facilities...
>
> Not an unreasonable guess, cc:ing David in case he has seen that before.
>

Can you reproduce with "-cpu qemu,vx=off" added to the QEMU command
line? Could be some fallout from vector instruction support. Currently
ill, will have a look when I'm feeling better.

--

Thanks,

David / dhildenb

Thomas Huth (th-huth)
tags: added: s390x
Revision history for this message
Ivan Warren (ivmn) wrote :

On 10/8/2019 3:35 PM, David Hildenbrand wrote:
> On 08.10.19 14:11, Cornelia Huck wrote:
>> On Tue, 08 Oct 2019 11:19:25 -0000
>> Ivan Warren via <email address hidden> wrote:
>>
>>> Public bug reported:
>>>
>>> When using go on s390x on Debian x64 (buster) (host) and debian s390x
>>> (sid) (guest) I run into the following problem :
>>>
>>> The following occurs while trying to build a custom project :
>>>
>>> go: <email address hidden>:
>>> Get
>>> https://proxy.golang.org/github.com/%21factom%21project/basen/@v/v0.0.0-20150613233007-fe3947df716e.mod:
>>> local error: tls: bad record MAC
>>>
>>> Doing a git bisect I find that this problem only occurs on and after
>>> commit 08ef92d556c584c7faf594ff3af46df456276e1b
>>>
>>> Before that commit, all works fine. Past this commit, build always
>>> fails.
>> What version are you using? Current master?
>>
>> Can you please share your command line?
>>
>>> Without any proof, It looks like a hash calculation bug related to using
>>> z/Arch vector facilities...
>> Not an unreasonable guess, cc:ing David in case he has seen that before.
>>
> Can you reproduce with "-cpu qemu,vx=off" added to the QEMU command
> line? Could be some fallout from vector instruction support. Currently
> ill, will have a look when I'm feeling better.

Reposted with a reply all... (sorry for the duplicates)

So it does !

My qemu command line is now (forget the odd funny networking things..)

qemu-system-s390x \
     -drive
file=DEB002.IMG.NEW,discard=unmap,cache=writeback,id=drive-0,if=none \
     -device virtio-scsi-ccw,id=virtio-scsi-0 \
     -device scsi-hd,id=scsi-hd-0,drive=drive-0 \
     -m 8G \
     -net nic,macaddr=52:54:00:00:00:02 \
     -net tap,ifname=taparm,script=no \
     -nographic -accel tcg,thread=multi \
     -monitor unix:ms,server,nowait \
     -cpu qemu,vx=off \  ##### THAT WAS ADDED as instructed - without it
everything goes kaput !
     -smp 12

And using the latest bleeding edge qemu from github, my build works (the
problem goes away).

So the z/Arch vector instructions may have a glitch is a venue to
consider.. Probably one that couldn't be screened through conventional
methods.

I'm not that versed into z/Arch vector instruction, but if there
anything I can help with, I will !

--Ivan

Revision history for this message
David Hildenbrand (davidhildenbrand) wrote :

On 08.10.19 16:11, Ivan Warren wrote:
>
> On 10/8/2019 3:35 PM, David Hildenbrand wrote:
>> On 08.10.19 14:11, Cornelia Huck wrote:
>>> On Tue, 08 Oct 2019 11:19:25 -0000
>>> Ivan Warren via <email address hidden> wrote:
>>>
>>>> Public bug reported:
>>>>
>>>> When using go on s390x on Debian x64 (buster) (host) and debian s390x
>>>> (sid) (guest) I run into the following problem :
>>>>
>>>> The following occurs while trying to build a custom project :
>>>>
>>>> go: <email address hidden>:
>>>> Get
>>>> https://proxy.golang.org/github.com/%21factom%21project/basen/@v/v0.0.0-20150613233007-fe3947df716e.mod:
>>>> local error: tls: bad record MAC
>>>>
>>>> Doing a git bisect I find that this problem only occurs on and after
>>>> commit 08ef92d556c584c7faf594ff3af46df456276e1b
>>>>
>>>> Before that commit, all works fine. Past this commit, build always
>>>> fails.
>>> What version are you using? Current master?
>>>
>>> Can you please share your command line?
>>>
>>>> Without any proof, It looks like a hash calculation bug related to using
>>>> z/Arch vector facilities...
>>> Not an unreasonable guess, cc:ing David in case he has seen that before.
>>>
>> Can you reproduce with "-cpu qemu,vx=off" added to the QEMU command
>> line? Could be some fallout from vector instruction support. Currently
>> ill, will have a look when I'm feeling better.
>
> Reposted with a reply all... (sorry for the duplicates)
>
> So it does !
>
>
> My qemu command line is now (forget the odd funny networking things..)
>
> qemu-system-s390x \
>     -drive
> file=DEB002.IMG.NEW,discard=unmap,cache=writeback,id=drive-0,if=none \
>     -device virtio-scsi-ccw,id=virtio-scsi-0 \
>     -device scsi-hd,id=scsi-hd-0,drive=drive-0 \
>     -m 8G \
>     -net nic,macaddr=52:54:00:00:00:02 \
>     -net tap,ifname=taparm,script=no \
>     -nographic -accel tcg,thread=multi \
>     -monitor unix:ms,server,nowait \
>     -cpu qemu,vx=off \  ##### THAT WAS ADDED as instructed - without it
> everything goes kaput !
>     -smp 12
>
> And using the latest bleeding edge qemu from github, my build works (the
> problem goes away).
>
> So the z/Arch vector instructions may have a glitch is a venue to
> consider.. Probably one that couldn't be screened through conventional
> methods.
>
> I'm not that versed into z/Arch vector instruction, but if there
> anything I can help with, I will !

I'll have to reproduce, can you outline the steps needed to trigger
this? (never had to build a go project before #luckyme ( ;) )). It looks
like https://github.com/FactomProject/basen is getting pulled in from
some other project?

Thanks!

--

Thanks,

David / dhildenb

Revision history for this message
David Hildenbrand (davidhildenbrand) wrote :
Download full text (4.6 KiB)

On 14.10.19 11:53, David Hildenbrand wrote:
> On 08.10.19 16:11, Ivan Warren wrote:
>>
>> On 10/8/2019 3:35 PM, David Hildenbrand wrote:
>>> On 08.10.19 14:11, Cornelia Huck wrote:
>>>> On Tue, 08 Oct 2019 11:19:25 -0000
>>>> Ivan Warren via <email address hidden> wrote:
>>>>
>>>>> Public bug reported:
>>>>>
>>>>> When using go on s390x on Debian x64 (buster) (host) and debian s390x
>>>>> (sid) (guest) I run into the following problem :
>>>>>
>>>>> The following occurs while trying to build a custom project :
>>>>>
>>>>> go: <email address hidden>:
>>>>> Get
>>>>> https://proxy.golang.org/github.com/%21factom%21project/basen/@v/v0.0.0-20150613233007-fe3947df716e.mod:
>>>>> local error: tls: bad record MAC
>>>>>
>>>>> Doing a git bisect I find that this problem only occurs on and after
>>>>> commit 08ef92d556c584c7faf594ff3af46df456276e1b
>>>>>
>>>>> Before that commit, all works fine. Past this commit, build always
>>>>> fails.
>>>> What version are you using? Current master?
>>>>
>>>> Can you please share your command line?
>>>>
>>>>> Without any proof, It looks like a hash calculation bug related to using
>>>>> z/Arch vector facilities...
>>>> Not an unreasonable guess, cc:ing David in case he has seen that before.
>>>>
>>> Can you reproduce with "-cpu qemu,vx=off" added to the QEMU command
>>> line? Could be some fallout from vector instruction support. Currently
>>> ill, will have a look when I'm feeling better.
>>
>> Reposted with a reply all... (sorry for the duplicates)
>>
>> So it does !
>>
>>
>> My qemu command line is now (forget the odd funny networking things..)
>>
>> qemu-system-s390x \
>>     -drive
>> file=DEB002.IMG.NEW,discard=unmap,cache=writeback,id=drive-0,if=none \
>>     -device virtio-scsi-ccw,id=virtio-scsi-0 \
>>     -device scsi-hd,id=scsi-hd-0,drive=drive-0 \
>>     -m 8G \
>>     -net nic,macaddr=52:54:00:00:00:02 \
>>     -net tap,ifname=taparm,script=no \
>>     -nographic -accel tcg,thread=multi \
>>     -monitor unix:ms,server,nowait \
>>     -cpu qemu,vx=off \  ##### THAT WAS ADDED as instructed - without it
>> everything goes kaput !
>>     -smp 12
>>
>> And using the latest bleeding edge qemu from github, my build works (the
>> problem goes away).
>>
>> So the z/Arch vector instructions may have a glitch is a venue to
>> consider.. Probably one that couldn't be screened through conventional
>> methods.
>>
>> I'm not that versed into z/Arch vector instruction, but if there
>> anything I can help with, I will !
>
> I'll have to reproduce, can you outline the steps needed to trigger
> this? (never had to build a go project before #luckyme ( ;) )). It looks
> like https://github.com/FactomProject/basen is getting pulled in from
> some other project?
>

I just tried with Fedora 31 Nightly using "go get"

[root@f31 ~]# go get -v -d github.com/FactomProject/factom
github.com/FactomProject/factom (download)
github.com/FactomProject/btcutil (download)
github.com/FactomProject/ed25519 (download)
github.com/FactomProject/go-bip32 (download)
github.com/FactomProject/btcutilecc (download)
package golang.org/x/crypto/ripemd160: unrecognized import...

Read more...

Revision history for this message
Ivan Warren (ivmn) wrote :

I was looking for a simpler method (without using go), but curl/wget and whatnot don't seem to have any problem.. There must be something go is doing that is different.

My guess (but ONLY a guess) is that the "go" TLS code might be doing some dynamic checking on the z/Arch facility list (STFLE ?) and do some dynamic change to the code path. Maybe a bit far fetched...

Revision history for this message
Ivan Warren (ivmn) wrote :

Since wget/curl use the openssl library - which (unless a specific engine is specified) uses the lowest common denominator - it will work even when z/Arch VX is present.. However, "golang" may be using a different technique (and may not be using openssl at all) - or invoke openssl with a specific engine if it detects z/Arch VX...

However, I have the workaround (vx=off), but I'd love to see if there is something amiss with z/Arch VX (for purely educational purpose).

Revision history for this message
David Hildenbrand (davidhildenbrand) wrote :
Download full text (5.1 KiB)

On 14.10.19 12:22, David Hildenbrand wrote:
> On 14.10.19 11:53, David Hildenbrand wrote:
>> On 08.10.19 16:11, Ivan Warren wrote:
>>>
>>> On 10/8/2019 3:35 PM, David Hildenbrand wrote:
>>>> On 08.10.19 14:11, Cornelia Huck wrote:
>>>>> On Tue, 08 Oct 2019 11:19:25 -0000
>>>>> Ivan Warren via <email address hidden> wrote:
>>>>>
>>>>>> Public bug reported:
>>>>>>
>>>>>> When using go on s390x on Debian x64 (buster) (host) and debian s390x
>>>>>> (sid) (guest) I run into the following problem :
>>>>>>
>>>>>> The following occurs while trying to build a custom project :
>>>>>>
>>>>>> go: <email address hidden>:
>>>>>> Get
>>>>>> https://proxy.golang.org/github.com/%21factom%21project/basen/@v/v0.0.0-20150613233007-fe3947df716e.mod:
>>>>>> local error: tls: bad record MAC
>>>>>>
>>>>>> Doing a git bisect I find that this problem only occurs on and after
>>>>>> commit 08ef92d556c584c7faf594ff3af46df456276e1b
>>>>>>
>>>>>> Before that commit, all works fine. Past this commit, build always
>>>>>> fails.
>>>>> What version are you using? Current master?
>>>>>
>>>>> Can you please share your command line?
>>>>>
>>>>>> Without any proof, It looks like a hash calculation bug related to using
>>>>>> z/Arch vector facilities...
>>>>> Not an unreasonable guess, cc:ing David in case he has seen that before.
>>>>>
>>>> Can you reproduce with "-cpu qemu,vx=off" added to the QEMU command
>>>> line? Could be some fallout from vector instruction support. Currently
>>>> ill, will have a look when I'm feeling better.
>>>
>>> Reposted with a reply all... (sorry for the duplicates)
>>>
>>> So it does !
>>>
>>>
>>> My qemu command line is now (forget the odd funny networking things..)
>>>
>>> qemu-system-s390x \
>>>     -drive
>>> file=DEB002.IMG.NEW,discard=unmap,cache=writeback,id=drive-0,if=none \
>>>     -device virtio-scsi-ccw,id=virtio-scsi-0 \
>>>     -device scsi-hd,id=scsi-hd-0,drive=drive-0 \
>>>     -m 8G \
>>>     -net nic,macaddr=52:54:00:00:00:02 \
>>>     -net tap,ifname=taparm,script=no \
>>>     -nographic -accel tcg,thread=multi \
>>>     -monitor unix:ms,server,nowait \
>>>     -cpu qemu,vx=off \  ##### THAT WAS ADDED as instructed - without it
>>> everything goes kaput !
>>>     -smp 12
>>>
>>> And using the latest bleeding edge qemu from github, my build works (the
>>> problem goes away).
>>>
>>> So the z/Arch vector instructions may have a glitch is a venue to
>>> consider.. Probably one that couldn't be screened through conventional
>>> methods.
>>>
>>> I'm not that versed into z/Arch vector instruction, but if there
>>> anything I can help with, I will !
>>
>> I'll have to reproduce, can you outline the steps needed to trigger
>> this? (never had to build a go project before #luckyme ( ;) )). It looks
>> like https://github.com/FactomProject/basen is getting pulled in from
>> some other project?
>>
>
> I just tried with Fedora 31 Nightly using "go get"
>
> [root@f31 ~]# go get -v -d github.com/FactomProject/factom
> github.com/FactomProject/factom (download)
> github.com/FactomProject/btcutil (download)
> github.com/FactomProject/ed25519 (download)
> github.c...

Read more...

Changed in qemu:
assignee: nobody → David Hildenbrand (davidhildenbrand)
status: New → In Progress
Changed in qemu:
status: In Progress → Fix Committed
Revision history for this message
Ivan Warren (ivmn) wrote :

Looks fixed to me ! The issue no longer shows even without specifying vx=off

Revision history for this message
David Hildenbrand (davidhildenbrand) wrote : Re: [Bug 1847232] Re: qemu TCG in s390x mode issue with calculating HASH

On 23.10.19 11:37, Ivan Warren wrote:
> Looks fixed to me ! The issue no longer shows even without specifying
> vx=off
>

Nice, I suspect that there might be more issues when using golang (as it
really makes excessive use of vector registers to my surprise). So in
case you run into problems (and especially can't reproduce with vx=off),
please inform me!

Thanks!

--

Thanks,

David / dhildenb

Revision history for this message
Ivan Warren (ivmn) wrote :

I will exercise this thoroughly ! The go application involved is itself a blockchain signing/verification application, so I suspect it will be a good exercise (codenotary.io)

Thanks for looking into this and fixing this !

--Ivan

Revision history for this message
Thomas Huth (th-huth) wrote :

I assume David's fixes have been released with QEMU v4.2, so I'm closing this ticket now.

Changed in qemu:
status: Fix Committed → Fix Released
Revision history for this message
John Boero (boeroboy) wrote :

Thank you so much! I've been hitting the same issue with s390x.

Maybe an endian issue with hardware vx?

Revision history for this message
John Boero (boeroboy) wrote :

Also affects RISC-V apparently. https://github.com/moby/moby/issues/40240

Revision history for this message
carlosedp (carlosedp) wrote :

Tried applying the -cpu parameter into Qemu 4.1.1 but I got:

qemu-system-riscv64: unable to find CPU model 'qemu'

My command is:

qemu-system-riscv64 \
    -machine virt \
    -cpu qemu,vx=off \
    -nographic \
    -smp 4 \
    -m 4G \
    -bios fw_payload.bin \
    -device virtio-blk-device,drive=hd0 \
    -object rng-random,filename=/dev/urandom,id=rng0 \
    -device virtio-rng-device,rng=rng0 \
    -drive file=riscv64-debianrootfs-qemu.qcow2,format=qcow2,id=hd0 \
    -device virtio-net-device,netdev=usernet \
    -netdev user,id=usernet,hostfwd=tcp::22222-:22"

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.