edk2 autopkgtest spotted a real issue, still we need to mitigate it for now
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
edk2 (Ubuntu) | Fix Released | Undecided | dann frazier |
qemu (Ubuntu) | Won't Fix | Undecided | Unassigned |
Bug Description
Hi,
As Dannf and I have already discussed on IRC, this is a bit convoluted, so let me outline what it is about.
Qemu has landed a patch [1] upstream that fixes serious issues in emulation and dword access.
Because the fix is important, Debian has pulled it into qemu 7.2 (upstream has not yet released a version containing it; it is only in the main branch).
Sadly, the patch causes a slowdown in TCG emulation on s390x (it needs to be big endian) when running with more virtual cpus (two) than real ones (one).
That issue is reported upstream [2], but since it is an edge case it might not get a fast resolution (if any). In general, anything using more vcpus than real cpus is more or less unsupported, so I'm not sure we can expect much.
Sadly, the edk2 tests meet all of the conditions:
- #1 s390x - tests on every architecture except s390x are fine; I tested arm and power, and they do not expose the slowdown caused by [1][2].
- #2 host cpus - The autopkgtest of edk2 runs with the default of 1 vcpu in the Ubuntu autopkgtest infrastructure.
- #3 guest vcpus - edk2 tests use Qemu.QemuCommand from debian/
- #4 timing - the test usually runs in ~5-6 seconds, but with the problem exposed it takes ~55-65 seconds, which hits the timeout in the edk2 tests (set to 60s)
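
To make the combination of conditions concrete, here is a minimal sketch of the two checks involved: whether the guest asks for more vcpus than the host provides, and whether a given runtime reaches the 60s timeout. The helper names are hypothetical and are not taken from the edk2 test code; only the numbers come from the description above.

```python
import os

# Timeout value taken from the report; everything else here is illustrative.
TEST_TIMEOUT_S = 60


def is_oversubscribed(guest_vcpus, host_cpus=None):
    """Return True when the guest asks for more vcpus than the host has."""
    if host_cpus is None:
        # os.cpu_count() may return None, so fall back to a single cpu.
        host_cpus = os.cpu_count() or 1
    return guest_vcpus > host_cpus


def hits_timeout(runtime_s, timeout_s=TEST_TIMEOUT_S):
    """Return True when a run of runtime_s seconds reaches the timeout."""
    return runtime_s >= timeout_s


if __name__ == "__main__":
    # Two guest vcpus on a one-cpu autopkgtest host.
    print(is_oversubscribed(guest_vcpus=2, host_cpus=1))
    # Usual runtime versus the slow range seen on s390x.
    print([hits_timeout(t) for t in (6, 55, 65)])
```

With runtimes spread over ~55-65 seconds, some runs land below the timeout and some above it, which is why retries can pass by chance.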
We have many options now (listed from worst to best):
- #1 we are just barely at the 60s timeout, so we could keep hitting retry until it passes by chance at some point
- #2 wait until qemu has a patch and apply it, but it is uncertain whether that will happen in time
- #3 we could modify src:edk2 to bump the test timeout 60 -> 120, which would mask the issue
- #4 if we need -smp 2 for any of the edk2 tests, then we should add edk2 to big_packages [3] to ensure we never hit vcpus > host cpus (which could also cause other pain elsewhere)
- #5 we could modify src:edk2 to use '-smp 1,sockets=
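
As an illustration of the idea behind options #4 and #5 (never request more vcpus than the host has), a test harness could cap the value it passes to `-smp`. This is only a sketch under assumptions: `clamp_vcpus` and `smp_args` are hypothetical helpers, not part of the edk2 packaging or of Qemu.QemuCommand, and the exact `-smp` arguments the tests should use are not settled here.

```python
import os


def clamp_vcpus(requested, host_cpus=None):
    """Cap the requested vcpu count at the number of host cpus (minimum 1)."""
    if host_cpus is None:
        # os.cpu_count() may return None, so fall back to a single cpu.
        host_cpus = os.cpu_count() or 1
    return max(1, min(requested, host_cpus))


def smp_args(requested, host_cpus=None):
    """Build the '-smp N' argument pair for a qemu command line."""
    return ["-smp", str(clamp_vcpus(requested, host_cpus))]


if __name__ == "__main__":
    # On a one-cpu autopkgtest host a request for two vcpus is reduced to one.
    print(smp_args(2, host_cpus=1))
    # On a larger host the request is passed through unchanged.
    print(smp_args(2, host_cpus=4))
```

Capping silently changes what the test exercises, so a test that really depends on two vcpus would still need a bigger host instead.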
@Dannf - I think I'd want your input to pick what we should do, so WDYT?
[1]: https:/
[2]: https:/
[3]: https:/
Added a qemu task and update-excuse so people will find it more easily.