Comment 10 for bug 1641532

Christian Ehrhardt  (paelzer) wrote :

Sorry this took so long, but I'm happy that I finally found the right people to talk to about it. I'm summarizing the issue once more and then subscribing a number of people so that the right call can be made for the UCA on this.

Affected:
- Trusty type guests in Liberty/Mitaka/Xenial/Yakkety/Zesty
- Utopic type guests in Liberty/Mitaka/Xenial

Status today - a forward migration fails in some combinations but not in others.
- Trusty as-is (2.0) to Xenial works even with the bug in place (accidentally/fortunately)
- Trusty + UCA-Liberty (2.3) to Xenial fails due to the issue (reported case).
- other combinations have to be evaluated

Cause:
- the way machine types are defined changed several times in qemu over the updates from Trusty to Xenial
- It seems that an error was introduced when porting the machine types to Wily
- This issue got carried forward when the conversion to Xenial was made
- Because of that these types are ambiguous: they inherit the current qemu's version
- So instead of Trusty always being 2.0 as it should be, it is 2.x, matching the host's qemu version
- The same applies to the utopic type
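
The inheritance described in this list can be illustrated with a small Python model. This is only a sketch and not qemu code: the names `PINNED` and `effective_version` are hypothetical, and the only facts taken from this comment are the intended levels (trusty 2.0, utopic 2.1) and the host versions (Trusty 2.0, Liberty 2.3, Xenial 2.5).

```python
# Hypothetical model of how a release-named machine type resolves to a
# qemu compatibility level. Not qemu code; names are made up for illustration.

# Levels the release-named types are supposed to be pinned to.
PINNED = {"trusty": "2.0", "utopic": "2.1"}


def effective_version(machine_type, host_qemu, buggy):
    """Return the compat level a guest of `machine_type` gets on a host."""
    if buggy:
        # With the bug the type is not pinned and inherits the host's version.
        return host_qemu
    return PINNED[machine_type]


# A trusty-type guest started on hosts with different qemu versions.
for host in ("2.0", "2.3", "2.5"):
    print(host,
          effective_version("trusty", host, buggy=True),
          effective_version("trusty", host, buggy=False))
```

In this model the buggy column follows the host version while the fixed column stays at 2.0, which is the difference the list above describes.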

Challenges on fixing that:
- Testing showed that fixing on Xenial did not help, especially for the reported case: after the fix it expected 2.1 (utopic) instead of 2.5, but the source was Liberty, which is 2.3
- It seems the fix is needed on the source of the migration as well
- In general to "pick up" the fix you need to (re-)start the guest
- This has the additional impact that the guest-visible virtual hardware will change. For example, once we fix Xenial/Liberty to expect trusty to be 2.0 as it should be, any guest that was started as 2.5 type before will restart as 2.0 type, which could cause further issues and makes that fix very critical. Going down in versions in particular might be critical.
- Since Trusty->Xenial works fine, we might consider fixing only UCA-Liberty and so avoid these new issues by fixing the old one.
- There might be other, even more complex approaches, but they would be equally more error prone; we have to decide what we need here
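
To make the restart concern more concrete, here is a rough Python sketch that lists which guests would come back with a different level once a fix is installed and they are restarted. It is an assumption-laden illustration: the inventory, the guest names, and the `restart_changes` helper are all made up, and only the intended levels (trusty 2.0, utopic 2.1) come from this comment.

```python
# Hypothetical sketch: which running guests would see different virtual
# hardware after the fix is installed and the guest is restarted.

PINNED = {"trusty": "2.0", "utopic": "2.1"}  # intended levels per this comment


def restart_changes(guests):
    """Yield (name, running_level, level_after_restart) for affected guests."""
    for name, machine_type, running_level in guests:
        fixed_level = PINNED.get(machine_type)
        if fixed_level is not None and fixed_level != running_level:
            yield name, running_level, fixed_level


# Made-up inventory: (guest name, machine type, level it was started with).
guests = [
    ("g1", "trusty", "2.5"),  # started on a Xenial host with the bug
    ("g2", "trusty", "2.0"),  # started on a Trusty host
    ("g3", "utopic", "2.3"),  # started on a Liberty host with the bug
]

for name, before, after in restart_changes(guests):
    print(f"{name}: {before} -> {after}")
```

Only guests whose running level differs from the intended one show up, so a list like this could help judge how many users a given fix would affect.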

Actions that are to be evaluated:
- Create a fix for Liberty (and possibly Kilo)
- Test if that fixes it without restarting a guest (unlikely but might help and would be great)
- Test if that fixes it with restarting the guest (that should work for sure at least)
- After fixing that more immediate issue, we have to think about preparing >=Xenial with the lowest impact to users from restarting into lower machine type levels or from later migration.