seabios missing NMI disable in rtc_mask()

Bug #1927547 reported by Heitor Alves de Siqueira
Affects | Status | Importance | Assigned to | Milestone
Ubuntu Cloud Archive | Fix Released | High | Heitor Alves de Siqueira |
Mitaka | Fix Committed | High | Heitor Alves de Siqueira |
seabios (Ubuntu) | Fix Released | High | Heitor Alves de Siqueira |
Trusty | Fix Released | High | Heitor Alves de Siqueira |
Xenial | Fix Released | High | Heitor Alves de Siqueira |

Bug Description

[Impact]
On seabios before rel-1.9.0~47, there's a bug in rtc_mask() that can cause VMs to miss interrupts and get stuck in a 'PAUSED' state due to KVM emulation errors.

While reading PORT_CMOS_DATA, an NMI can "steal" execution before the inb() call returns, which effectively leaves the guest waiting on the port read forever. This can then trigger watchdogs, and usually results in a KVM emulation error that leaves the VM in the 'PAUSED' state. Since the guest VM is broken due to the missed interrupts, the only way to recover is to restart it.

[Test Plan]
Due to the small race window between the inb() call and an incoming NMI, this issue has been hard to reproduce consistently. Our test plan involves running the fixes on a heavily overcommitted OpenStack compute host where this issue has been reported multiple times, which also lets us validate that no new regressions have been introduced.

[Where problems could occur]
The patch disables NMIs in rtc_mask(), so that it stays consistent with the other rtc_*() functions in seabios/src/hw/rtc.c. After the CMOS port access finishes and the guest resumes execution, we could see regressions with missed interrupts or NMIs not being handled if they are not re-enabled.

Since the patch is already present in all Ubuntu releases starting with Bionic and there have been no 'fixes:' tags for this patch upstream, the chance for new regressions should be fairly limited.
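
As a rough illustration of the change described in this section, here is a minimal user-space model in C. It is a sketch under assumptions, not the SeaBIOS source: fake_outb(), fake_inb(), the cmos[] array, and the rtc_mask_unpatched()/rtc_mask_patched() names are hypothetical stand-ins, and the only hardware fact relied on is that bit 7 of the CMOS index port (0x70) masks NMIs while it is set.

```c
/* Illustrative model only. fake_outb/fake_inb and cmos[] are hypothetical
 * stand-ins for real port I/O; this is not the SeaBIOS implementation. */
#include <assert.h>
#include <stdint.h>

#define PORT_CMOS_INDEX 0x70
#define PORT_CMOS_DATA  0x71
#define NMI_DISABLE_BIT 0x80   /* bit 7 of the index port masks NMIs */

static uint8_t cmos[128];      /* simulated CMOS registers */
static uint8_t cur_index;      /* register selected by the last index write */
static int nmi_masked;         /* 1 if the last index write had bit 7 set */

static void fake_outb(uint8_t val, uint16_t port)
{
    assert(port == PORT_CMOS_INDEX || port == PORT_CMOS_DATA);
    if (port == PORT_CMOS_INDEX) {
        nmi_masked = (val & NMI_DISABLE_BIT) != 0;
        cur_index = val & 0x7f;
    } else {
        cmos[cur_index] = val;
    }
}

static uint8_t fake_inb(uint16_t port)
{
    assert(port == PORT_CMOS_DATA);
    return cmos[cur_index];
}

/* Assumed pre-fix shape: the index is written as-is, so NMIs stay
 * enabled during the read-modify-write of the data port. */
void rtc_mask_unpatched(uint8_t index, uint8_t off, uint8_t on)
{
    fake_outb(index, PORT_CMOS_INDEX);
    uint8_t v = fake_inb(PORT_CMOS_DATA);
    fake_outb((uint8_t)((v & ~off) | on), PORT_CMOS_DATA);
}

/* Assumed post-fix shape: bit 7 is set on the index before it is
 * written, so NMIs are masked for the whole access. */
void rtc_mask_patched(uint8_t index, uint8_t off, uint8_t on)
{
    index |= NMI_DISABLE_BIT;
    fake_outb(index, PORT_CMOS_INDEX);
    uint8_t v = fake_inb(PORT_CMOS_DATA);
    fake_outb((uint8_t)((v & ~off) | on), PORT_CMOS_DATA);
}
```

Both variants update the selected register identically; the only difference the model tracks is whether nmi_masked is set while the data port is being accessed.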

[Other Info]
This has been fixed by the following upstream patch:
- 3156b71a535e (rtc: Disable NMI in rtc_mask()) [0]

$ git describe --contains 3156b71a535e661
rel-1.9.0~47
$ rmadison seabios -s trusty-updates,xenial,bionic
seabios | 1.7.4-4ubuntu1 | trusty-updates | source, all
seabios | 1.8.2-1ubuntu1 | xenial | source, all
seabios | 1.10.2-1ubuntu1 | bionic | source, all

Releases starting with Bionic already have this fix.

[0] https://review.coreboot.org/plugins/gitiles/seabios/+/3156b71a535e661%5E%21/#F0

Changed in seabios (Ubuntu):
status: Confirmed → Fix Released
Changed in seabios (Ubuntu Trusty):
status: New → In Progress
Changed in seabios (Ubuntu Xenial):
status: New → Confirmed
Changed in seabios (Ubuntu Trusty):
status: In Progress → Confirmed
assignee: nobody → Heitor Alves de Siqueira (halves)
Changed in seabios (Ubuntu Xenial):
assignee: nobody → Heitor Alves de Siqueira (halves)
Changed in seabios (Ubuntu Trusty):
importance: Undecided → High
Changed in seabios (Ubuntu Xenial):
importance: Undecided → High
Heitor Alves de Siqueira (halves) wrote :
Changed in seabios (Ubuntu Trusty):
status: Confirmed → In Progress
Changed in seabios (Ubuntu Xenial):
status: Confirmed → In Progress
Heitor Alves de Siqueira (halves) wrote :

Fixes are now available under the "ESM Infrastructure Security" PPA for Trusty and Xenial, according to the versions below:

Trusty -- 1.7.4-4ubuntu1+esm1
Xenial -- 1.8.2-1ubuntu1+esm1

Changed in seabios (Ubuntu Trusty):
status: In Progress → Fix Released
Changed in seabios (Ubuntu Xenial):
status: In Progress → Fix Released
Changed in cloud-archive:
status: New → Fix Released
importance: Undecided → High
assignee: nobody → Heitor Alves de Siqueira (halves)
Corey Bryant (corey.bryant) wrote : Please test proposed package

Hello Heitor, or anyone else affected,

Accepted seabios into mitaka-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:mitaka-proposed
  sudo apt-get update

Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-mitaka-needed to verification-mitaka-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-mitaka-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-mitaka-needed
Heitor Alves de Siqueira (halves) wrote :

Since we don't have a reliable test procedure for triggering the KVM emulation failures, I did basic smoke tests on VMs using seabios from mitaka-proposed. Things look good, and general NMI functionality seems to be working correctly.

I've also confirmed with affected users that this version fixed those specific instances of KVM emulation failures for them, so the fix is working as intended.

tags: added: verification-mitaka-done
removed: verification-mitaka-needed