QEMU >= 5.0.0 with -accel tcg uses a tb-size of 1GB causing OOM issues in CI

Bug #1949606 reported by Lee Yarwood
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Lee Yarwood
Milestone: (none listed)

Bug Description

This is a Nova tracker for a set of issues seen in OpenStack CI jobs using QEMU >= 5.0.0, caused by the following change in defaults within QEMU:

https://github.com/qemu/qemu/commit/600e17b26

https://gitlab.com/qemu-project/qemu/-/issues/693

At present, most of the impacted jobs work around the issue with an increased amount of swap and lower Tempest concurrency settings, for example for CentOS 8 Stream:

https://review.opendev.org/c/openstack/devstack/+/803706

https://review.opendev.org/c/openstack/tempest/+/797614

Longer term, a libvirt RFE has been raised to allow Nova to control the size of the cache:

https://gitlab.com/libvirt/libvirt/-/issues/229

Tags: gate-failure
Lee Yarwood (lyarwood)
description: updated
Sylvain Bauza (sylvain-bauza) wrote :

Setting this to Confirmed, given we have already triaged it.

Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
yatin (yatinkarel) wrote :

Something like the following is required, with an option to make the tb-cache size configurable:

# nova/virt/libvirt/config.py
class LibvirtConfigGuestFeatureTCG(LibvirtConfigGuestFeature):

    def __init__(self, **kwargs):
        super().__init__("tcg", **kwargs)

    def format_dom(self):
        root = super().format_dom()
        # Hard-coded to 32 MiB here; this should come from a config option.
        root.append(self._text_node("tb-cache", "32", unit="MiB"))

        return root

# nova/virt/libvirt/driver.py
# virt_type 'qemu' means plain emulation (TCG). Compare with ==, because
# `in ('qemu')` is a substring test against a string, not a tuple lookup.
if CONF.libvirt.virt_type == 'qemu':
    guest.add_feature(vconfig.LibvirtConfigGuestFeatureTCG())
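
For illustration, the XML that such a feature class is meant to emit can be sketched with the standard library alone. This is a minimal sketch, not Nova's actual implementation: the helper names `build_tcg_feature` and `features_xml` are hypothetical, and the element layout (a `tcg` element under `features` with a `tb-cache` child carrying a `unit` attribute) is assumed from the snippet above and the libvirt RFE.

```python
import xml.etree.ElementTree as ET


def build_tcg_feature(tb_cache_mib):
    """Return a <tcg> element with a <tb-cache> child sized in MiB.

    Hypothetical helper: mirrors what format_dom() in the snippet
    above is expected to produce, without depending on Nova.
    """
    if tb_cache_mib <= 0:
        raise ValueError("tb-cache size must be a positive number of MiB")
    tcg = ET.Element("tcg")
    tb_cache = ET.SubElement(tcg, "tb-cache", unit="MiB")
    tb_cache.text = str(tb_cache_mib)
    return tcg


def features_xml(tb_cache_mib=32):
    """Serialize a <features> element containing the TCG feature."""
    features = ET.Element("features")
    features.append(build_tcg_feature(tb_cache_mib))
    return ET.tostring(features, encoding="unicode")


if __name__ == "__main__":
    # The size would come from a configuration option rather than a literal.
    print(features_xml(32))
```

Passing the size as a parameter, rather than hard-coding "32", is the part that a configurable option would hook into.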

Changed in nova:
status: Confirmed → In Progress
yatin (yatinkarel) wrote :
Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 28.0.0.0rc1

This issue was fixed in the openstack/nova 28.0.0.0rc1 release candidate.
