nova-compute fails with "TypeError: Parameterized generics cannot be used with class" with python 3.6 and libvirt-python 6.8.0

Bug #1901383 reported by melanie witt on 2020-10-25
This bug affects 6 people
Affects                     Importance  Assigned to
OpenStack Compute (nova)    Critical    Lee Yarwood
Train                       High        Lee Yarwood
Ussuri                      High        Lee Yarwood
Victoria                    High        Lee Yarwood

Bug Description

Our gate jobs running python 3.6 and libvirt-python 6.8.0 are currently failing because when we call libvirt methods, the following error is raised [1]:

Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service [-] Error starting thread.: TypeError: Parameterized generics cannot be used with class or instance checks
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service Traceback (most recent call last):
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/usr/local/lib/python3.6/dist-packages/oslo_service/service.py", line 807, in run_service
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service service.start()
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/opt/stack/nova/nova/service.py", line 159, in start
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service self.manager.init_host()
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/opt/stack/nova/nova/compute/manager.py", line 1409, in init_host
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service self.driver.init_host(host=self.host)
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 600, in init_host
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service self._do_quality_warnings()
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 577, in _do_quality_warnings
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service caps = self._host.get_capabilities()
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/opt/stack/nova/nova/virt/libvirt/host.py", line 710, in get_capabilities
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service xmlstr = self.get_connection().getCapabilities()
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/opt/stack/nova/nova/virt/libvirt/host.py", line 511, in get_connection
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service conn = self._get_connection()
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/opt/stack/nova/nova/virt/libvirt/host.py", line 494, in _get_connection
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service {'msg': ex})
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
Oct 23 12:12:21.454426 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service self.force_reraise()
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service six.reraise(self.type_, self.value, self.tb)
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service raise value
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/opt/stack/nova/nova/virt/libvirt/host.py", line 483, in _get_connection
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service self._wrapped_conn = self._get_new_connection()
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/opt/stack/nova/nova/virt/libvirt/host.py", line 437, in _get_new_connection
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service wrapped_conn = self._connect(self._uri, self._read_only)
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/opt/stack/nova/nova/virt/libvirt/host.py", line 288, in _connect
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service return self._libvirt_proxy.openAuth(uri, auth, flags)
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 190, in doit
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service result = proxy_call(self._autowrap, f, *args, **kwargs)
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 149, in proxy_call
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service if isinstance(rv, autowrap):
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/usr/lib/python3.6/typing.py", line 1162, in __instancecheck__
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service return issubclass(instance.__class__, self)
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service File "/usr/lib/python3.6/typing.py", line 1148, in __subclasscheck__
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service raise TypeError("Parameterized generics cannot be used with class "
Oct 23 12:12:21.456615 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service TypeError: Parameterized generics cannot be used with class or instance checks
Oct 23 12:12:21.459391 ubuntu-bionic-limestone-regionone-0020957795 nova-compute[13578]: ERROR oslo_service.service

The libvirt-python package version was recently bumped in upper-constraints from version 6.6.0 to 6.8.0:

https://review.opendev.org/750084

and the tempest job on that change ^ passed running python 3.8.

But then our jobs running python 3.6 immediately began to fail.

There appears to be some kind of incompatibility between python 3.6 and libvirt-python 6.8.0.

I had a look through the libvirt-python repo to see what changed between 6.6.0 and 6.8.0 and can't tell what's the root cause. We get the TypeError raised in nova-compute when we call the libvirt openAuth method but nothing in this diff [2] looks problematic?

Not sure what we can do other than revert the upper-constraints bump for libvirt-python and go back to 6.6.0.

[1] https://zuul.opendev.org/t/openstack/build/20f1d309663347c28112b711b82b9c03/log/controller/logs/screen-n-cpu.txt?severity=4#861
[2] https://github.com/libvirt/libvirt-python/compare/v6.6.0...v6.8.0#diff-a3d0475c2dfecd386f36307bc70befd9d5bd69daceadc61a5589ee86413d04c7L91

Lee Yarwood (lyarwood) on 2020-10-26
Changed in nova:
status: New → Confirmed
assignee: nobody → melanie witt (melwitt)
Daniel Berrange (berrange) wrote :

The big change in 6.8.0 was that libvirt added python type hinting for almost all its APIs.

For the openAuth method, the change is this:

-def openAuth(uri, auth, flags=0):
+def openAuth(uri: str, auth: List, flags: int = 0) -> 'virConnect':

but nova should be matching that typing signature from what I see.

I wonder if there's a problem with eventlet tpool proxy stuff causing bad interactions with typing ?

Daniel Berrange (berrange) wrote :

The error message comes from the typing.py module in the Python standard library. However, this specific error message was removed in python 3.7.0, so it only exists in the python 3.6.x series.
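
For illustration, here is a minimal sketch of how an isinstance() check against a tuple of types fails once that tuple contains a subscripted typing alias, which is the kind of check shown at the bottom of the traceback. The helper name check_against is hypothetical and not part of nova or eventlet. On python 3.6 the message reads "Parameterized generics ..."; later versions still raise TypeError but word it differently.

```python
from typing import Any, Dict


def check_against(candidates):
    """Mimic an isinstance() check against a tuple of candidate types."""
    try:
        return isinstance(object(), candidates)
    except TypeError as exc:
        # Raised by typing when a subscripted alias is used in the check.
        return "TypeError: %s" % exc


# Only real classes in the tuple: the check completes normally.
print(check_against((int, str)))

# A subscripted typing alias in the tuple: the check raises TypeError.
print(check_against((int, Dict[str, Any])))
```

The failure only shows up when the value being checked does not match an earlier entry in the tuple, because isinstance() stops at the first match.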

Daniel Berrange (berrange) wrote :

I've done some tests with openAuth() and confirmed that use of tpool proxy breaks type hinting on python 3.6.

Daniel Berrange (berrange) wrote :

I believe the problem is actually this bit of code:

  https://github.com/openstack/nova/blob/master/nova/virt/libvirt/host.py#L138

It attempts to get a list of libvirt classes via inspection. With the introduction of type checking to libvirt, there are a bunch of private global variables that are pointing to classes. This means Nova is extracting a bunch of classes that it shouldn't ever look at.

Check this code:

import inspect
import libvirt

classes = inspect.getmembers(libvirt, inspect.isclass)
for cls in classes:
    # Each entry is a (name, class) tuple.
    print("Seen %s" % str(cls))

It prints:

Seen ('Callable', typing.Callable)
Seen ('Dict', typing.Dict)
Seen ('List', typing.List)
Seen ('TracebackType', <class 'traceback'>)
Seen ('Tuple', typing.Tuple)
Seen ('Type', typing.Type)
Seen ('TypeVar', typing.TypeVar)
Seen ('_BlkioParameter', typing.Dict[str, typing.Any])
Seen ('_DomainCB', typing.Callable[[_ForwardRef('virConnect'), _ForwardRef('virDomain'), int, int, ~_T], typing.Union[int, NoneType]])
Seen ('_EventAddHandleFunc', typing.Callable[[int, int, typing.Callable[[int, int, int, ~_T], NoneType], ~_T], int])
Seen ('_EventAddTimeoutFunc', typing.Callable[[int, typing.Callable[[int, ~_T], NoneType], ~_T], int])
Seen ('_EventCB', typing.Callable[[int, int, int, ~_T], NoneType])
Seen ('_EventRemoveHandleFunc', typing.Callable[[int], int])
Seen ('_EventRemoveTimeoutFunc', typing.Callable[[int], int])
Seen ('_EventUpdateHandleFunc', typing.Callable[[int, int], NoneType])
Seen ('_EventUpdateTimeoutFunc', typing.Callable[[int, int], NoneType])
Seen ('_MemoryParameter', typing.Dict[str, typing.Any])
Seen ('_SchedParameter', typing.Dict[str, typing.Any])
Seen ('_TimerCB', typing.Callable[[int, ~_T], NoneType])
Seen ('_TypedParameter', typing.Dict[str, typing.Any])
Seen ('libvirtError', <class 'libvirt.libvirtError'>)
Seen ('virConnect', <class 'libvirt.virConnect'>)
Seen ('virDomain', <class 'libvirt.virDomain'>)
Seen ('virDomainCheckpoint', <class 'libvirt.virDomainCheckpoint'>)
Seen ('virDomainSnapshot', <class 'libvirt.virDomainSnapshot'>)
Seen ('virInterface', <class 'libvirt.virInterface'>)
Seen ('virNWFilter', <class 'libvirt.virNWFilter'>)
Seen ('virNWFilterBinding', <class 'libvirt.virNWFilterBinding'>)
Seen ('virNetwork', <class 'libvirt.virNetwork'>)
Seen ('virNetworkPort', <class 'libvirt.virNetworkPort'>)
Seen ('virNodeDevice', <class 'libvirt.virNodeDevice'>)
Seen ('virSecret', <class 'libvirt.virSecret'>)
Seen ('virStoragePool', <class 'libvirt.virStoragePool'>)
Seen ('virStorageVol', <class 'libvirt.virStorageVol'>)
Seen ('virStream', <class 'libvirt.virStream'>)

Nova should only be looking at wrapping classes beginning with "vir".

I believe if nova changes:

        return tuple([cls[1] for cls in classes if cls[0] != 'libvirtError'])

to

        return tuple([cls[1] for cls in classes if cls[0].startswith("vir")])

then the bug should go away.
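
A minimal sketch of that name-based filter follows. It assumes a hypothetical stand-in module (fake_libvirt) so that it runs without libvirt-python installed; the function name public_vir_classes is also made up for this example and is not the nova code.

```python
import inspect
import types


def public_vir_classes(module):
    """Return the classes in `module` whose names start with "vir"."""
    members = inspect.getmembers(module, inspect.isclass)
    # getmembers() yields (name, object) pairs, so filter on the name.
    return tuple(cls for name, cls in members if name.startswith("vir"))


# Hypothetical stand-in for the libvirt module (an assumption for this sketch).
fake_libvirt = types.ModuleType("fake_libvirt")
fake_libvirt.virConnect = type("virConnect", (), {})
fake_libvirt.virDomain = type("virDomain", (), {})
fake_libvirt.libvirtError = type("libvirtError", (Exception,), {})
fake_libvirt.TracebackType = types.TracebackType
fake_libvirt._Helper = type("_Helper", (), {})

# Prints ['virConnect', 'virDomain']
print([cls.__name__ for cls in public_vir_classes(fake_libvirt)])
```

Filtering on the "vir" prefix drops libvirtError as well as any imported or private names, so no separate exclusion for libvirtError is needed.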

Kashyap Chamarthy (kashyapc) wrote :

Thanks for the analysis, Dan.

Seems like the original change of extracting the libvirt classes was introduced as part of this change:

     https://opendev.org/openstack/nova/commit/36ee9c1913
     libvirt: Fix service-wide pauses caused by un-proxied libvirt calls
     (2019-08-21)

Fix proposed to branch: master
Review: https://review.opendev.org/759831

Changed in nova:
assignee: melanie witt (melwitt) → Lee Yarwood (lyarwood)
status: Confirmed → In Progress

Reviewed: https://review.opendev.org/759831
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0d2ca53bb86b8e4a3c44855cb5ef57f223462543
Submitter: Jens Harbott (frickler) (<email address hidden>)
Branch: master

commit 0d2ca53bb86b8e4a3c44855cb5ef57f223462543
Author: Lee Yarwood <email address hidden>
Date: Tue Oct 27 08:44:59 2020 +0000

    libvirt: Only ask tpool.Proxy to autowrap vir* classes

    I668643c836d46a25df46d4c99a973af5e50a39db attempted to fix service wide
    pauses by providing a more complete list of classes to tpool.Proxy.

    While this excluded libvirtError it can include internal libvirt-python
    classes pointed to by private globals that have been introduced with the
    use of type checking within the module.

    Any attempt to wrap these internal classes will result in the failure
    seen in bug #1901383. As a result this change simply ignores any class
    found during inspection that doesn't start with the `vir` string, used
    by libvirt to denote public methods and classes.

    Closes-Bug: #1901383
    Co-Authored-By: Daniel Berrange <email address hidden>
    Change-Id: I568b0c4fd6069b9118ff116532f14abb46cc42ab

Changed in nova:
status: In Progress → Fix Released

Reviewed: https://review.opendev.org/761222
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=048a3337a8c98ec8fc138083376430ffb9027f67
Submitter: Zuul
Branch: stable/victoria

commit 048a3337a8c98ec8fc138083376430ffb9027f67
Author: Lee Yarwood <email address hidden>
Date: Tue Oct 27 08:44:59 2020 +0000

    libvirt: Only ask tpool.Proxy to autowrap vir* classes

    I668643c836d46a25df46d4c99a973af5e50a39db attempted to fix service wide
    pauses by providing a more complete list of classes to tpool.Proxy.

    While this excluded libvirtError it can include internal libvirt-python
    classes pointed to by private globals that have been introduced with the
    use of type checking within the module.

    Any attempt to wrap these internal classes will result in the failure
    seen in bug #1901383. As a result this change simply ignores any class
    found during inspection that doesn't start with the `vir` string, used
    by libvirt to denote public methods and classes.

    Closes-Bug: #1901383
    Co-Authored-By: Daniel Berrange <email address hidden>
    Change-Id: I568b0c4fd6069b9118ff116532f14abb46cc42ab
    (cherry picked from commit 0d2ca53bb86b8e4a3c44855cb5ef57f223462543)

tags: added: in-stable-victoria

This issue was fixed in the openstack/nova 21.2.0 release.

This issue was fixed in the openstack/nova 20.6.0 release.

This issue was fixed in the openstack/nova 23.0.0.0rc1 release candidate.
