virt/libvirt/host.py: _test_connection is not reliable with modularized libvirtd services

Bug #1997216 reported by Jaroslav Pulchart
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
OpenStack Compute (nova) | Expired | Undecided | Unassigned |

Bug Description

Description
===========

The _test_connection function in virt/libvirt/host.py of the OpenStack Nova service is not reliable with modularized libvirt daemons: it cannot re-establish the connection if one of the libvirt modules is restarted.

Modularized libvirt uses multiple daemons, such as:
  ...
  virtnodedevd
  virtnetworkd
  virtsecretd
  ...

If one of them is restarted, nova-compute still thinks the connection is working. However, any later call that uses that particular module fails.

In the nova-compute service log we can see:

internal error: client socket is closed: libvirt.libvirtError: internal error: client socket is closed
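
A toy Python model of the failure mode described above; it is not nova or libvirt code. It assumes that the health check only exercises the hypervisor daemon, so a restart of virtnodedevd goes unnoticed. That assumption is inferred from the symptom and is not confirmed here. Every class and function name below is a hypothetical stand-in.

```python
class SocketClosedError(Exception):
    """Stand-in for libvirt.libvirtError ('client socket is closed')."""


class FakeModularConnection:
    """Hypothetical model: one client connection backed by several daemons."""

    def __init__(self):
        # One logical socket per daemon; True means the socket is still open.
        self.open_sockets = {"virtqemud": True, "virtnodedevd": True}

    def restart_daemon(self, name):
        # Restarting a daemon closes only the socket that points at it.
        self.open_sockets[name] = False

    def _call(self, daemon, result):
        if not self.open_sockets[daemon]:
            raise SocketClosedError("internal error: client socket is closed")
        return result

    def get_lib_version(self):
        # Assumption: answered without touching virtnodedevd.
        return self._call("virtqemud", 8000000)

    def list_node_devices(self):
        return self._call("virtnodedevd", ["pci_0000_00_00_0"])


def narrow_probe(conn):
    """Health check that exercises a single daemon only."""
    try:
        conn.get_lib_version()
        return True
    except SocketClosedError:
        return False


def broad_probe(conn):
    """Health check that also exercises the node device daemon."""
    try:
        conn.get_lib_version()
        conn.list_node_devices()
        return True
    except SocketClosedError:
        return False


conn = FakeModularConnection()
conn.restart_daemon("virtnodedevd")
print(narrow_probe(conn), broad_probe(conn))
```

In this model the narrow probe keeps reporting a healthy connection after the restart, while the broad probe notices the closed socket.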

Steps to reproduce
==================

1/ Use OpenStack with modularized libvirt daemons, as on CentOS Stream 9

2/ Gracefully restart virtnodedevd.service

   $ systemctl restart virtnodedevd.service

3/ Check the OpenStack logs for the connection error (it repeats every 30s)

   "internal error: client socket is closed: libvirt.libvirtError: internal error: client socket is closed"

Expected result
===============

The "_test_connection" function detects that the socket of the required libvirt module was closed and tries to "open new connection".
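
The expected behaviour can be sketched roughly as follows. This is a minimal, self-contained illustration and not a patch for nova: ReconnectingHost, SocketClosedError and the fake connection are hypothetical names, and the retry-once policy is an assumption rather than anything taken from host.py.

```python
class SocketClosedError(Exception):
    """Stand-in for the libvirtError carrying 'client socket is closed'."""


class ReconnectingHost:
    """Hypothetical wrapper that reopens its connection on a closed socket."""

    def __init__(self, connect):
        self._connect = connect  # zero-argument factory returning a connection
        self._conn = None
        self.reconnects = 0

    def get_connection(self):
        # Open lazily and cache the connection until it is known to be broken.
        if self._conn is None:
            self._conn = self._connect()
        return self._conn

    def call(self, operation):
        """Run operation(conn); on a closed socket, reconnect and retry once."""
        try:
            return operation(self.get_connection())
        except SocketClosedError:
            self._conn = None  # drop the stale connection
            self.reconnects += 1
            return operation(self.get_connection())


class FakeConnection:
    """Hypothetical connection whose node device socket may be closed."""

    def __init__(self, broken):
        self.broken = broken

    def list_node_devices(self):
        if self.broken:
            raise SocketClosedError("internal error: client socket is closed")
        return ["pci_0000_00_00_0"]


# The first connection is stale (daemon restarted), the second one is fresh.
pool = [FakeConnection(broken=True), FakeConnection(broken=False)]
host = ReconnectingHost(lambda: pool.pop(0))
devices = host.call(lambda conn: conn.list_node_devices())
print(devices, host.reconnects)
```

Retrying only once keeps a permanently failing daemon from causing an endless reconnect loop; the second failure propagates to the caller.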

Actual result
=============

The OpenStack Nova service cannot use a libvirt service module once that module has been restarted.

Environment
===========

CentOS Stream 9, OpenStack Yoga

summary: virt/libvirt/host.py: _test_connection is not reliable with modularized
- libvird services
+ libvirtd services
Uggla (rene-ribaud) wrote :

Hello Jaroslav,

Maybe I am missing something, but I could not reproduce the bug with the current master branch.
Looking at the _test_connection method, there are no changes between master and Yoga.

Can you check whether you see the issue with the master branch?

[stack@openstack devstack]$ sudo systemctl restart virtnodedevd.service
[stack@openstack devstack]$ sudo systemctl status virtnodedevd.service
● virtnodedevd.service - Virtualization nodedev daemon
     Loaded: loaded (/usr/lib/systemd/system/virtnodedevd.service; disabled; preset: disabled)
     Active: active (running) since Tue 2023-05-09 12:02:37 CEST; 7s ago
TriggeredBy: ● virtnodedevd-admin.socket
             ● virtnodedevd-ro.socket
             ● virtnodedevd.socket
       Docs: man:virtnodedevd(8)
             https://libvirt.org
   Main PID: 180075 (virtnodedevd)
      Tasks: 19 (limit: 48937)
     Memory: 10.9M
        CPU: 81ms
     CGroup: /system.slice/virtnodedevd.service
             └─180075 /usr/sbin/virtnodedevd --timeout 120

May 09 12:02:37 openstack systemd[1]: Starting Virtualization nodedev daemon...
May 09 12:02:37 openstack systemd[1]: Started Virtualization nodedev daemon.

---> Here I can't see anything in the compute logs.

[stack@openstack devstack]$ openstack server list
+--------------------------------------+------+--------+---------------------------------------------------------+--------------------------+--------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+------+--------+---------------------------------------------------------+--------------------------+--------+
| bb5e451c-0dd3-428a-8240-df54c5625c37 | demo | ACTIVE | private=10.0.0.34, fd64:a592:51c8:0:f816:3eff:fecb:25cc | N/A (booted from volume) | ds512M |
+--------------------------------------+------+--------+---------------------------------------------------------+--------------------------+--------+
[stack@openstack devstack]$ openstack server create --image cirros-0.5.2-x86_64-disk --flavor ds512M --network private --boot-from-volume 5 demo2
+-------------------------------------+--------------------------------------+
| Field | Value |
+-------------------------------------+--------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-SRV-ATTR:host | None |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None |
| OS-EXT-SRV-ATTR:instance_name | |
| OS-EXT-STS:power_state | NOSTATE |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | None |
| OS-SRV-USG:terminated_at | None ...


Changed in nova:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired