Nova's quobyte driver fails to mount a volume after upgrading OpenStack

Bug #2028581 reported by Rafael Madrid
This bug affects 1 person
Affects: kolla-ansible
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: none listed

Bug Description

What happened:

After upgrading OpenStack (Xena -> Yoga -> Zed -> 2023.1) using kolla-ansible, nova-compute fails to start virtual machines. After some debugging, I found that this happens because Nova's Quobyte driver fails to mount the Quobyte volume that contains our VM images and disks.

What I expected to happen:

Nova should be able to mount Quobyte volumes and launch VMs.

Steps to reproduce:
1. Log in to Horizon and navigate to the Instances view.
2. Start or hard reboot any VM that is shut off.

Environment
OS: Rocky Linux 9.2
Kernel: 5.14.0-284.18.1.el9_2.x86_64
Docker version: 24.0.5
Kolla-Ansible version: 16.1.0
Docker image install: source
docker image distribution: Rocky Linux
Are you using official images from Docker Hub or self built? self-built
If self built - Kolla version and environment used to build: Rocky Linux 9, kolla 16.1.0

---------------------------------------------------------------------------------------------

Nova Compute Error Log
2023-07-24 20:18:22.525 170 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0
2023-07-24 20:18:22.527 170 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_NET_ADMIN|CAP_SYS_ADMIN/CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_NET_ADMIN|CAP_SYS_ADMIN/none
2023-07-24 20:18:22.527 170 INFO oslo.privsep.daemon [-] privsep daemon running as pid 170
2023-07-24 20:18:22.754 7 ERROR nova.compute.manager [None req-581a996c-fca5-4e40-85dd-b5df73402786 c83b20bf05a74781aed3f71d5754016e 09db9c0e63c14e8284d2bd0c25808adb - - default default] [instance: d5357e39-903a-4ff1-a723-82b069cb140c] Cannot reboot instance: Unexpected error while running command.
Command: systemd-run --scope mount.quobyte --disable-xattrs s01,s02,s03,s04,s05,s06,s07,s08/OpenStack_VM /var/lib/nova/mnt/e776a3bd00c6c7dad9c45078adb85475
Exit code: 1
Stdout: ''
Stderr: 'Failed to connect to bus: No data available\n': oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2023-07-24 20:18:22.877 7 INFO nova.compute.manager [None req-581a996c-fca5-4e40-85dd-b5df73402786 c83b20bf05a74781aed3f71d5754016e 09db9c0e63c14e8284d2bd0c25808adb - - default default] [instance: d5357e39-903a-4ff1-a723-82b069cb140c] Successfully reverted task state from reboot_started_hard on failure for instance.
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server [None req-581a996c-fca5-4e40-85dd-b5df73402786 c83b20bf05a74781aed3f71d5754016e 09db9c0e63c14e8284d2bd0c25808adb - - default default] Exception during message handling: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
Command: systemd-run --scope mount.quobyte --disable-xattrs s01,s02,s03,s04,s05,s06,s07,s08/OpenStack_VM /var/lib/nova/mnt/e776a3bd00c6c7dad9c45078adb85475
Exit code: 1
Stdout: ''
Stderr: 'Failed to connect to bus: No data available\n'
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_messaging/rpc/dispatcher.py", line 229, in _do_dispatch
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/exception_wrapper.py", line 71, in wrapped
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server _emit_versioned_exception_notification(
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server self.force_reraise()
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server raise self.value
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/exception_wrapper.py", line 63, in wrapped
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server return f(self, context, *args, **kw)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 186, in decorated_function
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server LOG.warning("Failed to revert task state for instance. "
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server self.force_reraise()
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server raise self.value
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 157, in decorated_function
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/utils.py", line 1439, in decorated_function
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 214, in decorated_function
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server compute_utils.add_instance_fault_from_exc(context,
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server self.force_reraise()
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server raise self.value
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 203, in decorated_function
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 4157, in reboot_instance
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server do_reboot_instance(context, instance, block_device_info, reboot_type)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_concurrency/lockutils.py", line 414, in inner
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 4155, in do_reboot_instance
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server self._reboot_instance(context, instance, block_device_info,
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 4249, in _reboot_instance
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server self._set_instance_obj_error_state(instance)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server self.force_reraise()
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server raise self.value
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 4219, in _reboot_instance
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server self.driver.reboot(context, instance,
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/libvirt/driver.py", line 3857, in reboot
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server return self._hard_reboot(context, instance, network_info,
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/libvirt/driver.py", line 3947, in _hard_reboot
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server xml = self._get_guest_xml(context, instance, network_info, disk_info,
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/libvirt/driver.py", line 7528, in _get_guest_xml
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server conf = self._get_guest_config(instance, network_info, image_meta,
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/libvirt/driver.py", line 7038, in _get_guest_config
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server storage_configs = self._get_guest_storage_config(context,
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/libvirt/driver.py", line 5606, in _get_guest_storage_config
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server self._connect_volume(context, connection_info, instance)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/libvirt/driver.py", line 1921, in _connect_volume
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server vol_driver.connect_volume(connection_info, instance)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_concurrency/lockutils.py", line 414, in inner
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/libvirt/volume/quobyte.py", line 180, in connect_volume
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server mount_volume(quobyte_volume,
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/libvirt/volume/quobyte.py", line 80, in mount_volume
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server nova.privsep.libvirt.systemd_run_qb_mount(volume, mnt_base,
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_privsep/priv_context.py", line 271, in _wrap
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server return self.channel.remote_call(name, args, kwargs,
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_privsep/daemon.py", line 215, in remote_call
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server raise exc_type(*result[2])
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server Command: systemd-run --scope mount.quobyte --disable-xattrs s01,s02,s03,s04,s05,s06,s07,s08/OpenStack_VM /var/lib/nova/mnt/e776a3bd00c6c7dad9c45078adb85475
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server Exit code: 1
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server Stdout: ''
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server Stderr: 'Failed to connect to bus: No data available\n'
2023-07-24 20:18:22.879 7 ERROR oslo_messaging.rpc.server
2023-07-24 20:21:32.369 7 INFO nova.virt.libvirt.driver [None req-9286cab1-c00c-4587-a7ca-d67b491d2a2b c83b20bf05a74781aed3f71d5754016e 09db9c0e63c14e8284d2bd0c25808adb - - default default] [instance: 69fe4b5a-dcbf-49e6-b547-d49c0c292f98] Instance already shutdown.

Rafael Madrid (rmadridr) wrote :

Update:

I fixed the issue by adding the following properties to the task "Restart nova-compute container":
- pid_mode: "host"
- cgroupns_mode: "host"

Is this a bug?

Silvan Kaiser (2-silvan) wrote :

Hi Rafael!
Where exactly did you have to adapt the restart settings, and did you use modified containers?

I'm asking because, as far as I can see, the nova-compute container runs without systemd integration, but nova tries to launch the driver in a systemd environment ('systemd-run ...'). This means either the environment differs from what I expect or there is an issue with the systemd detection in the driver.

The nova-libvirt container seems to run with systemd but has pid_mode and cgroupns_mode set to host by default.
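
To help narrow down the systemd detection question above, here is a minimal Python sketch that reports what a process in a given container can see of systemd. It is not the driver's own detection code; the function name `systemd_visible` and its parameters are hypothetical, and it only looks at three common signals: the name of PID 1, the `/run/systemd/system` directory, and the `/run/systemd/private` socket.

```python
from pathlib import Path


def systemd_visible(proc_root: str = "/proc", run_root: str = "/run") -> dict:
    """Report which systemd signals are visible from the current namespaces.

    Hypothetical diagnostic helper, not part of nova or kolla-ansible.
    """
    comm_path = Path(proc_root) / "1" / "comm"
    try:
        # /proc/1/comm holds the short command name of PID 1 as seen
        # from this PID namespace.
        pid1_name = comm_path.read_text().strip()
    except OSError:
        pid1_name = None

    systemd_run_dir = Path(run_root) / "systemd"
    return {
        "pid1_name": pid1_name,
        "pid1_is_systemd": pid1_name == "systemd",
        # Directory that exists when the system was booted with systemd.
        "run_systemd_system": (systemd_run_dir / "system").is_dir(),
        # systemd's private socket for direct connections.
        "private_bus_socket": (systemd_run_dir / "private").exists(),
    }


if __name__ == "__main__":
    for key, value in systemd_visible().items():
        print(f"{key}: {value}")
```

Running this inside the nova_compute container before and after changing the namespace settings would show whether PID 1 changes from the container's own init process to the host's systemd. A container that sees `/run/systemd/system` through a bind mount but has its own PID 1 is one plausible way for detection and the actual `systemd-run` call to disagree.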

Silvan Kaiser (2-silvan) wrote :

Ah, sorry. You wrote that you used modified containers. So to be more specific, what was modified (roughly), and might that interfere with systemd detection?

Rafael Madrid (rmadridr) wrote :

Hi Silvan,

I had to make the following changes to get nova-compute working:

1. Added "pid_mode" and "cgroupns_mode" to the file "ansible/roles/nova-cell/handlers/main.yml"
line 160: pid_mode: "{{ service.pid_mode | default('') }}"
line 161: cgroupns_mode: "{{ service.cgroupns_mode | default(omit) }}"

2. Added "pid_mode" and "cgroupns_mode" to the file "ansible/roles/nova-cell/defaults/main.yml"
line 59: pid_mode: "host"
line 60: cgroupns_mode: "host"
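
To confirm that a redeployed container actually picked up the two settings above, the namespace modes can be read from the JSON printed by `docker inspect`. The following is a minimal Python sketch under stated assumptions: the container name `nova_compute` and the trimmed `SAMPLE` document are illustrative, and the helper names are hypothetical. `HostConfig.PidMode` and `HostConfig.CgroupnsMode` are the fields Docker reports for these options.

```python
import json


def namespace_modes(inspect_json: str) -> dict:
    """Extract PID and cgroup namespace modes from `docker inspect` output.

    `docker inspect` prints a JSON array with one object per container;
    only the first entry is examined here.
    """
    containers = json.loads(inspect_json)
    host_config = containers[0].get("HostConfig", {})
    return {
        "pid_mode": host_config.get("PidMode", ""),
        "cgroupns_mode": host_config.get("CgroupnsMode", ""),
    }


def shares_host_namespaces(modes: dict) -> bool:
    """True only if both the PID and cgroup namespaces are the host's."""
    return modes["pid_mode"] == "host" and modes["cgroupns_mode"] == "host"


# Trimmed, illustrative sample; real output contains many more fields.
SAMPLE = (
    '[{"Name": "/nova_compute", '
    '"HostConfig": {"PidMode": "host", "CgroupnsMode": "host"}}]'
)

if __name__ == "__main__":
    modes = namespace_modes(SAMPLE)
    print(modes, shares_host_namespaces(modes))
```

In practice the input would come from something like `docker inspect nova_compute` on the compute host, with the output passed to `namespace_modes` as a string.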

Silvan Kaiser (2-silvan) wrote :

Hi Rafael!
Thanks for the update.
Were these the only modifications? Which images exactly did you use (source URL/version)?

Rafael Madrid (rmadridr) wrote :

Hi Silvan,

Yes, these were the only modifications.

Regarding the images, I am using the ones provided by kolla (16.1.0). However, instead of pulling them from a public registry (like Docker Hub), I build them locally with kolla-build and host them on my own private registry. Hope this helps!

Silvan Kaiser (2-silvan) wrote :

Hi Rafael!
Which images were used for the openstack-base image config, and which image_name_prefix did you use?
Or could you perhaps send me the kolla build config you used?
Gr
Silvan
