Nova VMwareapi Fails Image Download

Bug #2053027 reported by Fabian Wiesel
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
In Progress
Undecided
Unassigned

Bug Description

Description
===========
When trying to start an instance with a local root disk, the VM fails with InvalidArgument.

Steps to reproduce
==================
A chronological list of steps which will bring off the
issue you noticed:
* Setup Devstack
* Create Server with boot from volume:
  source devstack/openrc admin demo
  openstack server create --flavor 1 --boot-from-volume 10 --image-property vmware_disktype=sparse --wait test-bfv-disk
* Create Server with root disk
  openstack server create --flavor 1 --image-property vmware_disktype=sparse --wait test-root-disk

Expected result
===============
Both instances come up.

Actual result
=============
The first one comes up.
The second one fails with the error
> Instance failed to spawn: oslo_vmware.exceptions.VimFaultException: A specified parameter was not correct: fileType

Further inspection yield, that the file is empty.
The connection has been closed by the nova-compute prematurely on glance side
> Unhandled error: OSError: write error
as well on the vsphere side.

Environment
===========
1. Exact version of OpenStack you are running.

Tested it with nova master (i.e. 35af4b345d997b63f999a090e236d91b78ea4304)

2. Which hypervisor did you use? What's the version of that?

vmwareapi (VSphere 7.0.3 & ESXi 7.0.3)

2. Which storage type did you use?

vmdk on NFS 4.0

3. Which networking type did you use?

networking-nsx-t (https://github.com/sapcc/networking-nsx-t)

Logs & Configs
==============

Logs for running on master:
http://openstack-ci-logs.global.cloud.sap/openstack/nova/35af4b345d997b63f999a090e236d91b78ea4304/index.html

Additional Information
========

As the argument is an IterableWithLength (https://github.com/openstack/python-glanceclient/blob/master/glanceclient/common/utils.py#L563C7-L563C25),
we can print debug log message in the ImageReadHandle (https://opendev.org/openstack/oslo.vmware/src/branch/master/oslo_vmware/rw_handles.py#L765).
When logging the read bytes and the and the file size, the length of the iterator matches the file-size stored in glance, while the `StopIteration` is raised right on the first call here:
https://opendev.org/openstack/oslo.vmware/src/branch/master/oslo_vmware/rw_handles.py#L762

Interestingly, the same code is used in cinder (the working case), and there the iterator is not raising a `StopIteration` before reaching the end of the file.
So, possibly that is a red herring.

The main difference is how Cinder and Nova handle the upload side.
Nova uses a PUT on an individual ESXi host, while Cinder uses a POST operation to the VSphere API.

description: updated
Revision history for this message
sean mooney (sean-k-mooney) wrote :
Download full text (4.9 KiB)

this is the relevent backtrace

[ 781.229500] env[61868]: Faults: ['InvalidArgument']
[ 781.229500] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] Traceback (most recent call last):
[ 781.229500] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] File "/opt/stack/nova/nova/compute/manager.py", line 2885, in _build_resources
[ 781.229500] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] yield resources
[ 781.229500] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] File "/opt/stack/nova/nova/compute/manager.py", line 2632, in _build_and_run_instance
[ 781.229500] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] self.driver.spawn(context, instance, image_meta,
[ 781.229500] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 539, in spawn
[ 781.229500] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] self._vmops.spawn(context, instance, image_meta, injected_files,
[ 781.229500] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 786, in spawn
[ 781.229500] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] self._fetch_image_if_missing(context, vi)
[ 781.229500] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 639, in _fetch_image_if_missing
[ 781.229904] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] image_cache(vi, tmp_image_ds_loc)
[ 781.229904] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 537, in _cache_sparse_image
[ 781.229904] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] vm_util.copy_virtual_disk(
[ 781.229904] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] File "/opt/stack/nova/nova/virt/vmwareapi/vm_util.py", line 1423, in copy_virtual_disk
[ 781.229904] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] session._wait_for_task(vmdk_copy_task)
[ 781.229904] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] File "/opt/stack/nova/nova/virt/vmwareapi/session.py", line 157, in _wait_for_task
[ 781.229904] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] return self.wait_for_task(task_ref)
[ 781.229904] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] File "/opt/stack/data/venv/lib/python3.10/site-packages/oslo_vmware/api.py", line 398, in wait_for_task
[ 781.229904] env[61868]: ERROR nova.compute.manager [instance: b9c8dd4c-8a19-4ffb-8e57-b273c000f121] ...

Read more...

Revision history for this message
Fabian Wiesel (fabian-wiesel) wrote :

That is a follow up error, you could say from some "hidden" or swallowed problem.

The file size is zero. Copying that file from the vsphere local cache location to the desired location for the VM causes the vsphere api to respond with the invalid file type.

description: updated
Revision history for this message
Fabian Wiesel (fabian-wiesel) wrote :

It seems to be independent of vmwareapi in a way.
I've isolated a test-case (attached) which simply reads from glance in three different forms:
1. Using the glance client directly and iterate over the image chunks (works).
2. Using the `nova.glance.API` class and download to a file (works).
3. Using the `nova.glance.API` class and iterate over the chunks (fails).

The vmwareapi driver in nova is presumably the only one iterating over chuncks.

Revision history for this message
Fabian Wiesel (fabian-wiesel) wrote :

I did a checkout of the latest xena release (ac1c6a8c7d85502babdf617bc49b5bb2301c2053), and the test script works.

Bisecting with the test identifies 43bca185fe2d00bb70d7486fa6c6a0b9eda1fc17

(https://opendev.org/openstack/nova/commit/43bca185fe2d00bb70d7486fa6c6a0b9eda1fc17)

as the commit, that breaks the iteration over the image chunks.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/909474

Changed in nova:
status: New → In Progress
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.