VMware vCenter Driver failed to extend virtual disk while booting new VM

Bug #1348495 reported by Fan Guo
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

How to reproduce this bug:
      Compute driver: VMwareVCDriver
      vSphere API: 5.1
      VMware user: Administrator
      Image: sparse type with IDE controller
      Configure in nova.conf: linked_clone = True
      Boot a VM with “root_disk_size” (example: 20GB) > “image_size” (example: 10GB)
The first time a VM is booted this way, the boot fails.
A second VM booted with the same image and flavor as the first one then succeeds.

I noticed that the VM boot fails in the virtual disk extending phase.
… In this phase:
If there is no virtual disk cache (a virtual disk with the targeted image and root disk size) on this host,
    and “root_disk_size” (defined in the flavor, example: 20GB) > “image_size” (defined in the image file, example: 10GB),
then the driver enters the virtual disk extending phase.
   Tasks of the virtual disk extending phase:
     1) Invoke the virtual disk extending task (t_1) to extend the disk.
     2) Start one additional task (t_2) to check the status (queued, running, error, success) of task (t_1).
Task (t_2) fails with an error, and this error causes the VM boot process to fail.
This is why the first VM boot fails.

However, nothing cleans up task (t_1), so the virtual disk keeps extending and the extension eventually succeeds (very quickly, since there is no zero formatting).
So the second time we boot a VM with the same image and flavor as before, the extended virtual disk (cache) is already there. That is why the second boot succeeded.

I think the following code causes this issue:
    def _poll_task(self, task_ref, done):
        """Poll the given task, and fires the given Deferred if we
        get a result.
        """
        try:
            task_info = self._call_method(vim_util, "get_dynamic_property",
                            task_ref, "Task", "info")
            task_name = task_info.name

If task_info is None, then reading the status of the task raises an exception, which is only logged and re-raised to the caller. This causes the VM boot process to fail.
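The failure mode described here can be illustrated with a minimal, self-contained sketch. It is not the nova code: `FakeSession`, `FakeTaskInfo`, `TaskInfoUnavailable`, and `poll_task_once` are hypothetical names made up for illustration, and the only assumption is that the property lookup may hand back None instead of a task info object.

```python
class TaskInfoUnavailable(Exception):
    """Hypothetical error: the task's info property could not be read."""


class FakeTaskInfo:
    """Hypothetical stand-in for the task info object."""

    def __init__(self, name, state):
        self.name = name
        self.state = state


class FakeSession:
    """Hypothetical stand-in for the driver session.

    Returns canned values, one per call, in the order given.
    """

    def __init__(self, responses):
        self._responses = list(responses)

    def get_task_info(self, task_ref):
        # Assumption: the real lookup may return None rather than raise.
        return self._responses.pop(0)


def poll_task_once(session, task_ref):
    """Return (name, state) for the task, or raise a descriptive error."""
    task_info = session.get_task_info(task_ref)
    if task_info is None:
        # Without this check, task_info.name raises AttributeError,
        # which hides the real cause from the caller.
        raise TaskInfoUnavailable("no info returned for task %s" % task_ref)
    return task_info.name, task_info.state
```

With an explicit check like this, the caller can at least tell "task info missing" apart from "task failed" and decide whether to retry the poll or give up.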

Tags: vmware
zhu zhu (zhuzhubj) wrote :

This looks interesting. I have not met this issue before. Do you mean that the _wait_for_task failure causes the boot failure, and that the _wait_for_task failure happens because the task information cannot be retrieved? Currently I don't see any case where task.info couldn't be retrieved.

Are you using Flat images for testing?

Fan Guo (faguo) wrote :

Hi zhuzhubj,
I am testing with Sparse image:

Size: 1048707072
Disk format: vmdk
Container format: bare
Property 'vmware_disktype': sparse
Property 'vmware_adaptertype': ide

Fan Guo (faguo) wrote :

To solve this issue, I think we should add the following handling:
     if the check-state task (t_2) hits an exception, it should clean up task (t_1) before exiting.
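A rough sketch of that idea, under stated assumptions: `FakeTask` and `wait_for_task` are hypothetical names, not nova functions, and `cancel()` stands in for whatever call the driver would use to stop the extend task. The point is only the shape of the handling: if the status check itself blows up, stop (t_1) and then re-raise.

```python
class FakeTask:
    """Hypothetical stand-in for the disk extending task (t_1)."""

    def __init__(self):
        self.cancelled = False

    def cancel(self):
        # Assumption: the real driver would ask vCenter to stop the task.
        self.cancelled = True


def wait_for_task(task, poll):
    """Poll `task` until it finishes; cancel it if polling itself fails.

    `poll` is any callable that takes the task and returns one of
    "queued", "running", "error", "success".
    """
    while True:
        try:
            state = poll(task)
        except Exception:
            # The status check (t_2) failed, so stop t_1 before giving up
            # instead of leaving it running in the background.
            task.cancel()
            raise
        if state == "success":
            return state
        if state == "error":
            raise RuntimeError("task finished with an error")
        # A real loop would sleep between polls; omitted in this sketch.
```

Whether cancelling is the right cleanup is a design choice; an alternative would be to retry the status check a few times before cancelling, since in this report (t_1) actually finishes on its own.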

Tracy Jones (tjones-i)
tags: added: vmware
Changed in nova:
importance: Undecided → High
Fan Guo (faguo)
Changed in nova:
assignee: nobody → Fan Guo (faguo)
Fan Guo (faguo)
Changed in nova:
status: New → In Progress
Fan Guo (faguo)
Changed in nova:
status: In Progress → Invalid
assignee: Fan Guo (faguo) → nobody