VMware: write error lost while transferring volume

Bug #1416000 reported by Matthew Booth
This bug affects 1 person
Affects                   Status        Importance  Assigned to  Milestone
OpenStack Compute (nova)  Fix Released  Low         Unassigned
oslo.vmware               Confirmed     Low         Unassigned

Bug Description

I'm running the following command:

cinder create --image-id a24f216f-9746-418e-97f9-aebd7fa0e25f 1

The write side of the data transfer (a VMwareHTTPWriteFile object) returns an error in write(), which I haven't debugged yet. The error shows up in the logs, but it is never reported to the user. The effect is that the transfer sits in the 'downloading' state until the 7200-second timeout expires, and only the timeout is reported.

The reason is that the code which waits on transfer completion (in start_transfer) does:

    try:
        # Wait on the read and write events to signal their end
        read_event.wait()
        write_event.wait()
    except (timeout.Timeout, Exception) as exc:
        ...

That is, it waits for the read thread to signal completion via read_event before checking write_event. However, because write_thread has died, read_thread is blocked and will never signal completion. You can demonstrate this by swapping the order: if you wait for write first, it will die immediately, which is what you want in this case. However, that's not right either, because now you're missing read errors.
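
The ordering problem can be reproduced in isolation. The sketch below is a simulation built on the standard library's threading.Event, not the transfer code itself; run_transfer, its parameters, and the short timeout are hypothetical names and values chosen for illustration.

```python
import threading


def run_transfer(wait_order, timeout=0.2):
    """Simulate a writer that fails at once and a reader that stays blocked.

    wait_order is the order in which the caller waits on the two events,
    e.g. ["read", "write"]. Returns a short description of what the caller saw.
    """
    release_reader = threading.Event()  # set only during cleanup
    done = {"read": threading.Event(), "write": threading.Event()}
    errors = {}

    def reader():
        # Stands in for a reader blocked on a pipe that nobody drains.
        release_reader.wait()
        done["read"].set()

    def writer():
        try:
            raise IOError("write failed")
        except IOError as exc:
            errors["write"] = exc  # recorded, but only seen if someone looks
        finally:
            done["write"].set()

    threads = [threading.Thread(target=reader), threading.Thread(target=writer)]
    for thread in threads:
        thread.start()

    result = "ok"
    for name in wait_order:
        if not done[name].wait(timeout):
            result = "timed out waiting for " + name
            break
        if name in errors:
            result = "error from %s: %s" % (name, errors[name])
            break

    release_reader.set()  # let the simulated reader exit
    for thread in threads:
        thread.join()
    return result
```

Calling run_transfer(["read", "write"]) only ever sees the timeout, while run_transfer(["write", "read"]) sees the write error straight away. Neither fixed order is safe in general, since the failing side is not known in advance.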

Ideally this code needs to be able to notice an error at either end and stop immediately.
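
One way to get that behaviour is to wait on both ends at once and return as soon as either fails. The following is a minimal sketch under stated assumptions: it uses concurrent.futures from the standard library rather than the event objects in the snippet above, and transfer, chunks, write_chunk, and the in-memory queue are hypothetical stand-ins for the real read and write handles. It is not the fix that was applied in Nova.

```python
import concurrent.futures
import queue
import threading


def transfer(chunks, write_chunk, timeout=5.0):
    """Copy chunks to write_chunk, failing fast if either side raises."""
    pipe = queue.Queue(maxsize=2)
    stop = threading.Event()

    def put(item):
        # Retry with a short timeout so a stop request is noticed promptly.
        while not stop.is_set():
            try:
                pipe.put(item, timeout=0.05)
                return True
            except queue.Full:
                pass
        return False

    def reader():
        for chunk in chunks:
            if not put(chunk):
                return
        put(None)  # end-of-stream marker

    def writer():
        while not stop.is_set():
            try:
                chunk = pipe.get(timeout=0.05)
            except queue.Empty:
                continue
            if chunk is None:
                return
            write_chunk(chunk)

    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(reader), pool.submit(writer)]
        try:
            # Returns as soon as either side raises, or when both finish.
            done, pending = concurrent.futures.wait(
                futures,
                timeout=timeout,
                return_when=concurrent.futures.FIRST_EXCEPTION,
            )
            if pending and all(f.exception() is None for f in done):
                raise TimeoutError("transfer timed out")
            for future in done:
                future.result()  # re-raises the reader's or writer's error
        finally:
            stop.set()  # unblock whichever side is still running
```

With this shape, a write_chunk that raises on its first call surfaces its exception to the caller almost immediately, and an error raised while iterating chunks is reported the same way. The timeout is kept only as a backstop.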

Tags: vmware
Tracy Jones (tjones-i)
tags: added: vmware
Radoslav Gerganov (rgerganov) wrote :

The same problem exists in Nova as we use the same approach for image transfer:

https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/images.py#L181

Matthew Booth (mbooth-9) wrote :
Changed in cinder:
status: New → Confirmed
importance: Undecided → Low
assignee: nobody → Vipin Balachandran (vbala)
Changed in oslo.vmware:
status: New → Confirmed
assignee: nobody → Vipin Balachandran (vbala)
importance: Undecided → Low
Changed in nova:
status: New → Confirmed
importance: Undecided → Low
Radoslav Gerganov (rgerganov) wrote :

Cinder is now using oslo.vmware, so you can remove Cinder from the affected projects

no longer affects: cinder
Changed in oslo.vmware:
assignee: Vipin Balachandran (vbala) → nobody
Radoslav Gerganov (rgerganov) wrote :

This was fixed in Nova a long time ago as part of the image transfer refactoring:
https://github.com/openstack/nova/commit/2df83abaa0a5c828421fc38602cc1e5145b46ff4

Changed in nova:
status: Confirmed → Fix Released