image_transfer.ImageWriter.start() incorrectly assumes image_service.update() is a non-blocking operation

Bug #1384840 reported by Jay Pipes
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
oslo.vmware
Confirmed
Low
Unassigned

Bug Description

Code inside the ImageWriter.start() method looks like this:

            try:
                self._image_service.update(self._context,
                                           self._image_id,
                                           self._image_meta,
                                           data=self._input_file)
                self._running = True
                while self._running:
                    LOG.debug("Retrieving status of image: %s.",
                              self._image_id)
                    image_meta = self._image_service.show(self._context,
                                                          self._image_id)
                    image_status = image_meta.get('status')
                    if image_status == 'active':
                        self.stop()
                        LOG.debug("Image: %s is now active.",
                                  self._image_id)
                        self._done.send(True)
                    elif image_status == 'killed':
                        self.stop()
                        excep_msg = (_("Image: %s is in killed state.") %
                                     self._image_id)
                        LOG.error(excep_msg)
                        excep = exceptions.ImageTransferException(excep_msg)
                        self._done.send_exception(excep)
                    elif image_status in ['saving', 'queued']:
                        LOG.debug("Image: %(image)s is in %(state)s state; "
                                  "sleeping for %(sleep)d seconds.",
                                  {'image': self._image_id,
                                   'state': image_status,
                                   'sleep': IMAGE_SERVICE_POLL_INTERVAL})
                        greenthread.sleep(IMAGE_SERVICE_POLL_INTERVAL)
                    else:
                        self.stop()
                        excep_msg = (_("Image: %(image)s is in unknown "
                                       "state: %(state)s.") %
                                     {'image': self._image_id,
                                      'state': image_status})
                        LOG.error(excep_msg)
                        excep = exceptions.ImageTransferException(excep_msg)
                        self._done.send_exception(excep)
            except Exception as excep:
                self.stop()
                excep_msg = (_("Error occurred while writing image: %s") %
                             self._image_id)
                LOG.exception(excep_msg)
                excep = exceptions.ImageTransferException(excep_msg, excep)
                self._done.send_exception(excep)

The problem with the above is two-fold:

1) The code in image_service.update() is a blocking call. It doesn't complete until the image bytes in image_file either are successfully written to backend storage in Glance's store driver, or fails to do so. There is no possible way for the logic blocks for:

 elif image_status in ['saving', 'queued']:

or:

 else:

to happen. The image will either be active or killed if the call to image_service.update() did not raise an exception. Now, if the call to update() *did* raise an exception, then the status of the image /might/ be in 'saving' state (not queued), at which point the code could potentially retry the update() call.

2) I have a bad feeling about tightly-coupling the image API with the write handles in this way. I'd much prefer to see the image service/API objects removed entirely from the ImageWriter objects and used in code outside of oslo.vmware library, with the ImageWriter object being a simple wrapper around an io.BufferedWriter. In fact, I'm not entirely sure it's necessary to even have that. Instead, just remove all the custom reader/writer stuff and have oslo.vmware library implement a transfer_queue module that manages the traffic control of io.BufferedReader and io.BufferedWriter objects.

Changed in oslo.vmware:
status: New → Confirmed
importance: Undecided → Low
Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/oslo.vmware 2.12.0

This issue was fixed in the openstack/oslo.vmware 2.12.0 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.