nova resize instance error with cinder timeout

Bug #1873940 reported by Carlos Augusto da Silva Martins
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
OpenStack-Ansible | New | Undecided | Unassigned |

Bug Description

Hi, an error occurs when I do a simple resize of an instance with a volume attached to it.

During this process, cinder-api is called to detach or attach the volume as the resize progresses.

The environment has 3 controllers with HAProxy and Keepalived, providing high availability for the cloud.

One controller has less memory available, so its containers run slowly, and API calls routed to any container on this host reach the timeout.
Other services work fine because, after an API call times out, the next calls go to the next working controller. The resize process does not work this way: on the first failed connection attempt to the cinder-api service, the resize stops.
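To illustrate the behaviour the report expects, here is a minimal Python sketch of a call that moves on to the next backend after a timeout instead of aborting on the first failure. It is not nova's or HAProxy's actual code; `EndpointTimeout`, `call_with_failover`, and the backend callables are hypothetical names used only for illustration.

```python
import itertools


class EndpointTimeout(Exception):
    """Hypothetical error raised when a backend does not answer in time."""


def call_with_failover(backends, request, max_attempts=3):
    """Try `request` against successive backends until one answers.

    `backends` is a list of callables; each takes the request and either
    returns a response or raises EndpointTimeout.
    """
    if not backends:
        raise ValueError("no backends configured")
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    # Cycle through the backends so a slow one is skipped on the next attempt.
    for _, backend in zip(range(max_attempts), itertools.cycle(backends)):
        try:
            return backend(request)
        except EndpointTimeout as exc:
            last_error = exc  # remember the failure, then try the next backend
    # Every attempt timed out: surface the last timeout to the caller.
    raise last_error
```

With one slow backend and one healthy backend, the first attempt times out and the second succeeds, which is what the other services effectively get from the load balancer.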

This causes several errors:
- The volumes are left in the error state.
- The instance may belong to another host after the resize, and this information is already saved in the database.
- The instance's XML is not on that host.

After this, I have to reset the 'status' and 'attach_status' of each volume attached to the instance to 'in-use' and 'attached' respectively.
I then need to edit the respective XML at /etc/libvirt/qemu/####instance_obj_id.xml, remove all attachments, and copy it to the host that the instance belongs to, so that the instance can start without errors.
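The two manual steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: `reset_state_commands` builds `cinder reset-state` command lines without running them, and `strip_attached_disks` assumes the root disk is `vda` and that every other `<disk>` in the domain XML is an attached volume. Both function names are hypothetical, and the sketch does not cover copying the file to the other host.

```python
import xml.etree.ElementTree as ET


def reset_state_commands(volume_ids, state="in-use", attach_status="attached"):
    """Build one `cinder reset-state` argv list per volume (not executed here)."""
    return [
        ["cinder", "reset-state", "--state", state,
         "--attach-status", attach_status, volume_id]
        for volume_id in volume_ids
    ]


def strip_attached_disks(domain_xml, keep_targets=("vda",)):
    """Return the domain XML without <disk> elements for attached volumes.

    Assumption: only disks whose target dev is in `keep_targets` belong to
    the instance itself; all other disks are volume attachments.
    """
    root = ET.fromstring(domain_xml)
    devices = root.find("devices")
    if devices is None:
        return domain_xml  # nothing to strip
    # Iterate over a copy because we remove children while looping.
    for disk in list(devices.findall("disk")):
        target = disk.find("target")
        dev = target.get("dev") if target is not None else None
        if dev not in keep_targets:
            devices.remove(disk)
    return ET.tostring(root, encoding="unicode")
```

Review the generated commands and the resulting XML before applying them to a real deployment, since the disk naming assumption may not hold for every instance.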

Finally, I can start attaching all the needed volumes to the instance.

Why does the process stop at the cinder handling step?

Jordan Callicoat (jcallicoat) wrote :

This is an upstream bug with nova-compute, unrelated to openstack-ansible. Please file an issue with nova.

A quick search turned up a recent bug report involving the same root cause (nova not handling a cinder API timeout), but it only references detaching: https://bugs.launchpad.net/nova/+bug/1888665

You might want to file a new report there about the resize failure and reference that bug.
