Rescheduling the instance fails with "volume attachment_id not found"

Bug #1933610 reported by Jorhson Deng
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: In Progress
Importance: Low
Assigned to: Jorhson Deng
Milestone: none

Bug Description

Node1:
Traceback (most recent call last):
2021-05-14 06:33:29.859 623535 ERROR nova.compute.manager   File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/manager.py", line 198, in allocate_network_async
2021-05-14 06:33:29.859 623535 ERROR nova.compute.manager   File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 16, in update_ports_for_instance
2021-05-14 06:33:29.859 623535 ERROR nova.compute.manager     port_client, instance, port_id, port_req_body)
2021-05-14 06:33:29.859 623535 ERROR nova.compute.manager   File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 587, in update_port
2021-05-14 06:33:29.859 623535 ERROR nova.compute.manager     _ensure_no_port_binding_failure(port)
2021-05-14 06:33:29.859 623535 ERROR nova.compute.manager   File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 29, in _ensure_no_port_binding_failure
2021-05-14 06:33:29.859 623535 ERROR nova.compute.manager     raise exception.PortBindingFailed(port_id=port['id'])
2021-05-14 06:33:29.859 623535 ERROR nova.compute.manager PortBindingFailed: Binding failed for port 401b0ea6-800f-25-02-36f45616482, please check neutron logs for more information.

Node2:
 "/var/1ib/kolla/venv/lib/python2.7/site-packages/nova/volume/cinder.py". line 429, in wrapper res =method (self, ctx, attachment_id, *args, **kwargs)
 File."/var/1ib/kolla/venv/lib/python2.7/site-packages/nova/volume/cinder.gx", .1ine 848, in attachment update ....'code ':getattr(ex, .'code'. None)}) ..File."/var/1ib/kolla/venv/lib/python2.7/site-packages/oslo_utils/esuahe-R", .line. 220, in. exit....self.force reraise ()..
 File "/var/1ib/kolla/venv/lib/python2.7/site-packages/oslo utils/exsutils.Rx", .line 196, in force reraise...six.xezaise (self.type , self.value, self.tk..
 File."/var/1ib/kolla/venv/lib/python2.7/site-packages/nova/volume/cinder.py". .line 838. in attachment update ...attachmenp id, connector) ..File "/var/lib/kalla/yeny/lib/python2.7/site-packages/sindexskaent/api_versions.R", line 407, in substitutid.return method.fune (obi. axas. -kwarga).
 File."/var/1ib/kolla/venv/lib/python2.7/site-packages/sindexslient/v3/attachments.py", .line75, in _update resp.=self._update ('/attachments/%s',%id,body)
 File."/var/1ib/kolla/venv/lib/python2.7/site-packages/cinderclient/base.py". 1ine 344. .in _update resp, body=.self.api.client.put (url,body=body,**kwargs)
 File."/var/1ib/kolla/venv/lib/python2.7/site-packages/cindexclient/client.py", line 206, in put return self._cs_request (url,'pUT',**kwargs)
 File."/var/1ib/kolla/venv/lib/python2.7/site-packages/sindexslient/client.py", line .191, in _cs_request return self.request (url. method, **kwargs)
 File"/var/1ib/kolla/venv/lib/python2.7/site-packages/cindexclient/client.py", 1ine 177, in request raise exceptions.from response (resp, body)VolumeAttachmentNotFound: Volume attachment-7ca1626h-f73f-4c31-83ef-ce9ae4b49cd0 could not be found

Revision history for this message
Jorhson Deng (jorhson) wrote :

    While building an instance on one host, if port binding fails, Nova raises a PortBindingFailed exception. The _build_resources function catches the exception and calls _shutdown_instance. In _shutdown_instance, Nova calls self.volume_api.attachment_delete(context, bdm.attachment_id) to delete the attachment, because the volume and its attachment_id were created earlier in _build_resources.
    If the instance is then rescheduled to another host, the block-device mapping already carries the volume_id, so Nova does not create a new attachment for the volume. However, the volume's attachment_id was already deleted on the first host, which produces the VolumeAttachmentNotFound error above. So we should fix the code in block_device.py.
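    A minimal sketch of the idea (toy stand-ins; FakeVolumeAPI, attach_on_host, and the id strings are hypothetical, not the actual nova patch): treat the attachment_id stored on the BDM as possibly stale during a reschedule, and recreate the attachment instead of letting VolumeAttachmentNotFound abort the build.

    class VolumeAttachmentNotFound(Exception):
        """Stand-in for the cinderclient error in the Node2 trace."""

    class FakeVolumeAPI:
        """Toy in-memory stand-in for nova.volume.cinder.API."""
        def __init__(self):
            self._attachments = {}
            self._counter = 0

        def attachment_create(self, volume_id):
            self._counter += 1
            attachment_id = 'attach-%d' % self._counter
            self._attachments[attachment_id] = volume_id
            return attachment_id

        def attachment_delete(self, attachment_id):
            self._attachments.pop(attachment_id, None)

        def attachment_update(self, attachment_id, connector):
            if attachment_id not in self._attachments:
                raise VolumeAttachmentNotFound(attachment_id)
            return {'connector': connector}

    def attach_on_host(bdm, volume_api, connector):
        # A rescheduled BDM still carries the attachment_id that the first
        # host's _shutdown_instance deleted; recreate it instead of failing.
        if bdm.get('attachment_id') is None:
            bdm['attachment_id'] = volume_api.attachment_create(bdm['volume_id'])
        try:
            return volume_api.attachment_update(bdm['attachment_id'], connector)
        except VolumeAttachmentNotFound:
            bdm['attachment_id'] = volume_api.attachment_create(bdm['volume_id'])
            return volume_api.attachment_update(bdm['attachment_id'], connector)

    api = FakeVolumeAPI()
    bdm = {'volume_id': 'vol-1', 'attachment_id': api.attachment_create('vol-1')}
    api.attachment_delete(bdm['attachment_id'])  # node1 cleanup after PortBindingFailed
    attach_on_host(bdm, api, {'host': 'node2'})  # node2 retry now succeeds
    assert bdm['attachment_id'] == 'attach-2'

    The real fix would live in nova's block_device.py attach path and take a request context, which this toy omits.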

Revision history for this message
Jorhson Deng (jorhson) wrote :
Changed in nova:
assignee: nobody → Jorhson Deng (jorhson)
status: New → In Progress
importance: Undecided → Low
Revision history for this message
Wave Wang (wave-wang) wrote (last edit):

    I hit this bug too on the Rocky release. When creating an instance from an existing image-created volume, an exception occurred in nova/virt/libvirt/driver.py `_create_domain_and_network` (a libvirt internal error, e.g. not enough RAM), the build was rescheduled, and it hit this bug.
    Reading the nova code, I found that nova/compute/api.py `_check_attach_and_reserve_volume` creates the attachment_id. How about restoring the old-style volume reserve code to solve the problem?
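    For comparison, a toy sketch of the old-style flow this suggests (FakeCinder and its status strings are illustrative, not the real nova.volume.cinder.API): reserve_volume only flips the volume to 'attaching' in Cinder, so a failed build on the first host leaves no per-host attachment record for the retry to trip over.

    class FakeCinder:
        """Toy model of the pre-attachment-API ("reserve") volume flow."""
        def __init__(self):
            self.status = {'vol-1': 'available'}

        def reserve_volume(self, volume_id):
            self.status[volume_id] = 'attaching'

        def unreserve_volume(self, volume_id):
            self.status[volume_id] = 'available'

    cinder = FakeCinder()
    cinder.reserve_volume('vol-1')    # done once, before scheduling
    cinder.unreserve_volume('vol-1')  # node1 build fails: cleanup just unreserves
    cinder.reserve_volume('vol-1')    # node2 retry starts from a clean state
    assert cinder.status['vol-1'] == 'attaching'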
