I manually rolled the changes into an environment today on a single cinder node. The tenant had shut down the instances with the volumes attached prior to the maintenance. After rebooting the cinder node, the state files under /var/lib/cinder/volumes/ were found to be missing. This manifested when attempting to start an instance with a volume on the affected cinder node:
2015-06-09 18:54:58.804 19459 TRACE oslo.messaging.rpc.dispatcher libvirtError: Failed to open file '/dev/disk/by-path/ip-10.17.150.69:3260-iscsi-iqn.2010-10.org.openstack:volume-ab377917-b1fa-416a-b001-ef6b9ff09715-lun-1': No such device or address
When attempting to discover the targets, the following errors were seen:
# iscsiadm -m discovery -t st -p 10.17.150.69:3260
iscsiadm: Connection to Discovery Address 10.17.150.69 closed
iscsiadm: Login I/O error, failed to receive a PDU
iscsiadm: retrying discovery login to 10.17.150.69
2015-06-09 19:23:43.073 19459 TRACE oslo.messaging.rpc.dispatcher Command: sudo nova-rootwrap /etc/nova/rootwrap.conf iscsiadm -m node -T iqn.2010-10.org.openstack:volume-ab377917-b1fa-416a-b001-ef6b9ff09715 -p 10.17.150.69:3260 --rescan
2015-06-09 19:23:43.073 19459 TRACE oslo.messaging.rpc.dispatcher Exit code: 21
2015-06-09 19:23:43.073 19459 TRACE oslo.messaging.rpc.dispatcher Stdout: u''
2015-06-09 19:23:43.073 19459 TRACE oslo.messaging.rpc.dispatcher Stderr: u'iscsiadm: No session found.\n'
On the cinder node:
2015-06-09 18:29:18.882 1411 ERROR cinder.volume.manager [req-62b66a03-c879-482c-8be5-942e5b35180d - - - - -] Failed to re-export volume ab377917-b1fa-416a-b001-ef6b9ff09715: setting to error state
2015-06-09 18:29:18.883 1411 ERROR cinder.volume.manager [req-62b66a03-c879-482c-8be5-942e5b35180d - - - - -] Failed to create iscsi target for volume volume-ab377917-b1fa-416a-b001-ef6b9ff09715.
2015-06-09 18:29:18.883 1411 TRACE cinder.volume.manager Traceback (most recent call last):
2015-06-09 18:29:18.883 1411 TRACE cinder.volume.manager   File "/usr/local/lib/python2.7/dist-packages/cinder/volume/manager.py", line 276, in init_host
2015-06-09 18:29:18.883 1411 TRACE cinder.volume.manager     self.driver.ensure_export(ctxt, volume)
2015-06-09 18:29:18.883 1411 TRACE cinder.volume.manager   File "/usr/local/lib/python2.7/dist-packages/osprofiler/profiler.py", line 105, in wrapper
2015-06-09 18:29:18.883 1411 TRACE cinder.volume.manager     return f(*args, **kwargs)
2015-06-09 18:29:18.883 1411 TRACE cinder.volume.manager   File "/usr/local/lib/python2.7/dist-packages/cinder/volume/drivers/lvm.py", line 543, in ensure_export
2015-06-09 18:29:18.883 1411 TRACE cinder.volume.manager     self.configuration)
2015-06-09 18:29:18.883 1411 TRACE cinder.volume.manager   File "/usr/local/lib/python2.7/dist-packages/cinder/volume/iscsi.py", line 116, in ensure_export
2015-06-09 18:29:18.883 1411 TRACE cinder.volume.manager     write_cache=conf.iscsi_write_cache)
2015-06-09 18:29:18.883 1411 TRACE cinder.volume.manager   File "/usr/local/lib/python2.7/dist-packages/cinder/brick/iscsi/iscsi.py", line 249, in create_iscsi_target
2015-06-09 18:29:18.883 1411 TRACE cinder.volume.manager     raise exception.ISCSITargetCreateFailed(volume_id=vol_id)
2015-06-09 18:29:18.883 1411 TRACE cinder.volume.manager ISCSITargetCreateFailed: Failed to create iscsi target for volume volume-ab377917-b1fa-416a-b001-ef6b9ff09715.
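With the default LVM/tgt driver, each state file under /var/lib/cinder/volumes/ is a small tgt target definition that tgtd includes from its configuration, which is why create_iscsi_target fails in the traceback above when the file is gone. The sketch below regenerates one such file from known volume data; the exact fields (CHAP "incominguser" lines, write-cache settings) and the volume-group name vary by cinder release and configuration, so treat the file layout here as an assumption, not the authoritative format. It writes to a scratch directory rather than the real /var/lib/cinder/volumes/:

```python
import os
import tempfile

# Volume data as it could be recovered from the cinder database or
# nova.block_device_mapping (IDs taken from the logs above).
volume_id = "ab377917-b1fa-416a-b001-ef6b9ff09715"
iqn = "iqn.2010-10.org.openstack:volume-%s" % volume_id
# Assumed LVM backing device path; the volume-group name is hypothetical.
backing = "/dev/cinder-volumes/volume-%s" % volume_id

# Approximate shape of the per-volume tgt persistence file; real files
# may also carry CHAP credentials and write-cache directives.
conf = (
    "<target %(iqn)s>\n"
    "    backing-store %(backing)s\n"
    "    driver iscsi\n"
    "</target>\n" % {"iqn": iqn, "backing": backing}
)

# Written to a temp dir here; on a real node this would be
# /var/lib/cinder/volumes/volume-<id>, followed by re-running tgt-admin.
volumes_dir = tempfile.mkdtemp()
path = os.path.join(volumes_dir, "volume-%s" % volume_id)
with open(path, "w") as f:
    f.write(conf)

print(conf)
```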
The short-term fix was to hack mysql and set attach_status='detached'. I could then re-attach the volume to the instance, which recreated the state file in /var/lib/cinder/volumes/. The downside to this fix was that it created duplicate volume entries on the instance:
nova show:
...
| os-extended-volumes:volumes_attached | [{"id": "bc92b4ee-a6c4-430b-98c1-dbbb2ae22a78"}, {"id": "7cd4b5da-8990-49d1-adf9-ea72c6a7b976"}, {"id": "ab377917-b1fa-416a-b001-ef6b9ff09715"}, {"id": "ab377917-b1fa-416a-b001-ef6b9ff09715"}, {"id": "633b5b59-8e6c-45b3-9245-fd3d530b015a"}, {"id": "633b5b59-8e6c-45b3-9245-fd3d530b015a"}] |
...
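The database hack described above boils down to a single UPDATE against cinder's volumes table. A minimal sketch of the equivalent SQL, run here against an in-memory SQLite stand-in (the real cinder database is MySQL and the volumes table has many more columns; the schema below is illustrative only):

```python
import sqlite3

# In-memory stand-in for the cinder database; only the columns relevant
# to this fix are modeled.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE volumes (id TEXT PRIMARY KEY, attach_status TEXT)")
db.execute("INSERT INTO volumes VALUES (?, ?)",
           ("ab377917-b1fa-416a-b001-ef6b9ff09715", "attached"))

# The equivalent of the manual mysql hack: force the volume back to
# 'detached' so a fresh attach (which rewrites the state file under
# /var/lib/cinder/volumes/) is allowed.
db.execute("UPDATE volumes SET attach_status = 'detached' WHERE id = ?",
           ("ab377917-b1fa-416a-b001-ef6b9ff09715",))
db.commit()

status, = db.execute("SELECT attach_status FROM volumes WHERE id = ?",
                     ("ab377917-b1fa-416a-b001-ef6b9ff09715",)).fetchone()
print(status)  # detached
```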
This caused issues for the tenant's instance, as the RAID volumes failed to activate. Once I cleaned up nova.block_device_mapping and set deleted=1 for the duplicate entries, I was able to get the instance to see the correct volumes. It is unclear why the state files were removed from /var/lib/cinder/volumes. In hindsight, these can be re-created by hand from the data in nova.block_device_mapping; alternatively, prior to rolling this change out, the instances should be shut down and the volumes detached, then re-attached after the update.
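The block_device_mapping cleanup can be sketched the same way, again against an in-memory SQLite stand-in with an illustrative schema (the real nova table is MySQL with many more columns): keep the oldest row per volume and soft-delete the rest, mirroring nova's deleted=1 convention.

```python
import sqlite3

# Illustrative stand-in for nova.block_device_mapping.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE block_device_mapping
              (id INTEGER PRIMARY KEY, volume_id TEXT, deleted INTEGER)""")
rows = [
    (1, "ab377917-b1fa-416a-b001-ef6b9ff09715", 0),
    (2, "ab377917-b1fa-416a-b001-ef6b9ff09715", 0),  # duplicate from re-attach
    (3, "633b5b59-8e6c-45b3-9245-fd3d530b015a", 0),
    (4, "633b5b59-8e6c-45b3-9245-fd3d530b015a", 0),  # duplicate from re-attach
]
db.executemany("INSERT INTO block_device_mapping VALUES (?, ?, ?)", rows)

# Soft-delete every duplicate mapping, keeping the lowest id per volume.
db.execute("""UPDATE block_device_mapping SET deleted = 1
              WHERE id NOT IN (SELECT MIN(id) FROM block_device_mapping
                               GROUP BY volume_id)""")
db.commit()

live = [v for v, in db.execute(
    "SELECT volume_id FROM block_device_mapping WHERE deleted = 0 ORDER BY id")]
print(live)  # one mapping per volume remains
```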