bdm save failure leaves inconsistent data during attaching volume

Bug #1695187 reported by Felix Ma
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Invalid
Undecided
Unassigned

Bug Description

The bug is found in Kilo environment. It exists in master.

Before saving the bdm, the volume has been attached at the hypervisor level.
When save fails, you will see the VM has virtual disk (virsh domblklist)
but can't see it when you do nova show or cinder show.
Even worse, in my IP-SAN environment, the configuration isn't cleaned at the array,
which might make data inconsistent.

nova/virt/block_device.py

    @update_db
    def attach(self, context, instance, volume_api, virt_driver,
               do_driver_attach=False, **kwargs):
    .....
        if volume['attach_status'] == "detached":
            # NOTE(mriedem): save our current state so connection_info is in
            # the database before the volume status goes to 'in-use' because
            # after that we can detach and connection_info is required for
            # detach.
            self.save() <==== Errors here !!!
            try:
                volume_api.attach(context, volume_id, instance.uuid,
                                  self['mount_device'], mode=mode)
            except Exception:
                with excutils.save_and_reraise_exception():
                    if do_driver_attach:
                        try:
                            virt_driver.detach_volume(connection_info,
                                                      instance,
                                                      self['mount_device'],
                                                      encryption=encryption)
                        except Exception:
                            LOG.warning(_LW("Driver failed to detach volume "
                                         "%(volume_id)s at %(mount_point)s."),
                                     {'volume_id': volume_id,
                                      'mount_point': self['mount_device']},
                                     exc_info=True, instance=instance)
                    volume_api.terminate_connection(context, volume_id,
                                                    connector)

The trace:
----------
2017-05-24 17:24:23.272 3125 ERROR nova.compute.manager [req-6040188f-4338-47a1-855d-4ecc0eb71203 62f52135115f4898bd0d82c1f0cd632b 6c149dcd3cf64171b8dd972dd03bbac0 - - -] [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] Failed to attach 2421acfd-0f94-4aca-81d0-747bf803aed7 at /dev/vdb
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] Traceback (most recent call last):
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 5226, in _attach_volume
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] do_check_attach=False, do_driver_attach=True)
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 53, in wrapped
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] obj.save()
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 321, in save
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] super(DriverVolumeBlockDevice, self).save()
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/nova/virt/block_device.py", line 143, in save
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] self._bdm_obj.save()
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/nova/objects/base.py", line 192, in wrapper
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] self._context, self, fn.__name__, args, kwargs)
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/nova/conductor/rpcapi.py", line 340, in object_action
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] objmethod=objmethod, args=args, kwargs=kwargs)
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 156, in call
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] retry=self.retry)
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 90, in _send
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] timeout=timeout, retry=retry)
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 350, in send
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] retry=retry)
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 341, in _send
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] raise result
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] TypeError: 'NoneType' object has no attribute '__getitem__'
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] Traceback (most recent call last):
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57]
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 423, in _object_dispatch
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] return getattr(target, method)(*args, **kwargs)
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57]
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/nova/objects/base.py", line 208, in wrapper
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] return fn(self, *args, **kwargs)
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57]
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/nova/objects/block_device.py", line 175, in save
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] self._from_db_object(self._context, self, updated)
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57]
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] File "/usr/lib/python2.7/site-packages/nova/objects/block_device.py", line 89, in _from_db_object
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] block_device_obj[key] = db_block_device[key]
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57]
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57] TypeError: 'NoneType' object has no attribute '__getitem__'
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57]
2017-05-24 17:24:23.272 3125 TRACE nova.compute.manager [instance: 63b7baae-599c-41b3-bbee-f59bbc239e57]
----------

Felix Ma (felix23ma)
Changed in nova:
assignee: nobody → Felix Ma (felix23ma)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/470181

Sean Dague (sdague)
Changed in nova:
status: New → In Progress
Revision history for this message
Sean Dague (sdague) wrote :

Automatically discovered version kilo in description. If this is incorrect, please update the description to include 'nova version: ...'

tags: added: openstack-version.kilo
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Stephen Finucane (<email address hidden>) on branch: master
Review: https://review.opendev.org/470181
Reason: This has been in merge conflict for sometime so I'm going to abandon it to keep things clean. Please reopen if you think it's still valid :)

Changed in nova:
assignee: Felix Ma (felix23ma) → nobody
status: In Progress → New
Revision history for this message
Matt Riedemann (mriedem) wrote :

This is really old and needs to be re-triaged/re-reported if it's still a valid issue on master.

Changed in nova:
status: New → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.