Backup fails when calling initialize_connection on the backup device

Bug #1607333 reported by huanan
Affects: Cinder
Status: Fix Released
Importance: High
Assigned to: Michal Dulko
Milestone: newton-3

Bug Description

What I did:
    1. Created a 1 GB volume and then backed it up with 'cinder backup-create volume_uuid'.
I then got the following error:

c-vol log:

2016-07-28 19:09:06.715 INFO cinder.backup.manager [req-13fd7f16-a706-44eb-b490-72b26d0ce25d aaae5b52c427482290e213ba8381d0bd b54d0b554d654d5d9dbc1a13ffd2aae7] Create backup started, backup: 3f8d2483-d254-4584-bbe9-012dd94b585a volume: 92c2bcfd-b148-4420-8f4a-06823b8edb89.
2016-07-28 19:09:14.481 ERROR oslo_messaging.rpc.server [req-13fd7f16-a706-44eb-b490-72b26d0ce25d aaae5b52c427482290e213ba8381d0bd b54d0b554d654d5d9dbc1a13ffd2aae7] Exception during message handling
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server Traceback (most recent call last):
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server result = func(ctxt, **new_args)
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/manager.py", line 406, in create_backup
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server self._update_backup_error(backup, context, six.text_type(err))
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server self.force_reraise()
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/manager.py", line 400, in create_backup
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server self._run_backup(context, backup, volume)
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/manager.py", line 436, in _run_backup
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server properties, is_snapshot)
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/manager.py", line 907, in _attach_device
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server return self._attach_volume(context, backup_device, properties)
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/manager.py", line 934, in _attach_volume
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server {'volume_id', volume.id})
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server self.force_reraise()
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/backup/manager.py", line 922, in _attach_volume
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server properties)
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server File "/opt/stack/cinder/cinder/volume/rpcapi.py", line 203, in initialize_connection
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server msg_args = {'volume_id': volume.id, 'connector': connector,
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server AttributeError: 'dict' object has no attribute 'id'
2016-07-28 19:09:14.481 TRACE oslo_messaging.rpc.server

I found that this patch https://review.openstack.org/#/c/238894/ may have caused this problem, in cinder.backup.manager:

    def _run_backup(self, context, backup, volume):
        backup_service = self.service.get_backup_driver(context)

        properties = utils.brick_get_connector_properties()
        backup_dic = self.volume_rpcapi.get_backup_device(context,
                                                          backup, volume)
        try:
            backup_device = backup_dic.get('backup_device')
            is_snapshot = backup_dic.get('is_snapshot')
            attach_info = self._attach_device(context, backup_device,
                                              properties, is_snapshot)

backup_device is a dict, not an object, so the backup volume function cannot work properly; this needs to be fixed as soon as possible.
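To make the type mismatch concrete, here is a minimal standalone sketch (the class and function names below are illustrative stand-ins, not Cinder code): the volume RPC API builds its message arguments through attribute access, so handing it the dict shown in the traceback fails exactly as logged.

    # Illustrative sketch only; VolumeObject and initialize_connection are
    # stand-ins for the real Cinder object and RPC API call.
    class VolumeObject(object):
        """Stand-in for an oslo.versionedobjects Volume."""
        def __init__(self, id):
            self.id = id

    def initialize_connection(volume):
        # Mirrors what cinder/volume/rpcapi.py does when building msg_args:
        # it reads volume.id, so it needs an object, not a dict.
        return {'volume_id': volume.id}

    print(initialize_connection(VolumeObject('vol-1')))  # works: {'volume_id': 'vol-1'}

    try:
        initialize_connection({'id': 'vol-1'})  # a plain dict, as in the traceback
    except AttributeError as exc:
        print(exc)  # 'dict' object has no attribute 'id'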

Changed in cinder:
status: New → Confirmed
assignee: nobody → Michal Dulko (michal-dulko-f)
status: Confirmed → In Progress
Revision history for this message
Lisa Li (lisali) wrote :

Are you testing on the current master branch? I ran tests and didn't hit the problem described above.

Revision history for this message
huanan (huanan) wrote :

Yes, I'm testing on the current master branch, since the patch https://review.openstack.org/#/c/238894/ merged.

Revision history for this message
Michal Dulko (michal-dulko-f) wrote :

@hunan: What Cinder driver are you running?

Revision history for this message
huanan (huanan) wrote :

Huawei driver.

Revision history for this message
Michal Dulko (michal-dulko-f) wrote :

@hunan: Are you perhaps running multiple nodes, with some of them not on the newest master?

Revision history for this message
huanan (huanan) wrote :

We use one node, and the environment is the newest master.

Revision history for this message
Michal Dulko (michal-dulko-f) wrote :

I'm sorry, I made a mistake in your nick in the previous comments.

This complicates things a little, as I don't see a code path where this can happen. Can you share cinder-volume and cinder-backup logs gathered while the error occurs?

Revision history for this message
huanan (huanan) wrote :

Sure, I will send them to you by email.

Revision history for this message
huanan (huanan) wrote :

/opt/stack/cinder/cinder/backup/manager.py(434)_run_backup()
-> backup_device = backup_dic.get('backup_device')
(Pdb) l
429 properties = utils.brick_get_connector_properties()
430 backup_dic = self.volume_rpcapi.get_backup_device(context,
431 backup, volume)
432 try:
433 import pdb;pdb.set_trace()
434 -> backup_device = backup_dic.get('backup_device')
435 is_snapshot = backup_dic.get('is_snapshot')
436 attach_info = self._attach_device(context, backup_device,
437 properties, is_snapshot)
438 try:
439 device_path = attach_info['device']['path']
(Pdb) p backup_dic
{u'secure_enabled': False, u'is_snapshot': False, u'backup_device': {u'migration_status': None, u'provider_id': None, u'availability_zone': u'nova', u'terminated_at': None, u'updated_at': u'2016-08-02T01:41:40.000000', u'provider_geometry': None, u'replication_extended_status': None, u'replication_status': u'disabled', u'snapshot_id': None, u'ec2_id': None, u'deleted_at': None, u'id': u'221b42b0-8540-4220-8004-c2b55c114b20', u'size': 1, u'user_id': u'aaae5b52c427482290e213ba8381d0bd', u'volume_attachment': [], u'display_description': None, u'cluster_name': None, u'volume_admin_metadata': [{u'deleted': False, u'created_at': u'2016-08-02T01:39:52.000000', u'updated_at': None, u'value': u'63400a31000d844d3ac699940000005c', u'key': u'huawei_lun_wwn', u'volume_id': u'221b42b0-8540-4220-8004-c2b55c114b20', u'deleted_at': None, u'id': 32}], u'project_id': u'b54d0b554d654d5d9dbc1a13ffd2aae7', u'launched_at': u'2016-08-02T01:39:52.000000', u'scheduled_at': u'2016-08-02T01:39:47.000000', u'status': u'backing-up', u'volume_type_id': None, u'multiattach': False, u'deleted': False, u'provider_location': u'92', u'host': u'ubuntu-hxl-160@38#StoragePool001', u'consistencygroup_id': None, u'source_volid': None, u'provider_auth': None, u'previous_status': u'available', u'display_name': u'test2', u'bootable': False, u'created_at': u'2016-08-02T01:39:40.000000', u'attach_status': u'detached', u'volume_type': None, u'consistencygroup': None, u'volume_metadata': [], u'_name_id': None, u'encryption_key_id': None, u'replication_driver_data': None}}
(Pdb) p backup_dic.get('backup_device')
{u'migration_status': None, u'provider_id': None, u'availability_zone': u'nova', u'terminated_at': None, u'updated_at': u'2016-08-02T01:41:40.000000', u'provider_geometry': None, u'replication_extended_status': None, u'replication_status': u'disabled', u'snapshot_id': None, u'ec2_id': None, u'deleted_at': None, u'id': u'221b42b0-8540-4220-8004-c2b55c114b20', u'size': 1, u'user_id': u'aaae5b52c427482290e213ba8381d0bd', u'volume_attachment': [], u'display_description': None, u'cluster_name': None, u'volume_admin_metadata': [{u'deleted': False, u'created_at': u'2016-08-02T01:39:52.000000', u'updated_at': None, u'value': u'63400a31000d844d3ac699940000005c', u'key': u'huawei_lun_wwn', u'volume_id': u'221b42b0-8540-4220-8004-c2b55c114b20', u'deleted_at': None, u'id': 32}], u'project_id': u'b54d0b554d654d5d9db...

Revision history for this message
huanan (huanan) wrote :

Hi Michal, please see the pdb log above.

Revision history for this message
huanan (huanan) wrote :

I'm sorry, I forgot to say that we configured backup_use_temp_snapshot=True and backup_use_same_host=True.
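
For context, those settings correspond to the following in cinder.conf; placing them under [DEFAULT] is my assumption about where these backup options are normally defined.

    # Assumed cinder.conf snippet matching the reporter's description;
    # the [DEFAULT] section placement is an assumption.
    [DEFAULT]
    backup_use_temp_snapshot = True
    backup_use_same_host = True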

Revision history for this message
Michal Dulko (michal-dulko-f) wrote :

https://review.openstack.org/#/c/350551/ - here's the fix; I'm not sure why Launchpad hasn't picked it up automatically.

Changed in cinder:
importance: Undecided → High
milestone: none → newton-3
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/350551
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=52efd1e7fdb6d3b5f5a1d6287fc45a54e0a9f0c6
Submitter: Jenkins
Branch: master

commit 52efd1e7fdb6d3b5f5a1d6287fc45a54e0a9f0c6
Author: Michał Dulko <email address hidden>
Date: Wed Aug 3 14:03:22 2016 +0200

    Use volume o.vo in get_backup_volume_temp_snapshot

    In the snapshot path of BaseVD's get_backup_device we were fetching a
    volume using the SQLAlchemy API instead of objects. As a result, the
    get_backup_device call in volume.manager sometimes returned a volume
    dictionary instead of an o.vo. Since backup.manager assumes that
    backup_device will be an object, the backup operation sometimes
    exploded.

    This commit fixes that by making sure that we're always returning an
    o.vo from that call.

    Change-Id: I7adf3655d9c1bc63baede8f56202fd139709e552
    Closes-Bug: 1607333
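
To illustrate the kind of change the commit message describes (a simplified sketch under my own naming assumptions, not the merged diff), the idea is to load the volume through the versioned-objects layer instead of the SQLAlchemy API so that callers of get_backup_device always receive an o.vo:

    # Simplified sketch of the pattern the commit message describes; the helper
    # name is an assumption and this is not the merged change itself.
    from cinder import objects

    def _load_backup_volume(context, volume_id):
        # Before (problematic): the temp-snapshot path fetched the volume via
        # the SQLAlchemy layer (db.volume_get), which returns a dict-like row
        # that later breaks attribute access (volume.id) in backup.manager.
        #
        # After: always hand back an oslo.versionedobjects Volume so callers
        # consistently receive an object.
        return objects.Volume.get_by_id(context, volume_id)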

Changed in cinder:
status: In Progress → Fix Released
Revision history for this message
Thierry Carrez (ttx) wrote : Fix included in openstack/cinder 9.0.0.0b3

This issue was fixed in the openstack/cinder 9.0.0.0b3 development milestone.
