[SRU] Cinder backup of a volume is in error state with fail_reason: data must be bytes

Bug #1798917 reported by David Ames on 2018-10-19
18
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Cinder
Undecided
Corey Bryant
Ubuntu Cloud Archive
High
Unassigned
cinder (Ubuntu)
High
Unassigned
Cosmic
High
Unassigned

Bug Description

[Impact]
When creating a cinder backup of a volume when using ceph as a back end and the python3 version of cinder, the backup is in error state with fail_reason: data must be bytes.

This clearly is a python3 problem. But it is unclear to me which package is at fault. python3-cinder, ceph-common, python3-rados.

$ cinder backup-list
+--------------------------------------+--------------------------------------+--------+----------+------+--------------+---------------+
| ID | Volume ID | Status | Name | Size | Object Count | Container |
+--------------------------------------+--------------------------------------+--------+----------+------+--------------+---------------+
| 75bfe189-6ac4-4bea-bb53-e523a8f1a9af | 64c5a514-0641-4b25-bd8f-092f40249400 | error | demo-bak | 1 | 0 | cinder-backup |
+--------------------------------------+--------------------------------------+--------+----------+------+--------------+---------------+

$ cinder backup-show 75bfe189-6ac4-4bea-bb53-e523a8f1a9af
+-----------------------------------+--------------------------------------+
| Property | Value |
+-----------------------------------+--------------------------------------+
| availability_zone | nova |
| container | cinder-backup |
| created_at | 2018-10-19T15:14:10.000000 |
| data_timestamp | 2018-10-19T15:14:10.000000 |
| description | None |
| fail_reason | data must be bytes |
| has_dependent_backups | False |
| id | 75bfe189-6ac4-4bea-bb53-e523a8f1a9af |
| is_incremental | False |
| metadata | {} |
| name | demo-bak |
| object_count | 0 |
| os-backup-project-attr:project_id | a399969417f146b0b7c8fce20c65c1e3 |
| size | 1 |
| snapshot_id | None |
| status | error |
| updated_at | 2018-10-19T15:14:16.000000 |
| volume_id | 64c5a514-0641-4b25-bd8f-092f40249400 |
+-----------------------------------+--------------------------------------+

cinder-backup.log:
e7545f300fb23 38ecce7606e84e17b28e7545f300fb23] Create backup started, backup: dadf4f46-91f0-457b-9c47-edef63381c8e volume: 5c0d8a28-e391-4533-8b9c-5304b4afeefb.
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server [req-8099dad4-2f4a-4698-b1c1-0fb4a3f2da0e b5f325fae3904bae80a9366efaf12766 143dc30f79f34c649e4b50d64dc7c62a - 38ecce7606e84e17b28e7545f300fb23 38ecce7606e84e17b28e7545f300fb23] Exception during message handling: TypeError: data must be bytes
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/cinder/backup/manager.py", line 425, in create_backup
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server self._update_backup_error(backup, six.text_type(err))
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server self.force_reraise()
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/six.py", line 693, in reraise
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server raise value
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/cinder/backup/manager.py", line 414, in create_backup
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server updates = self._run_backup(context, backup, volume)
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/cinder/backup/manager.py", line 502, in _run_backup
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server tpool.Proxy(device_path))
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/cinder/backup/drivers/ceph.py", line 950, in backup
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server self._backup_metadata(backup)
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/cinder/backup/drivers/ceph.py", line 903, in _backup_metadata
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server vol_meta_backup.set(json_meta)
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/cinder/backup/drivers/ceph.py", line 139, in set
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server meta_obj.write(json_meta)
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 186, in doit
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server result = proxy_call(self._autowrap, f, *args, **kwargs)
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 144, in proxy_call
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server rv = execute(f, *args, **kwargs)
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 125, in execute
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server six.reraise(c, e, tb)
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/eventlet/support/six.py", line 625, in reraise
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server raise value
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 83, in tworker
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server rv = meth(*args, **kwargs)
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "rados.pyx", line 3939, in rados.set_object_locator.retfunc
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "rados.pyx", line 3949, in rados.set_object_namespace.retfunc
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "rados.pyx", line 3986, in rados.Object.write
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "rados.pyx", line 517, in rados.requires.wrapper.validate_func
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "rados.pyx", line 505, in rados.requires.check_type
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server TypeError: data must be bytes
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server

Packages:
ii cinder-api 2:13.0.0-0ubuntu3~cloud0 all Cinder storage service - API server
ii cinder-backup 2:13.0.0-0ubuntu3~cloud0 all Cinder storage service - Scheduler server
ii cinder-common 2:13.0.0-0ubuntu3~cloud0 all Cinder storage service - common files
ii cinder-scheduler 2:13.0.0-0ubuntu3~cloud0 all Cinder storage service - Scheduler server
ii cinder-volume 2:13.0.0-0ubuntu3~cloud0 all Cinder storage service - Volume server
ii python3-cinder 2:13.0.0-0ubuntu3~cloud0 all Cinder Python 3 libraries
ii ceph-common 13.2.1+dfsg1-0ubuntu2~cloud0 amd64 common utilities to mount and interact with a ceph storage cluster
ii libcephfs2 13.2.1+dfsg1-0ubuntu2~cloud0 amd64 Ceph distributed file system client library
ii python-cephfs 13.2.1+dfsg1-0ubuntu2~cloud0 amd64 Python libraries for the Ceph libcephfs library
ii librados2 13.2.1+dfsg1-0ubuntu2~cloud0 amd64 RADOS distributed object store client library
ii libradosstriper1 13.2.1+dfsg1-0ubuntu2~cloud0 amd64 RADOS striping interface
ii python-rados 13.2.1+dfsg1-0ubuntu2~cloud0 amd64 Python libraries for the Ceph librados library
ii python3-rados 13.2.1+dfsg1-0ubuntu2~cloud0 amd64 Python 3 libraries for the Ceph librados library

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: python3-cinder 2:13.0.0-0ubuntu3~cloud0 [origin: Canonical]
ProcVersionSignature: Ubuntu 4.15.0-36.39-generic 4.15.18
Uname: Linux 4.15.0-36-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.4
Architecture: amd64
CrashDB:
 {
                "impl": "launchpad",
                "project": "cloud-archive",
                "bug_pattern_url": "http://people.canonical.com/~ubuntu-archive/bugpatterns/bugpatterns.xml",
             }
Date: Fri Oct 19 23:20:06 2018
Ec2AMI: ami-000001ca
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: nova
Ec2InstanceType: m1.blue
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
ExecutablePath: /usr/bin/python3-cinder-backup
InterpreterPath: /usr/bin/python3.6
PackageArchitecture: all
ProcEnviron:
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/false
Python3Details: /usr/bin/python3.6, Python 3.6.6, python3-minimal, 3.6.5-3ubuntu1
PythonDetails: /usr/bin/python2.7, Python 2.7.15rc1, python-minimal, 2.7.15~rc1-1
SourcePackage: cinder
UpgradeStatus: No upgrade log present (probably fresh install)

[Test Case]
Deploy OpenStack with a python 3 cinder having a ceph backend. Create a cinder volume backup and ensure it is successful.

[Regression Potential]
The patch that is cherry-picked has landed upstream, minimizing the potential of regression.

David Ames (thedac) wrote :
David Ames (thedac) wrote :

Looks like python3-cinder

2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server File "/usr/lib/python3/dist-packages/cinder/backup/drivers/ceph.py", line 139, in set
2018-10-18 22:17:20.284 15452 ERROR oslo_messaging.rpc.server meta_obj.write(json_meta)

dpkg -S /usr/lib/python3/dist-packages/cinder/backup/drivers/ceph.py
python3-cinder: /usr/lib/python3/dist-packages/cinder/backup/drivers/ceph.py

Changed in cloud-archive:
status: New → Triaged
importance: Undecided → High
Changed in cinder (Ubuntu):
status: New → Triaged
importance: Undecided → High
no longer affects: cinder (Ubuntu Dd-series)

Fix proposed to branch: master
Review: https://review.openstack.org/612490

Changed in cinder:
assignee: nobody → Corey Bryant (corey.bryant)
status: New → In Progress

After looking into this some more I think the current approach of utf-8 encoding in VolumeMetadataBackup.set() prior to the write and utf-8 decoding in VolumeMetadataBackup.get() after the read makes sense. More details:

On the way in json_meta is a json-encoded dict containing all metadata is actually set to in the calls to VolumeMetadataBackup.set(json_meta):
1) In CephBackupDriver._backup_metadata(), 'json_meta = self.get_metadata' is set before calling VolumeMetadataBackup.set(json_meta)
2) CephBackupDriver inherits from driver.BackupDriver (from cinder.backup import driver)
3) In cinder/backup/driver.py, class BackupDriver has get_metadata which calls BackupMetadataAPI.get.
4) BackupMetadataAPI.get() gets the volume metadata, returning jsonutils.dumps(container), a json-encoded dict containing all metadata [1].

And on the way back out from VolumeMetadataBackup.get(json_meta):
1) CephBackupDriver._restore_metadata() calls VolumeMetadataBackup.get() to read the object from ceph.
2) The result (meta) is then sent to self.put_metadata(volume_id, meta), from driver.BackupDriver.
3) put_metadata() calls BackupMetadataAPI.put()
4) BackupMetadataAPI.put() restores volume metadata to a volume, using jsonutils.loads(json_metadata) to decode the encoded dict.

jsonutils.dumps() is from oslo.serialization (oslo_serialization/jsonutils.py) and serialize an obj to a JSON formatted str. jsonutils.loads() deserializes the JSON formatted str back to an obj [1]

[1] For example:
>>> from oslo_serialization import jsonutils
>>> container = {'version': 1}
>>> jsonutils.dumps(container) # on the way in
'{"version": 1}'
>>> jsonutils.loads(jsonutils.dumps(container)) # on the way out
{'version': 1}

Corey Bryant (corey.bryant) wrote :

This has been uploaded to the cosmic unapproved queue where it is awaiting SRU Team review: https://launchpad.net/ubuntu/cosmic/+queue?queue_state=1&queue_text=

summary: - Cinder backup of a volume is in error state with fail_reason: data must
- be bytes
+ [SRU] Cinder backup of a volume is in error state with fail_reason: data
+ must be bytes
description: updated

Reviewed: https://review.openstack.org/612490
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=f3d6ae88add6e06107fc3ece30d8535693f739a2
Submitter: Zuul
Branch: master

commit f3d6ae88add6e06107fc3ece30d8535693f739a2
Author: Corey Bryant <email address hidden>
Date: Mon Oct 22 14:45:33 2018 -0400

    PY3: Ensure rados.Object.read/write use byte data

    rados.Object.write(string_to_write) [1] and rados.Ioctx.write(data) [2]
    expect to write byte data for the above named arguments.
    rados.Object.read() returns that data.

    Ensure that the json_meta argument passed to rados.Object.write() in
    VolumeMetadataBackup.set() is encoded as a UTF-8 bytes object and decoded
    after rados.Object.read().

    Also update the corresponding unit tests to ensure that metadata
    dictionaries are JSON serialized, encoded, and decoded similar to
    how the actual code behaves. [3]

    [1] https://github.com/ceph/ceph/blob/v13.2.1/src/pybind/rados/rados.pyx#L3984
    [2] https://github.com/ceph/ceph/blob/v13.2.1/src/pybind/rados/rados.pyx#L2641
    [3] https://bugs.launchpad.net/cinder/+bug/1798917/comments/4

    Change-Id: Idb225b5c84be3beac0c272ed4b8d69ebb04c5858
    Closes-Bug: #1798917

Changed in cinder:
status: In Progress → Fix Released

Reviewed: https://review.openstack.org/612990
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=8280fa0df61be7097bf44c93338f530bbd03b7cc
Submitter: Zuul
Branch: stable/rocky

commit 8280fa0df61be7097bf44c93338f530bbd03b7cc
Author: Corey Bryant <email address hidden>
Date: Mon Oct 22 14:45:33 2018 -0400

    PY3: Ensure rados.Object.read/write use byte data

    rados.Object.write(string_to_write) [1] and rados.Ioctx.write(data) [2]
    expect to write byte data for the above named arguments.
    rados.Object.read() returns that data.

    Ensure that the json_meta argument passed to rados.Object.write() in
    VolumeMetadataBackup.set() is encoded as a UTF-8 bytes object and decoded
    after rados.Object.read().

    Also update the corresponding unit tests to ensure that metadata
    dictionaries are JSON serialized, encoded, and decoded similar to
    how the actual code behaves. [3]

    [1] https://github.com/ceph/ceph/blob/v13.2.1/src/pybind/rados/rados.pyx#L3984
    [2] https://github.com/ceph/ceph/blob/v13.2.1/src/pybind/rados/rados.pyx#L2641
    [3] https://bugs.launchpad.net/cinder/+bug/1798917/comments/4

    Change-Id: Idb225b5c84be3beac0c272ed4b8d69ebb04c5858
    Closes-Bug: #1798917

tags: added: in-stable-rocky
Corey Bryant (corey.bryant) wrote :

Comments #9 and #10 are no longer applicable. Hiding.

Hello David, or anyone else affected,

Accepted cinder into cosmic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cinder/2:13.0.0-0ubuntu4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-cosmic to verification-done-cosmic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-cosmic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in cinder (Ubuntu Cosmic):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-cosmic
Changed in cloud-archive:
status: Triaged → Fix Committed
Corey Bryant (corey.bryant) wrote :

Verified successfully on rocky-proposed: https://paste.ubuntu.com/p/TN3FzxS9ND/

Corey Bryant (corey.bryant) wrote :

Verified successfully on cosmic-proposed: https://paste.ubuntu.com/p/G4w9hJvrSW/

tags: added: verification-done verification-done-cosmic verification-rocky-done
removed: verification-needed verification-needed-cosmic
Brian Murray (brian-murray) wrote :

This seems to still need fixing for disco.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cinder - 2:14.0.0~b1~git20181113.f0aa3665e-0ubuntu1

---------------
cinder (2:14.0.0~b1~git20181113.f0aa3665e-0ubuntu1) disco; urgency=medium

  * New upstream snapshot for OpenStack Stein.
  * d/control: Align (Build-)Depends with upstream.
  * d/p/rados-read-write-byte-data.patch: Dropped. Fixed in upstream snapshot.
  * d/rules: Ensure python3 shebangs are set to /usr/bin/python3.
  * d/rules: Enable Python 3.7 tests.
  * d/rules: Update PBR_VERSION to handle ~git versioning.
  * d/p/skip-taskflow-tests-py37.patch: Skip tests that are failing for
    Python 3.7 due to taskflow StopIteration issue.
  * d/t/cinder-shebangs-py3: Update shebang for Python 3.7.
  * d/watch: Scope to 14.x series.

 -- Corey Bryant <email address hidden> Tue, 13 Nov 2018 12:09:27 -0500

Changed in cinder (Ubuntu):
status: Triaged → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Duplicates of this bug

Other bug subscribers