Add support for Pacific to RBD driver

Bug #1931003 reported by Jon Bernard
This bug affects 5 people
Affects: Cinder
Status: New
Importance: Low
Assigned to: Unassigned

Bug Description

When using ceph pacific, volume-from-image operations where both glance and cinder are configured to use RBD result in an exception when calling clone():

    rbd.InvalidArgument: [errno 22] RBD invalid argument (error creating clone)

    ERROR cinder.volume.manager Traceback (most recent call last):
    ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task
    ERROR cinder.volume.manager result = task.execute(**arguments)
    ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/volume/flows/manager/create_volume.py", line 1132, in execute
    ERROR cinder.volume.manager model_update = self._create_from_image(context,
    ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/utils.py", line 638, in _wrapper
    ERROR cinder.volume.manager return r.call(f, *args, **kwargs)
    ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 411, in call
    ERROR cinder.volume.manager return self.__call__(*args, **kwargs)
    ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 423, in __call__
    ERROR cinder.volume.manager do = self.iter(retry_state=retry_state)
    ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 360, in iter
    ERROR cinder.volume.manager return fut.result()
    ERROR cinder.volume.manager File "/usr/lib64/python3.9/concurrent/futures/_base.py", line 438, in result
    ERROR cinder.volume.manager return self.__get_result()
    ERROR cinder.volume.manager File "/usr/lib64/python3.9/concurrent/futures/_base.py", line 390, in __get_result
    ERROR cinder.volume.manager raise self._exception
    ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 426, in __call__
    ERROR cinder.volume.manager result = fn(*args, **kwargs)
    ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/volume/flows/manager/create_volume.py", line 998, in _create_from_image
    ERROR cinder.volume.manager model_update, cloned = self.driver.clone_image(context,
    ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 1571, in clone_image
    ERROR cinder.volume.manager volume_update = self._clone(volume, pool, image, snapshot)
    ERROR cinder.volume.manager File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 1023, in _clone
    ERROR cinder.volume.manager self.RBDProxy().clone(src_client.ioctx,
    ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/eventlet/tpool.py", line 190, in doit
    ERROR cinder.volume.manager result = proxy_call(self._autowrap, f, *args, **kwargs)
    ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/eventlet/tpool.py", line 148, in proxy_call
    ERROR cinder.volume.manager rv = execute(f, *args, **kwargs)
    ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/eventlet/tpool.py", line 129, in execute
    ERROR cinder.volume.manager six.reraise(c, e, tb)
    ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/six.py", line 719, in reraise
    ERROR cinder.volume.manager raise value
    ERROR cinder.volume.manager File "/usr/local/lib/python3.9/site-packages/eventlet/tpool.py", line 83, in tworker
    ERROR cinder.volume.manager rv = meth(*args, **kwargs)
    ERROR cinder.volume.manager File "rbd.pyx", line 698, in rbd.RBD.clone
    ERROR cinder.volume.manager rbd.InvalidArgument: [errno 22] RBD invalid argument (error creating clone)
    ERROR cinder.volume.manager

In Pacific, a check was added to ensure that, during a clone operation, the child's stripe unit is not smaller than its parent's. Failing this check returns -EINVAL, which python-rbd raises as an exception. The stripe unit maps to the 'order' argument of clone(), where order is the base-2 logarithm of the stripe unit in bytes. Ceph's default is 4 megabytes. We're seeing EINVAL exceptions in the Pacific CI because, when OpenStack is configured to use Ceph for both Cinder and Glance, volume-from-image tests hit this check: Glance's default chunk size is 8 MB, larger than Cinder's 4 MB. Cinder therefore computes an order of 22 (2^22 bytes = 4 MiB) for the clone, which is smaller than the parent image's order of 23 (8 MiB), so clone() rejects it.
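The order arithmetic above can be checked with a short sketch. The helper name `chunk_size_to_order` is hypothetical and only illustrates the log-base-2 relationship; it is not taken from the Cinder or Glance source.

```python
import math

MIB = 1024 ** 2  # bytes per mebibyte


def chunk_size_to_order(chunk_size_mib: int) -> int:
    """Return the RBD 'order': log base 2 of the object size in bytes."""
    return int(math.log2(chunk_size_mib * MIB))


# Defaults as described in this report: Cinder 4, Glance 8.
cinder_order = chunk_size_to_order(4)
glance_order = chunk_size_to_order(8)

# The child (Cinder volume) order is smaller than the parent (Glance image)
# order, which is the condition Pacific rejects with EINVAL.
child_smaller_than_parent = cinder_order < glance_order
```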

I see two possible solutions and have proposed patches:

1. Increase Cinder's default chunk size to match Glance's. I think this makes sense for both consistency and performance.

2. When doing a clone(), consider both the configured chunk size and the stripe unit of the parent volume, and choose the higher value.

Either of these approaches prevents the failures we're seeing, and I think both are useful individually as well.
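A minimal sketch of the second approach, assuming the parent's order is already known (python-rbd's `Image.stat()` returns a dict that includes an `order` key, which could supply it). The function name `clone_order` and its parameters are hypothetical; this is not the proposed patch.

```python
import math


def clone_order(configured_chunk_mib: int, parent_order: int) -> int:
    """Pick an order for clone() that is never below the parent's order.

    configured_chunk_mib: the driver's configured chunk size in MiB.
    parent_order: log base 2 of the parent image's object size in bytes.
    """
    configured_order = int(math.log2(configured_chunk_mib * 1024 ** 2))
    # Use whichever is larger, so the child is never smaller than its parent.
    return max(configured_order, parent_order)
```

With the defaults described in this report, `clone_order(4, 23)` yields 23, so the clone would match the Glance image rather than fall below it.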

Tags: clone glance rbd
Sofia Enriquez (lsofia-enriquez) wrote :
Changed in cinder:
importance: Undecided → Low
Boris Lukashev (rageltman) wrote :

Cinder deals with a lot more small IO requests than Glance, so doubling the chunk size might not be ideal. Would conversion of volumes be required in an upgrade scenario? #2 seems like the more compatible approach given that Kolla-Ansible doesn't manage Ceph these days, and users are going to bring all sorts of Ceph backends into the mix on their own (we're using Canonical's for instance).
