OpenStack provisioner should have "volume-zones" constraints as "zones" for Cinder volumes, including with root-disk=volume constraint

Bug #1844099 reported by Pedro Guimarães
Affects: Canonical Juju
Status: Triaged
Importance: Low
Assigned to: Unassigned

Bug Description

Generally, OpenStack is deployed with Ceph. That means Cinder has little to worry about regarding volume high availability, since Ceph handles cross-zone replication.

However, if the Cinder storage backend does not support that type of replication, then we need to be mindful of where virtual disks are placed. The simplest example is probably the LVM storage backend.

Therefore, besides the "zones" constraint that applies to VMs, Juju should either apply the same "zones" constraint when allocating Cinder volumes, or define a Cinder-specific "volume-zones" constraint, since Cinder's and Nova's AZ configurations may result in different names and layouts.
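The mismatch is visible directly on a deployed cloud: Nova and Cinder keep independent AZ lists, queried separately. A sketch (requires a running cloud with credentials sourced; output will differ per deployment):

```
# Nova (compute) AZs and Cinder (volume) AZs are independent namespaces:
openstack availability zone list --compute
openstack availability zone list --volume
```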

Revision history for this message
Pedro Guimarães (pguimaraes) wrote :

I've run a lab with following OpenStack bundle: https://pastebin.canonical.com/p/4jm8DmSPj4/

Running with 2 Cinder AZs, both connected to Ceph (to ensure the services stay up):
$ cinder service-list
+------------------+-------------------------------+------+---------+-------+----------------------------+-----------------+
| Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+------------------+-------------------------------+------+---------+-------+----------------------------+-----------------+
| cinder-scheduler | cinder-az1 | az1 | enabled | up | 2019-09-16T12:15:22.000000 | - |
| cinder-scheduler | juju-d23cf7-1-lxd-1 | az1 | enabled | down | 2019-09-14T15:29:36.000000 | - |
| cinder-volume | cinder-az1@cinder-ceph | az1 | enabled | up | 2019-09-16T12:15:22.000000 | - |
| cinder-volume | cinder-volume-az2@cinder-ceph | az2 | enabled | up | 2019-09-16T12:15:27.000000 | - |
| cinder-volume | juju-d23cf7-1-lxd-1@LVM | az1 | enabled | down | 2019-09-14T15:29:28.000000 | - |
| cinder-volume | juju-d23cf7-1-lxd-2@LVM | az2 | enabled | down | 2019-09-14T15:28:59.000000 | - |
+------------------+-------------------------------+------+---------+-------+----------------------------+-----------------+

I've booted 7 cs:ubuntu VMs on this OpenStack with the following bundle: https://pastebin.canonical.com/p/VnSPZY4FSm/

Notice that I am using the root-disk=volume constraint.

I can see that all volumes map to "az1":

$ for i in $(openstack volume list | tail -n +4 | head -n -1 | awk '{print $2}'); do echo $i; openstack volume show $i | grep "availability_zone"; done
f69c2274-134c-4cab-b948-276b84aa18f2
| availability_zone | az1 |
4a099ccc-e150-4b42-9622-0200e68e442a
| availability_zone | az1 |
84504bc3-75f6-4fa4-a547-03e8e7807e51
| availability_zone | az1 ...
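The per-AZ spread can be tallied mechanically. A minimal sketch, using the volume IDs above as mocked `openstack volume show` output so no cloud is needed:

```shell
#!/bin/sh
# Mocked "volume-id az" pairs standing in for `openstack volume show` output.
cat > /tmp/volume_azs.txt <<'EOF'
f69c2274-134c-4cab-b948-276b84aa18f2 az1
4a099ccc-e150-4b42-9622-0200e68e442a az1
84504bc3-75f6-4fa4-a547-03e8e7807e51 az1
EOF

# Count volumes per availability zone; every volume landing in a single AZ
# is exactly the symptom this bug describes.
awk '{count[$2]++} END {for (z in count) print z, count[z]}' /tmp/volume_azs.txt
```

With volumes spread as intended the tally would show more than one zone; here it prints `az1 3`.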


summary: OpenStack provisioner should have "volume-zones" constraints as "zones"
- for Cinder volumes
+ for Cinder volumes, including with root-disk=volume constraint
Revision history for this message
John A Meinel (jameinel) wrote :

Are you saying that if Juju picks a zone for a given instance, it should try to request the volume storage from that zone? It sounds like Ceph can have a completely different idea of zones from Nova. Is that functionally useful?
Is it that you would like to do something like root-disk-source=volume,zone=X or some other technique?
Wouldn't it be OpenStack that does the recommended scheduling of where the instance runs vs. where it gets the volume from?

Changed in juju:
importance: Undecided → Medium
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired
Changed in juju:
status: Expired → New
Revision history for this message
Pedro Guimarães (pguimaraes) wrote :

@jameinel
Indeed OpenStack has AZ concept for VMs and an entirely different concept of AZs for Cinder.

Ceph as a backend is entirely different, and we do not need to worry much about data availability there. This bug refers to cases where the Cinder backend is NOT Ceph: for instance, several storage arrays, or LVM on each compute node.

In this case, one can define AZs for cinder-volume services and a default AZ for cinder-api.

If you don't define which zone a volume comes from, OpenStack will always go for cinder-api's default_availability_zone value.
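That defaulting behaviour can be modelled as a toy sketch (the function name and logic are illustrative, not actual Cinder code): when a volume request names no AZ, cinder-api falls back to its default_availability_zone.

```shell
#!/bin/sh
# Toy model of Cinder's AZ selection for a new volume (illustrative only).
DEFAULT_AZ="az1"   # cinder-api's default_availability_zone

pick_az() {
    # Use the requested AZ if given, otherwise fall back to the default.
    requested="$1"
    if [ -n "$requested" ]; then
        echo "$requested"
    else
        echo "$DEFAULT_AZ"
    fi
}

pick_az ""     # no AZ in the request -> prints az1, every single time
pick_az "az2"  # an explicit AZ -> prints az2
```

Juju today effectively issues the first kind of request for every volume; the proposal is to let the operator force the second, the way `openstack volume create --availability-zone az2 ...` does on the CLI.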

It means that, in practice, Juju is always deploying to the same zone, even if VMs themselves are sitting across AZs for HA. If we lose that storage array, we lose the entire cluster.

The solution I see is a "volume-zone" parameter where I can define from which AZ Juju should pick its disks. That parameter should be supported in add-machine constraints, so it can also be used in bundles.
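Spelled out, the proposed constraint might look like the following (hypothetical syntax only; no such constraint exists in Juju today):

```
# "volume-zone" is the constraint proposed in this bug, not an existing one:
juju add-machine --constraints="root-disk=volume volume-zone=az2"
```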

Changed in juju:
status: New → Triaged
Revision history for this message
Canonical Juju QA Bot (juju-qa-bot) wrote :

This bug has not been updated in 2 years, so we're marking it Low importance. If you believe this is incorrect, please update the importance.

Changed in juju:
importance: Medium → Low
tags: added: expirebugs-bot