Multiple relations allowed between nova-compute and ceph-mon for ephemeral storage

Bug #2055403 reported by Marco Marino
Affects: OpenStack Nova Compute Charm
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

Hello,

According to the configuration options documented in [0], nova-compute can use a single Ceph cluster for ephemeral disks.

What does this mean in practice?
Let's assume that we have 2 independent Ceph clusters, each composed of 1 ceph-mon unit and 3 ceph-osd units.

Ceph Deployment 1:
$ juju status ceph-mon ceph-osd
Model Controller Cloud/Region Version SLA Timestamp
gend-k8s focal-controller stsstack/stsstack 2.9.44 unsupported 09:59:15Z

App Version Status Scale Charm Channel Rev Exposed Message
ceph-mon 15.2.17 active 1 ceph-mon octopus/stable 177 no Unit is ready and clustered
ceph-osd 15.2.17 active 3 ceph-osd octopus/stable 526 no Unit is ready (1 OSD)

Unit Workload Agent Machine Public address Ports Message
ceph-mon/0* active idle 0 10.5.1.75 Unit is ready and clustered
ceph-osd/0* active idle 1 10.5.2.54 Unit is ready (1 OSD)
ceph-osd/1 active idle 2 10.5.2.195 Unit is ready (1 OSD)
ceph-osd/2 active idle 3 10.5.2.110 Unit is ready (1 OSD)

Machine State Address Inst id Series AZ Message
0 started 10.5.1.75 54548819-4a36-40a3-b202-0e0d96ac64e1 focal nova ACTIVE
1 started 10.5.2.54 0b555c7c-1f32-4142-ad9a-34a97d9f8726 focal nova ACTIVE
2 started 10.5.2.195 beb10250-9368-4574-ac69-94dec6ec603f focal nova ACTIVE
3 started 10.5.2.110 5eda0b90-6705-4124-b0de-9807d489ca27 focal nova ACTIVE

Ceph Deployment 2:
$ juju status ceph-mon-d2 ceph-osd-d2
Model Controller Cloud/Region Version SLA Timestamp
gend-k8s focal-controller stsstack/stsstack 2.9.44 unsupported 09:59:54Z

App Version Status Scale Charm Channel Rev Exposed Message
ceph-mon-d2 15.2.17 active 1 ceph-mon octopus/stable 177 no Unit is ready and clustered
ceph-osd-d2 15.2.17 active 3 ceph-osd octopus/stable 526 no Unit is ready (1 OSD)

Unit Workload Agent Machine Public address Ports Message
ceph-mon-d2/0* active idle 21 10.5.1.25 Unit is ready and clustered
ceph-osd-d2/3* active idle 13 10.5.0.238 Unit is ready (1 OSD)
ceph-osd-d2/4 active idle 14 10.5.2.200 Unit is ready (1 OSD)
ceph-osd-d2/5 active idle 15 10.5.2.24 Unit is ready (1 OSD)

Machine State Address Inst id Series AZ Message
13 started 10.5.0.238 5e4dfc30-500c-4e02-b4e5-7787d42f51ac focal nova ACTIVE
14 started 10.5.2.200 50e41933-7f48-4819-aca3-d8d35d2e6d2c focal nova ACTIVE
15 started 10.5.2.24 27a9cc78-4f21-4f77-a70e-be000ae2a49f focal nova ACTIVE
21 started 10.5.1.25 aa6f2bd6-1424-4200-967a-0fafbb6f0f20 focal nova ACTIVE
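
For reference, two independent clusters like these can be stood up with commands along these lines (a minimal sketch: OSD storage and network options are omitted, and monitor-count=1 matches the single-mon deployments shown above):

$ juju deploy --channel octopus/stable --config monitor-count=1 ceph-mon
$ juju deploy -n 3 --channel octopus/stable ceph-osd
$ juju add-relation ceph-mon:osd ceph-osd:mon
$ juju deploy --channel octopus/stable --config monitor-count=1 ceph-mon ceph-mon-d2
$ juju deploy -n 3 --channel octopus/stable ceph-osd ceph-osd-d2
$ juju add-relation ceph-mon-d2:osd ceph-osd-d2:mon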

Also, let's assume that we have 3 nova-compute Juju units with L3 connectivity to the Ceph nodes listed above (both Ceph clusters are reachable from the nova-compute units).

The initial situation for nova-compute is:
$ juju status --relations nova-compute
Model Controller Cloud/Region Version SLA Timestamp
gend-k8s focal-controller stsstack/stsstack 2.9.44 unsupported 09:46:31Z

App Version Status Scale Charm Channel Rev Exposed Message
neutron-openvswitch 16.4.2 active 3 neutron-openvswitch ussuri/stable 526 no Unit is ready
nova-compute 21.2.4 active 3 nova-compute ussuri/stable 710 no Unit is ready

Unit Workload Agent Machine Public address Ports Message
nova-compute/0 active idle 13 10.5.0.238 Unit is ready
  neutron-openvswitch/0* active idle 10.5.0.238 Unit is ready
nova-compute/1 active idle 14 10.5.2.200 Unit is ready
  neutron-openvswitch/2 active idle 10.5.2.200 Unit is ready
nova-compute/2* active idle 15 10.5.2.24 Unit is ready
  neutron-openvswitch/1 active idle 10.5.2.24 Unit is ready

Machine State Address Inst id Series AZ Message
13 started 10.5.0.238 5e4dfc30-500c-4e02-b4e5-7787d42f51ac focal nova ACTIVE
14 started 10.5.2.200 50e41933-7f48-4819-aca3-d8d35d2e6d2c focal nova ACTIVE
15 started 10.5.2.24 27a9cc78-4f21-4f77-a70e-be000ae2a49f focal nova ACTIVE

Relation provider Requirer Interface Type Message
ceph-mon:client nova-compute:ceph ceph-client regular
cinder-ceph:ceph-access nova-compute:ceph-access cinder-ceph-key regular
glance:image-service nova-compute:image-service glance regular
neutron-api:neutron-plugin-api neutron-openvswitch:neutron-plugin-api neutron-plugin-api regular
neutron-openvswitch:neutron-plugin nova-compute:neutron-plugin neutron-plugin subordinate
nova-compute:cloud-compute nova-cloud-controller:cloud-compute nova-compute regular
nova-compute:compute-peer nova-compute:compute-peer nova peer
rabbitmq-server:amqp neutron-openvswitch:amqp rabbitmq regular
rabbitmq-server:amqp nova-compute:amqp rabbitmq regular

So, nova-compute has a "ceph" relation with ceph-mon:client.

The effect of this relation is in the content of /var/lib/charm/nova-compute/ceph.conf (on the hypervisor):
...
[global]
auth_supported = cephx
keyring = /etc/ceph/$cluster.$name.keyring
mon host = 10.5.1.25 #### <---- Ceph mon IP here
log to syslog = false
err to syslog = false
clog to syslog = false
...

With this relation in place, ephemeral disks are created through the RBD layer, specifically in the "nova" pool (see the [libvirt] section of /etc/nova/nova.conf for details).
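
For context, the charm renders something along these lines in that section of /etc/nova/nova.conf (a sketch: the option names are standard Nova settings, but the rbd_user and secret UUID shown here are deployment-specific assumptions):

[libvirt]
...
images_type = rbd
images_rbd_pool = nova
images_rbd_ceph_conf = /var/lib/charm/nova-compute/ceph.conf
rbd_user = nova-compute
rbd_secret_uuid = <deployment-specific UUID>
...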

Now, let's add a new relation of the same type:
$ juju add-relation nova-compute:ceph ceph-mon-d2:client

The command completes without any error.
The effect of this second relation is again visible in /var/lib/charm/nova-compute/ceph.conf:

[global]
auth_supported = cephx
keyring = /etc/ceph/$cluster.$name.keyring
mon host = 10.5.1.25 10.5.1.75 #### <--- Both IP addresses here!
log to syslog = false
err to syslog = false
clog to syslog = false
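
The rendered file can be inspected on any unit directly, for example:

$ juju ssh nova-compute/0 sudo cat /var/lib/charm/nova-compute/ceph.conf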

As you can see, the file now contains 2 IP addresses, belonging to the ceph-mon/0 and ceph-mon-d2/0 Juju units.
This is wrong in my opinion.
Let's assume that the first IP in the list is used by default by Nova: all ephemeral disks will be created in the "nova" pool of the cluster behind the ceph-mon Juju application.

If ceph-mon/0 goes down, the second IP in the list will be used. Even though the hypervisor can successfully establish a connection with ceph-mon-d2 (and hence ceph-osd-d2), and a "nova" pool exists there, it cannot find the ephemeral disks previously created in the other Ceph cluster.

This is normal behaviour, because all the disks were created on the other, totally independent, cluster.
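
One way to observe this directly from the hypervisor is to list the "nova" pool using the configuration the charm renders (the client id "nova-compute" is an assumption; adjust it to match your deployment):

$ rbd --id nova-compute --conf /var/lib/charm/nova-compute/ceph.conf -p nova ls

Whichever monitor the client ends up talking to determines which cluster's "nova" pool gets listed; disks created on the other cluster simply do not appear.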

In my opinion, no more than one ceph-mon:client <--> nova-compute:ceph relation should be allowed by the nova-compute charm.
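
Until the charm guards against this, the duplicate relation has to be spotted and removed by the operator, for example:

$ juju status --relations nova-compute | grep 'nova-compute:ceph '
$ juju remove-relation nova-compute:ceph ceph-mon-d2:client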

[0] https://docs.openstack.org/nova/latest/configuration/config.html

### ENVIRONMENT DETAILS ###
OpenStack Ussuri
Ceph Octopus
Ubuntu Focal
Juju 2.9.44

Regards,
Marco

Revision history for this message
Edward Hope-Morley (hopem) wrote :

Hi Marco, I believe that this behaviour is coincidental rather than intentional. If you want to use multiple Ceph clusters and/or pools, you can configure more than one Cinder storage backend using the cinder-ceph charm, relating each to the respective Ceph cluster; Cinder will then provide the volume connection details when a volume is attached.

Revision history for this message
Marco Marino (marino-mrc) wrote :

Hi Edward,
thanks for your comment.
But the problem occurs for ephemeral storage only (and only if you want to use Ceph for ephemeral disks).

As you correctly said, nova-compute has another relation for cinder-ceph:
cinder-ceph:ceph-access nova-compute:ceph-access

and it works without any problem (just tested in my lab environment). More specifically, "it works" means the following:

- you can have 2 independent Ceph clusters and use both for Cinder volumes. This is possible through the cinder-ceph charm, deployed twice with different application names, as sketched below.
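
A minimal sketch of the second backend (assuming a cinder application is already deployed; the channel and names are illustrative, and the endpoint names match the relations shown in the juju status output above):

$ juju deploy --channel ussuri/stable cinder-ceph cinder-ceph-d2
$ juju add-relation cinder-ceph-d2:storage-backend cinder:storage-backend
$ juju add-relation cinder-ceph-d2:ceph ceph-mon-d2:client
$ juju add-relation cinder-ceph-d2:ceph-access nova-compute:ceph-access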

So, just to summarize, I think the problem is:

If you have 2 independent Ceph clusters, you must choose which one to use for ephemeral disks (unless you deploy 2 different nova-compute applications), whereas you can use both clusters for Cinder volumes.

In my opinion, the nova-compute charm should avoid adding multiple relations like the following:
ceph-mon:client nova-compute:ceph
ceph-mon-d2:client nova-compute:ceph

and that's the bug I'm focused on. The nova-compute charm should notice that a ceph-mon:client relation already exists and reject further relations of the same type with other ceph-mon Juju applications.

Regards,
Marco

Changed in charm-nova-compute:
importance: Undecided → Medium
status: New → Triaged