Relation between nova-compute and ceph-mon

Bug #1811867 reported by Eric Kessels
Affects                            Status        Importance  Assigned to    Milestone
OpenStack Charms Deployment Guide  Fix Released  High        Peter Matulis  -
OpenStack Nova Compute Charm       Triaged       Undecided   Unassigned     -

Bug Description

Hi,

I have an OpenStack cluster running the Queens release, deployed with MAAS and Juju (2.4.7):

nova-compute charm = rev 291, series = bionic + xenial
ceph-mon charm = rev 32, series = xenial
cinder-ceph charm = rev 238

My compute hosts run 16.04; all is fine and happy.

I want to add a nova-compute node running 18.04. On the Nova side everything is fine: the node is added to the OpenStack cluster and cinder-ceph is running OK.

The relation between ceph-mon and nova-compute is not completing. The error I get is:

storage-backend relation's interface, ceph, is related awaiting the following data from the relationship: auth, key.

The secrets (Ceph key) are not placed on the nova-compute host, and they are not added to virsh.
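
For reference, here is how the raw relation data can be dumped from the nova-compute side, assuming Juju 2.x syntax (the relation ID ceph:172 and the unit names are taken from the logs below):

    # Show everything ceph-mon/1 has published on the ceph:172 relation,
    # as seen from the affected nova-compute unit. 'auth' and 'key' are
    # the fields the charm reports as missing.
    juju run --unit nova-compute-bionic/1 'relation-get -r ceph:172 - ceph-mon/1'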

1. What could be the problem?
2. Is this configuration possible (mixing 18.04 and 16.04 nova-compute hosts)?

Juju log on the ceph-mon:

2019-01-15 16:08:40 INFO juju.worker.uniter.operation runhook.go:135 ran "client-relation-changed" hook
2019-01-15 16:08:40 DEBUG juju.worker.uniter.operation executor.go:90 committing operation "run relation-changed (172; nova-compute-bionic/1) hook"
2019-01-15 16:08:40 DEBUG juju.machinelock machinelock.go:180 machine lock released for uniter (run relation-changed (172; nova-compute-bionic/1) hook)
2019-01-15 16:08:40 DEBUG juju.worker.uniter.operation executor.go:79 lock released
2019-01-15 16:08:40 DEBUG juju.worker.uniter resolver.go:123 no operations in progress; waiting for changes
2019-01-15 16:08:40 DEBUG juju.worker.uniter agent.go:17 [AGENT-STATUS] idle:
2019-01-15 16:11:36 DEBUG juju.worker.uniter.remotestate watcher.go:510 update status timer triggered
2019-01-15 16:11:36 DEBUG juju.worker.uniter resolver.go:123 no operations in progress; waiting for changes
2019-01-15 16:11:36 DEBUG juju.worker.uniter.operation executor.go:59 running operation run update-status hook
2019-01-15 16:11:36 DEBUG juju.machinelock machinelock.go:156 acquire machine lock for uniter (run update-status hook)
2019-01-15 16:11:36 DEBUG juju.machinelock machinelock.go:166 machine lock acquired for uniter (run update-status hook)
2019-01-15 16:11:36 DEBUG juju.worker.uniter.operation executor.go:90 preparing operation "run update-status hook"
2019-01-15 16:11:36 DEBUG juju.worker.uniter.operation executor.go:90 executing operation "run update-status hook"
2019-01-15 16:11:37 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:11:37 DEBUG juju-log Hardening function 'update_status'
2019-01-15 16:11:37 DEBUG worker.uniter.jujuc server.go:181 running hook tool "config-get"
2019-01-15 16:11:37 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:11:37 DEBUG juju-log No hardening applied to 'update_status'
2019-01-15 16:11:37 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:11:37 INFO juju-log Updating status.
2019-01-15 16:11:38 DEBUG worker.uniter.jujuc server.go:181 running hook tool "application-version-set"
2019-01-15 16:11:38 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-ids"
2019-01-15 16:11:38 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:11:38 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-list"
2019-01-15 16:11:38 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:11:38 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:11:38 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:11:38 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-ids"
2019-01-15 16:11:38 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:11:38 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-list"
2019-01-15 16:11:38 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:11:38 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"

Juju log on the nova-compute:

2019-01-15 16:09:20 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:20 DEBUG juju-log ceph:172: adding section 'DEFAULT'
2019-01-15 16:09:20 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:20 DEBUG juju-log ceph:172: 1 section(s) found
2019-01-15 16:09:20 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:09:20 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:09:20 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:09:20 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-ids"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "unit-get"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-ids"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "network-get"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "relation-get"
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:21 DEBUG juju-log ceph:172: Generating template context for cloud-credentials
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:21 DEBUG juju-log ceph:172: Generating template context for ceph
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:21 INFO juju-log ceph:172: Missing required data: auth key
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:21 DEBUG juju-log ceph:172: Generating template context for ceph
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:21 INFO juju-log ceph:172: Missing required data: auth key
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:21 INFO juju-log ceph:172: ceph relation incomplete. Peer not ready?
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:21 DEBUG juju-log ceph:172: Generating template context for ceph
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:21 INFO juju-log ceph:172: Missing required data: auth key
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:21 DEBUG juju-log ceph:172: Generating template context for ceph
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:21 INFO juju-log ceph:172: Missing required data: auth key
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "juju-log"
2019-01-15 16:09:21 INFO juju-log ceph:172: storage-backend relation's interface, ceph, is related awaiting the following data from the relationship: auth, key.
2019-01-15 16:09:21 DEBUG worker.uniter.jujuc server.go:181 running hook tool "status-set"
2019-01-15 16:09:22 DEBUG worker.uniter.jujuc server.go:181 running hook tool "application-version-set"
2019-01-15 16:09:23 INFO juju.worker.uniter.operation runhook.go:135 ran "ceph-relation-changed" hook
2019-01-15 16:09:23 DEBUG juju.worker.uniter.operation executor.go:90 committing operation "run relation-changed (172; ceph-mon/1) hook"
2019-01-15 16:09:23 DEBUG juju.machinelock machinelock.go:180 machine lock released for uniter (run relation-changed (172; ceph-mon/1) hook)
2019-01-15 16:09:23 DEBUG juju.worker.uniter.operation executor.go:79 lock released
2019-01-15 16:09:23 DEBUG juju.worker.uniter resolver.go:123 no operations in progress; waiting for changes
2019-01-15 16:09:23 DEBUG juju.worker.uniter agent.go:17 [AGENT-STATUS] idle:
2019-01-15 16:09:23 DEBUG juju.worker.uniter resolver.go:123 no operations in progress; waiting for changes

Please advise.

Eric

Revision history for this message
Drew Freiberger (afreiberger) wrote :

This is ultimately caused by having a new ceph-mon charm that supports expected-osd-count while your ceph-osd charm is either not reporting, or not correctly reporting, the number of OSDs bootstrapped. This prevents the ceph-mon charm from being ready to hand out auth tokens.

Most likely, you just need to upgrade the ceph-osd charm(s) in your environment to 18.11 so that ceph-mon un-sticks itself. Or, if you already have 18.11 charms, count your OSDs and set expected-osd-count to that number.
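
For example, a sketch assuming the applications are named ceph-osd and ceph-mon and there are 12 OSDs in total:

    # Upgrade the ceph-osd charm to the latest stable revision (Juju 2.x)
    juju upgrade-charm ceph-osd

    # ...or, if already on 18.11 charms, tell ceph-mon how many OSDs to expect
    juju config ceph-mon expected-osd-count=12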

You can iterate over the ceph-osd relations on your mons and run relation-get -r <rid> bootstrapped-osds ceph-osd/X for each unit. The value should match the "(XX OSDs)" output in juju status for that unit. Added together, the values must reach expected-osd-count or higher for ceph-mon to properly bootstrap and hand out credentials.
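
A rough version of that check, run as a loop on a ceph-mon unit (the osd endpoint name and the use of juju run to reach the hook tools are assumptions based on the stock charms and Juju 2.x):

    # Sum bootstrapped-osds over every ceph-osd unit on each osd relation;
    # the total must be >= expected-osd-count for ceph-mon to bootstrap.
    juju run --unit ceph-mon/0 '
        total=0
        for rid in $(relation-ids osd); do
            for unit in $(relation-list -r "$rid"); do
                n=$(relation-get -r "$rid" bootstrapped-osds "$unit")
                total=$((total + ${n:-0}))
            done
        done
        echo "bootstrapped OSDs: $total"'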

tags: added: canonical-bootstack
Revision history for this message
Ryan Beisner (1chb1n) wrote :

The ceph-mon and ceph-osd charms should be updated to the latest stable charm revisions before taking on a migration or payload upgrade task. We should make the docs explicit about that.

FWIW, that is the case for all migrations, payload upgrades, and series upgrades: we expect all charms to be on the latest stable revision.

Do we need to add further clarity to the procedure, regarding osd count?

https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-ceph-migration.html

Revision history for this message
Ryan Beisner (1chb1n) wrote :

Regarding question 2: Is this configuration possible (mixing 18.04 and 16.04 nova-compute hosts)?

It is not advisable or recommended. That condition will naturally exist while a cloud is being upgraded across series from Xenial to Bionic, but it shouldn't be an intended or extended state.

On the whole, a mixed-series OpenStack Charms deployment is not advisable. We should also update the docs to be explicit about this.

https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-series-upgrade.html

Changed in charm-deployment-guide:
importance: Undecided → High
Changed in charm-deployment-guide:
status: New → Triaged
Changed in charm-nova-compute:
status: New → Triaged
Changed in charm-deployment-guide:
assignee: nobody → Peter Matulis (petermatulis)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-deployment-guide (master)

Fix proposed to branch: master
Review: https://review.opendev.org/693424

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-deployment-guide (master)

Reviewed: https://review.opendev.org/693424
Committed: https://git.openstack.org/cgit/openstack/charm-deployment-guide/commit/?id=c74ce0fbdf81996331d7bddd259e15492ce7ee1d
Submitter: Zuul
Branch: master

commit c74ce0fbdf81996331d7bddd259e15492ce7ee1d
Author: Peter Matulis <email address hidden>
Date: Tue Nov 5 16:31:19 2019 -0500

    No mixing of charm releases nor series versions

    I realise now that 'Charm upgrades' should be
    on its own page as they can be done independently
    of an OpenStack upgrade. A future PR may address
    this.

    Closes-Bug: #1811867

    Change-Id: I6c9520ba2b81a358a66bdb0be92410c90dfaccf2

Changed in charm-deployment-guide:
status: In Progress → Fix Released