Issue creating keyring.rados.gateway when deploying OpenStack

Bug #1778266 reported by Simon Monette
This bug affects 3 people
Affects: Ceph RADOS Gateway Charm
Status: Incomplete
Importance: High
Assigned to: Ryan Beisner
Milestone: 18.11

Bug Description

When we deploy our OpenStack bundle with a Ceph cluster in it, most of the time the Ceph charms end up in an error state.

After looking at the logs, it seems ceph-radosgw tries to create its pools in ceph-mon, but it does so before the other OpenStack services can create theirs, which results in the following error.

The logs on the ceph-radosgw units are filled with these error messages:
2018-05-30 21:09:25.739133 7f48374f4e80 0 pidfile_write: ignore empty --pid-file
2018-05-30 21:09:25.748857 7f48374f4e80 -1 auth: unable to find a keyring on /etc/ceph/keyring.rados.gateway: (2) No such file or directory

The file "/etc/ceph/keyring.rados.gateway" is indeed not present.

Also, on ceph-mon no other pools get created:
# juju ssh ceph-mon/0 sudo ceph df
GLOBAL:
    SIZE AVAIL RAW USED %RAW USED
    488G 488G 325M 0.07
POOLS:
    NAME ID USED %USED MAX AVAIL OBJECTS
    default.rgw.buckets 1 0 0 154G 0

The only time the deployment worked was when the ceph-radosgw units took more time to deploy than the other charms.

We then tried to deploy the same bundle, but without adding the ceph-radosgw units to the machines; we waited for the cinder pools to be created in ceph-mon and only then added the ceph-radosgw units to the machines.

This time ceph-radosgw did not have any issue creating its keyring, and all pools were visible in ceph-mon:
# juju ssh ceph-mon/0 sudo ceph df
GLOBAL:
    SIZE AVAIL RAW USED %RAW USED
    488G 487G 1223M 0.24
POOLS:
    NAME ID USED %USED MAX AVAIL OBJECTS
    cinder-backup 1 0 0 154G 0
    cinder 2 19 0 154G 3
    cinder-ceph 3 19 0 154G 3
    glance 4 278M 0.18 154G 41
    default.rgw.buckets 5 0 0 154G 0
    default.rgw 6 0 0 154G 0
    default.rgw.root 7 0 0 154G 0
    default.rgw.control 8 0 0 154G 8
    default.rgw.gc 9 0 0 154G 0
    default.rgw.buckets.index 10 0 0 154G 0
    default.rgw.buckets.extra 11 0 0 154G 0
    default.log 12 0 0 154G 0
    default.intent-log 13 0 0 154G 0
    default.usage 14 0 0 154G 0
    default.users 15 0 0 154G 0
    default.users.email 16 0 0 154G 0
    default.users.swift 17 0 0 154G 0
    default.users.uid 18 0 0 154G 0
    .rgw.root 19 1113 0 154G 4
    default.rgw.meta 20 0 0 154G 0
    default.rgw.log 21 0 0 154G 175
    gnocchi 22 30624k 0.02 154G 3188

From there, the OpenStack deployment continued and worked without any issue.
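
For reference, the workaround amounts to roughly the following sequence (a rough sketch, not the exact commands we ran; the bundle file name and target machine number are placeholders):
# juju deploy ./bundle-without-radosgw.yaml
# juju ssh ceph-mon/0 sudo ceph df    (repeat until the cinder/glance pools appear)
# juju deploy ceph-radosgw --to 3
# juju add-relation ceph-radosgw ceph-mon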

Maybe there is a race condition in how the ceph charms handle their relations.

Environment:
All the latest released charms on Xenial/Queens:
App Rev
ceilometer 253
ceilometer-agent 244
ceph-mon 25
ceph-osd 262
ceph-radosgw 258
cinder 272
cinder-backup 13
cinder-ceph 233
glance 265
gnocchi 8
hacluster 46
heat 252
keystone 281
memcached 21
mysql 266
neutron-api 260
neutron-gateway 252
neutron-openvswitch 250
nova-cloud-controller 310
nova-compute 284
ntp 24
ntpmaster 7
openstack-dashboard 259
rabbitmq-server 74

Tags: cpe-onsite
description: updated
information type: Public → Private
information type: Private → Public
tags: added: cpe-onsite
Changed in charm-ceph-radosgw:
status: New → Triaged
importance: Undecided → High
Revision history for this message
Ryan Beisner (1chb1n) wrote:

Not enough info to leave in 'triaged' state, as the reproducer is not confirmed and there is no defined path forward here yet.

Changed in charm-ceph-radosgw:
status: Triaged → Incomplete
Ryan Beisner (1chb1n)
Changed in charm-ceph-radosgw:
milestone: none → 18.11
assignee: nobody → Ryan Beisner (1chb1n)