Issue creating keyring.rados.gateway when deploying OpenStack

Bug #1778266 reported by Simon Monette on 2018-06-22
This bug affects 3 people
Affects: OpenStack ceph-radosgw charm
Status: Incomplete
Importance: High
Assigned to: Ryan Beisner (1chb1n)
Milestone: 18.11

Bug Description

When we deploy our OpenStack bundle with a Ceph cluster in it, most of the time the ceph charms end up in an error state.

After looking at the logs, it seems ceph-radosgw tries to create its pools in ceph-mon, but it does so before the other OpenStack services can create theirs, which results in the following error.

The logs on the ceph-radosgw units are filled with these error messages:
2018-05-30 21:09:25.739133 7f48374f4e80 0 pidfile_write: ignore empty --pid-file
2018-05-30 21:09:25.748857 7f48374f4e80 -1 auth: unable to find a keyring on /etc/ceph/keyring.rados.gateway: (2) No such file or directory

The file "/etc/ceph/keyring.rados.gateway" is indeed not present.
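
To confirm the key never reached the unit, checks along these lines can be run (grepping for radosgw should show the client.radosgw.gateway entity if the key was ever issued; that entity name is an assumption based on the keyring file name and may differ between charm revisions):
# juju ssh ceph-radosgw/0 ls -l /etc/ceph/keyring.rados.gateway
# juju ssh ceph-mon/0 sudo ceph auth list | grep -A 2 radosgw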

Also, on ceph-mon no other pools get created:
# juju ssh ceph-mon/0 sudo ceph df
GLOBAL:
    SIZE AVAIL RAW USED %RAW USED
    488G 488G 325M 0.07
POOLS:
    NAME ID USED %USED MAX AVAIL OBJECTS
    default.rgw.buckets 1 0 0 154G 0

The only time the deployment worked was when the ceph-radosgw units happened to take longer to deploy than the other charms.

We then deployed the same bundle without adding the ceph-radosgw units to the machines, waited for the cinder pools to be created in ceph-mon, and then added the ceph-radosgw units to the machines.

This time ceph-radosgw did not have any issue creating its keyring, and all pools were visible in ceph-mon:
# juju ssh ceph-mon/0 sudo ceph df
GLOBAL:
    SIZE AVAIL RAW USED %RAW USED
    488G 487G 1223M 0.24
POOLS:
    NAME ID USED %USED MAX AVAIL OBJECTS
    cinder-backup 1 0 0 154G 0
    cinder 2 19 0 154G 3
    cinder-ceph 3 19 0 154G 3
    glance 4 278M 0.18 154G 41
    default.rgw.buckets 5 0 0 154G 0
    default.rgw 6 0 0 154G 0
    default.rgw.root 7 0 0 154G 0
    default.rgw.control 8 0 0 154G 8
    default.rgw.gc 9 0 0 154G 0
    default.rgw.buckets.index 10 0 0 154G 0
    default.rgw.buckets.extra 11 0 0 154G 0
    default.log 12 0 0 154G 0
    default.intent-log 13 0 0 154G 0
    default.usage 14 0 0 154G 0
    default.users 15 0 0 154G 0
    default.users.email 16 0 0 154G 0
    default.users.swift 17 0 0 154G 0
    default.users.uid 18 0 0 154G 0
    .rgw.root 19 1113 0 154G 4
    default.rgw.meta 20 0 0 154G 0
    default.rgw.log 21 0 0 154G 175
    gnocchi 22 30624k 0.02 154G 3188

From there the OpenStack deployment continued and completed without any issue.
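
For anyone hitting the same issue, the workaround boils down to roughly this sequence (machine numbers and the exact set of relations depend on the bundle; these are placeholders for our environment):

First deploy the bundle with ceph-radosgw left out:
# juju deploy ./bundle.yaml

Then poll ceph-mon until the cinder and glance pools appear:
# juju ssh ceph-mon/0 sudo ceph df

Only then deploy ceph-radosgw, place its units on the machines and relate it:
# juju deploy ceph-radosgw -n 3 --to 4,5,6
# juju add-relation ceph-radosgw ceph-mon
# juju add-relation ceph-radosgw keystone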

There may be a race condition in how the ceph charms handle their relations.
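
If it helps with triage: the mon relation data can be inspected from a stuck unit to see whether ceph-mon ever handed the key over (the mon relation name and the radosgw_key setting are assumptions from reading the charms and may differ between revisions):
# juju run --unit ceph-radosgw/0 'relation-ids mon'
# juju run --unit ceph-radosgw/0 'relation-get -r <relation-id> radosgw_key ceph-mon/0'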

Environment:
All the latest released charms on Xenial/Queens:
App Rev
ceilometer 253
ceilometer-agent 244
ceph-mon 25
ceph-osd 262
ceph-radosgw 258
cinder 272
cinder-backup 13
cinder-ceph 233
glance 265
gnocchi 8
hacluster 46
heat 252
keystone 281
memcached 21
mysql 266
neutron-api 260
neutron-gateway 252
neutron-openvswitch 250
nova-cloud-controller 310
nova-compute 284
ntp 24
ntpmaster 7
openstack-dashboard 259
rabbitmq-server 74

description: updated
information type: Public → Private
information type: Private → Public
tags: added: cpe-onsite
Changed in charm-ceph-radosgw:
status: New → Triaged
importance: Undecided → High
Ryan Beisner (1chb1n) wrote:

Not enough info to leave in 'triaged' state, as the reproducer is not confirmed and there is no defined path forward here yet.

Changed in charm-ceph-radosgw:
status: Triaged → Incomplete
Ryan Beisner (1chb1n) on 2018-10-03
Changed in charm-ceph-radosgw:
milestone: none → 18.11
assignee: nobody → Ryan Beisner (1chb1n)