2018-06-22 19:41:04
Simon Monette
Description:
When we deploy our OpenStack bundle with a Ceph cluster in it, the Ceph charms end up in an error state most of the time.
After looking at the logs, it seems ceph-radosgw creates its pools in ceph-mon before the other OpenStack services can create theirs, which results in the following error.
The logs on the ceph-radosgw units are filled with these error messages:
2018-05-30 21:09:25.739133 7f48374f4e80 0 pidfile_write: ignore empty --pid-file
2018-05-30 21:09:25.748857 7f48374f4e80 -1 auth: unable to find a keyring on /etc/ceph/keyring.rados.gateway: (2) No such file or directory
The file "/etc/ceph/keyring.rados.gateway" is indeed not present.
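The missing keyring can be confirmed directly on an affected unit (the unit number below is just an example):
# juju status ceph-radosgw
# juju ssh ceph-radosgw/0 ls -l /etc/ceph/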
Also, no other pools get created on ceph-mon:
# juju ssh ceph-mon/0 sudo ceph df
GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED
    488G     488G      325M         0.07
POOLS:
    NAME                    ID     USED     %USED     MAX AVAIL     OBJECTS
    default.rgw.buckets     1      0        0         154G          0
The only time the deployment worked was when the ceph-radosgw units happened to take more time than the other charms.
We then tried deploying the same bundle without adding the ceph-radosgw units to the machines, waited for the cinder pools to be created in ceph-mon, and only then added the ceph-radosgw units to the machines (the command sequence is sketched at the end of this description).
This time ceph-radosgw did not have any issue creating its keyring, and all pools were visible in ceph-mon:
# juju ssh ceph-mon/0 sudo ceph df
GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED
    488G     487G      1223M        0.24
POOLS:
    NAME                          ID     USED       %USED     MAX AVAIL     OBJECTS
    cinder-backup                 1      0          0         154G          0
    cinder                        2      19         0         154G          3
    cinder-ceph                   3      19         0         154G          3
    glance                        4      278M       0.18      154G          41
    default.rgw.buckets           5      0          0         154G          0
    default.rgw                   6      0          0         154G          0
    default.rgw.root              7      0          0         154G          0
    default.rgw.control           8      0          0         154G          8
    default.rgw.gc                9      0          0         154G          0
    default.rgw.buckets.index     10     0          0         154G          0
    default.rgw.buckets.extra     11     0          0         154G          0
    default.log                   12     0          0         154G          0
    default.intent-log            13     0          0         154G          0
    default.usage                 14     0          0         154G          0
    default.users                 15     0          0         154G          0
    default.users.email           16     0          0         154G          0
    default.users.swift           17     0          0         154G          0
    default.users.uid             18     0          0         154G          0
    .rgw.root                     19     1113       0         154G          4
    default.rgw.meta              20     0          0         154G          0
    default.rgw.log               21     0          0         154G          175
    gnocchi                       22     30624k     0.02      154G          3188
From there the OpenStack deployment continued and completed without any issue.
There may be a race condition in how the Ceph charms handle their relations.
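For reference, the workaround amounts to roughly the following sequence (a sketch only: the bundle file name and machine number are placeholders, and how the ceph-radosgw units are held back depends on how the bundle places them):
# juju deploy ./openstack-bundle.yaml
(wait until the cinder/glance pools show up on ceph-mon)
# juju ssh ceph-mon/0 sudo ceph df
# juju add-unit ceph-radosgw --to 1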
Environment:
All the latest released charms on Xenial/Queens:
App Rev
ceilometer 253
ceilometer-agent 244
ceph-mon 25
ceph-osd 262
ceph-radosgw 258
cinder 272
cinder-backup 13
cinder-ceph 233
glance 265
gnocchi 8
hacluster 46
heat 252
keystone 281
memcached 21
mysql 266
neutron-api 260
neutron-gateway 252
neutron-openvswitch 250
nova-cloud-controller 310
nova-compute 284
ntp 24
ntpmaster 7
openstack-dashboard 259
rabbitmq-server 74