When deploying TripleO with Ceph Luminous, the deployment fails at step 4 because the Ceph metrics pool was not created.
(undercloud) [stack@undercloud74 ~]$ echo -e `heat deployment-show 37bc65f5-6986-4f27-b583-986333b648a4`|grep -i error
WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:344: SubjectAltNameWarning: Certificate for 192.168.0.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
SubjectAltNameWarning
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:344: SubjectAltNameWarning: Certificate for 192.168.0.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
SubjectAltNameWarning
\"Error running ['docker', 'run', '--name', 'gnocchi_db_sync', '--label', 'config_id=tripleo_step4', '--label', 'container_name=gnocchi_db_sync', '--label', 'managed_by=paunch', '--label', 'config_data={\\"environment\\": [\\"KOLLA_CONFIG_STRATEGY=COPY_ALWAYS\\", \\"TRIPLEO_CONFIG_HASH=1a569d012dc804939398b671bf257703\\"], \\"user\\": \\"root\\", \\"volumes\\": [\\"/etc/hosts:/etc/hosts:ro\\", \\"/etc/localtime:/etc/localtime:ro\\", \\"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro\\", \\"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro\\", \\"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro\\", \\"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro\\", \\"/dev/log:/dev/log\\", \\"/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro\\", \\"/etc/puppet:/etc/puppet:ro\\", \\"/var/lib/kolla/config_files/gnocchi_db_sync.json:/var/lib/kolla/config_files/config.json:ro\\", \\"/var/lib/config-data/puppet-generated/gnocchi/:/var/lib/kolla/config_files/src:ro\\", \\"/var/log/containers/gnocchi:/var/log/gnocchi\\", \\"/var/log/containers/httpd/gnocchi-api:/var/log/httpd\\", \\"/etc/ceph:/var/lib/kolla/config_files/src-ceph:ro\\"], \\"image\\": \\"192.168.0.1:8787/rhosp13/openstack-gnocchi-api:13.0-20180112.1\\", \\"detach\\": false, \\"net\\": \\"host\\", \\"privileged\\": false}', '--env=KOLLA_CONFIG_STRATEGY=COPY_ALWAYS', '--env=TRIPLEO_CONFIG_HASH=1a569d012dc804939398b671bf257703', '--net=host', '--privileged=false', '--user=root', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume=/dev/log:/dev/log', '--volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro', 
'--volume=/etc/puppet:/etc/puppet:ro', '--volume=/var/lib/kolla/config_files/gnocchi_db_sync.json:/var/lib/kolla/config_files/config.json:ro', '--volume=/var/lib/config-data/puppet-generated/gnocchi/:/var/lib/kolla/config_files/src:ro', '--volume=/var/log/containers/gnocchi:/var/log/gnocchi', '--volume=/var/log/containers/httpd/gnocchi-api:/var/log/httpd', '--volume=/etc/ceph:/var/lib/kolla/config_files/src-ceph:ro', '192.168.0.1:8787/rhosp13/openstack-gnocchi-api:13.0-20180112.1']. [1]\",
\"ObjectNotFound: error opening pool 'metrics'\",
(undercloud) [stack@undercloud74 ~]$
Root Cause:
The pools were not created, and Ansible [1] returned the following message from Ceph:
"Error ERANGE: pg_num 128 size 3 would mean 768 total pgs, which exceeds max 600 (mon_max_pg_per_osd 200 * num_in_osds 3)"
The workaround is to change any of the three settings above (pg_num, size, or mon_max_pg_per_osd) so that the check in the following function passes when all seven pools that are created for OpenStack by default exist:
https://github.com/ceph/ceph/blob/e59258943bcfe3e52d40a59ff30df55e1e6a3865/src/mon/OSDMonitor.cc#L5670-L5698
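To make the numbers in the error message easier to follow, here is a rough Python model of that check. It is only a sketch, not the Ceph code: it assumes every pool uses the same pg_num and size, and the function names are made up for illustration.

```python
def total_pgs(pools: int, pg_num: int, size: int) -> int:
    """PG replicas across all pools, assuming identical pg_num and size."""
    return pools * pg_num * size


def pg_limit(mon_max_pg_per_osd: int, num_in_osds: int) -> int:
    """Cluster-wide ceiling, as printed in the error message."""
    return mon_max_pg_per_osd * num_in_osds


def pools_fit(pools: int, pg_num: int, size: int,
              mon_max_pg_per_osd: int, num_in_osds: int) -> bool:
    """True if creating this many pools stays within the ceiling."""
    return total_pgs(pools, pg_num, size) <= pg_limit(mon_max_pg_per_osd, num_in_osds)


# Values from the error: pg_num 128, size 3, mon_max_pg_per_osd 200, 3 OSDs.
print(pools_fit(1, 128, 3, 200, 3))   # 384 vs 600
print(pools_fit(2, 128, 3, 200, 3))   # 768 vs 600, the numbers in the error
print(pools_fit(7, 128, 3, 200, 3))   # 2688 vs 600
print(pools_fit(7, 128, 3, 3072, 3))  # 2688 vs 9216
```

Under these assumptions the first pool fits, the second already exceeds the limit, and all seven pools fit once mon_max_pg_per_osd is raised to 3072.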
This is new in Queens because it uses Luminous, which introduced the above check. The problem is that EVERY Queens deployment that doesn't override the defaults will have this problem.
Here's one workaround which satisfies the function above:
parameter_defaults:
  CephPoolDefaultSize: 3
  CephPoolDefaultPgNum: 128
  CephConfigOverrides:
    mon_max_pg_per_osd: 3072
In the above case I increased mon_max_pg_per_osd to 3072 (3 * 1024), the next multiple of 1024 above 128 * 3 * 7 = 2688.
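For anyone who wants a tighter value, the smallest mon_max_pg_per_osd that would pass can be estimated as below. This is a sketch under the same simplification (all pools share one pg_num and size, and the limit is mon_max_pg_per_osd * num_in_osds as in the error message); the helper name is hypothetical.

```python
def min_mon_max_pg_per_osd(pools: int, pg_num: int, size: int, num_in_osds: int) -> int:
    """Smallest per-OSD limit whose cluster-wide ceiling covers all pools."""
    total = pools * pg_num * size
    # Integer ceiling division, so the ceiling is never below the total.
    return -(-total // num_in_osds)


# Seven default pools, pg_num 128, size 3, on the three OSDs from the error.
print(min_mon_max_pg_per_osd(7, 128, 3, 3))  # 896
```

So 3072 leaves plenty of headroom on three OSDs; lowering CephPoolDefaultPgNum is another way to satisfy the same check.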
[1] grep Error /var/log/mistral/ceph-install-workflow.log | grep 128