The CephClientKey is overwritten when deploying from the GUI

Bug #1664521 reported by Udi Kalifon
This bug affects 1 person
Affects: tripleo
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

Deployments with external Ceph result in a broken overcloud, because the CephClientKey that the user provides via the customized templates is overwritten by a randomly generated one.

Steps to Reproduce:
1. Upload templates with a customized puppet-ceph-external.yaml
2. Deploy an overcloud with controllers, computes, and the following environment files:
overcloud-resource-registry-puppet.yaml
environments/puppet-pacemaker.yaml
environments/puppet-ceph-external.yaml
3. After deployment succeeds, log in to one of the controllers and check the ceph client key (it is found in /etc/ceph/ceph.client.qe.keyring)

=> The key is wrong. You won't be able to create glance images, for example.
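The check in step 3 can be scripted. The following is only a sketch: it assumes the keyring uses the usual INI-like layout (a `[client.qe]` section with a `key = ...` line), that the entity name matches the file name above, and `read_keyring_key` is a hypothetical helper, not part of any Ceph or TripleO tooling. The sample text and expected key are placeholders.

```python
def read_keyring_key(text, entity):
    """Return the 'key' value from the [entity] section of keyring text, or None."""
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        # Split on the first '=' only: base64 keys usually end with '=' padding.
        name, sep, value = line.partition("=")
        if sep and section == entity and name.strip() == "key":
            return value.strip()
    return None


# Placeholder contents; on a controller you would read
# /etc/ceph/ceph.client.qe.keyring instead.
sample = "[client.qe]\n\tkey = EXPECTEDKEY==\n"
expected = "EXPECTEDKEY=="  # the key set in the customized puppet-ceph-external.yaml

found = read_keyring_key(sample, "client.qe")
print("match" if found == expected else "MISMATCH")
```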

Revision history for this message
Julie Pichon (jpichon) wrote :

Thank you for the bug report. Were you able to try from the CLI as well, by any chance? As far as I can tell the UI doesn't do anything special with that key or any key in particular. All the logic should be happening in tripleo-common, and should apply to the other keys/passwords as well. Is CephClientKey the only key/password you've noticed being overwritten?

tags: added: tripleo-common
Changed in tripleo:
status: New → Triaged
importance: Undecided → High
Revision history for this message
Julie Pichon (jpichon) wrote :

Could you paste/attach the result of "mistral environment-get <plan name>" for this deployment?

Revision history for this message
Julie Pichon (jpichon) wrote :

I was able to reproduce this in a Newton environment (haven't tried on Ocata yet).

I tried both from the GUI and from the CLI ( openstack overcloud deploy --compute-scale 1 --control-scale 1 -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-ceph-external.yaml --templates ). This only failed for the GUI.

In the GUI case, the Mistral environment didn't contain the CephClientKey under "parameter_defaults", only the generated one under "passwords". The password I set manually was shown correctly under "parameter_defaults" for the CLI deployment.
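To see where a parameter ends up, the variables from the Mistral environment can be inspected along these lines. This is a rough sketch: `locate_parameter` is a hypothetical helper, and the JSON below is a trimmed placeholder that only mimics the sections mentioned in this report.

```python
import json


def locate_parameter(variables, name):
    """List which of the two sections of the environment variables define `name`."""
    return [
        section
        for section in ("parameter_defaults", "passwords")
        if name in (variables.get(section) or {})
    ]


# Placeholder data shaped like the GUI case: no parameter_defaults entry.
variables = json.loads("""
{
  "passwords": {"CephClientKey": "GENERATED-PLACEHOLDER"},
  "environments": [{"path": "environments/puppet-ceph-external.yaml"}]
}
""")

print(locate_parameter(variables, "CephClientKey"))  # ['passwords']
```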

Revision history for this message
Julie Pichon (jpichon) wrote :

Maybe scratch my comment about it not showing in parameter_defaults: judging by older Mistral environments from deployments that worked with other environment files, that might be expected. "path": "environments/puppet-ceph-external.yaml" does show up under "environments".

Revision history for this message
Julie Pichon (jpichon) wrote :

My Ocata environment is a bit unstable at the moment and deployments seem to fail before the client can be set on the controller. However, I suspect we'll see similar issues, because when I enable the External Ceph environment and then go to Parameters, the CephClientKey etc. shown come from the generated "passwords" list, not from what was set in the YAML file.

Revision history for this message
Udi Kalifon (ukalifon) wrote :

As you already mentioned, this only fails when you try from the GUI. The exact same templates deploy correctly from the CLI. My CephClientKey in the templates is different from what you see in the output of 'mistral environment-get'. I didn't notice any other keys overwritten, and we also don't have test cases that cover it.

+-------------+---------------------------------------------------------------------+
| Field | Value |
+-------------+---------------------------------------------------------------------+
| Name | extceph |
| Description | <none> |
| Variables | { |
| | "passwords": { |
| | "CephClusterFSID": "48bfb3fa-f35e-11e6-81f7-525400b8201c", |
| | "CephMonKey": "AQB/GqRYAAAAABAA5Z9ExdTVWPtUxIlf5EJfgw==", |
| | "NeutronPassword": "rtrjmePktB2uCRQjzKctcqbC2", |
| | "HeatPassword": "q2PhCREVMAd8NTrt99ypX2y6z", |
| | "SnmpdReadonlyUserPassword": "c4885c6b71f815068b9617b773a4d |
| | "GlancePassword": "sfyYxQQPRwfBaGuY7kEghArfU", |
| | "IronicPassword": "kQW2ZXtGCqxMpamv9DnU9eVhW", |
| | "HeatStackDomainAdminPassword": "rWTvVXh8zKAhByDGyxHVtKxEN" |
| | "ZaqarPassword": "K62yBdTgGhveQjDyEJBfYpA8x", |
| | "MysqlClustercheckPassword": "WpdX3Xrrd6uykGFWUpsrWamem", |
| | "GnocchiPassword": "bhQpfxgXtbVq9ywCmsmbbv88T", |
| | "CephAdminKey": "AQB/GqRYAAAAABAAO3vBKwhyY9DdFgrvyq9ybQ==", |
| | "CeilometerPassword": "4mHgyUZmJ6TpJUtUnxcYuvYJG", |
| | "CinderPassword": "uhXhpNCCUBwKBMYqH2cMTZ2Y2", |
| | "RabbitPassword": "3UjZqfBtrfjCRaFUDsYK4wEQs", |
| | "CephRgwKey": "AQB/GqRYAAAAABAAEXlbwX+n/Or3JS6nXzPJQg==", |
| | "TrovePassword": "jmHfZ3FE6X8EEY8DTZaj4dhmH", |
| | "AodhPassword": "GZDHu49gqPhEN3pKHVdDvcWFn", |
| | "KeystoneCredential0": "upkK49LbdUixDNjoMTzjhFrwqLiOFcox4Sl |
| | "KeystoneCredential1": "9iPRaE5g0INAGDa9JYIDCDjM4T7S-OU4vEs |
| | "CephClientKey": "AQB/GqRYAAAAABAAid7+yL6kjBdibN/D+N1qFw==" |
| | "NovaPassword": "zdeJDvjt3JEFDqhD4CkkGNfVs", |
| | "AdminToken": "rGcwFtW9W9FRQVqV8gVC8m6K4", |
| | "RedisPassword": "sGGgeZyMDJ7Z7wmxc3dbpEDdU", |
| | "MistralPassword": "TEBFbGhKk8mMeeunpw22DfQze", |
| | "SwiftHashSuffix": "Vn7MGfzVDCuxYPvEjYpMDsrGn", |
| | "AdminPa...


Julie Pichon (jpichon)
Changed in tripleo:
assignee: nobody → Julie Pichon (jpichon)
milestone: none → ocata-rc1
Revision history for this message
Julie Pichon (jpichon) wrote :

Playing with auto-generated values a bit more, this definitely has something to do with "passwords" in the mistral environment.

If a value is set in "passwords", no matter the type, and no matter which environment template, the value in "passwords" will take precedence over what is in the template. Other values in that file like CephExternalMonHost are picked up correctly, but both CephClusterFSID and CephClientKey are overridden.

Changed in tripleo:
milestone: ocata-rc1 → ocata-rc2
Revision history for this message
Dougal Matthews (d0ugal) wrote :

I suspect this is different in the CLI because we use python-heatclient to merge the files before they are uploaded to Swift. We do this in the CLI mostly for legacy reasons and obviously it is tricky for the GUI to use heatclient!

I think this needs to be fixed in the tripleo-common workflows, I am not sure how easy/hard it will be. Essentially I think we will need to do a bit more processing of the templates.

It might be a good idea to have some input from a Heat expert.

tags: added: workflows
Julie Pichon (jpichon)
Changed in tripleo:
milestone: ocata-rc2 → pike-1
tags: added: newton-backport-potential ocata-backport-potential
Revision history for this message
Julie Pichon (jpichon) wrote :

From the UI's perspective there are workarounds to manually configure the values.

For Ocata: after enabling the "Externally managed Ceph" storage environment, click on the "Parameters" tab (same "Deployment Configuration" window), select "Externally managed Ceph" in the left-side menu and change the value for CephClientKey and other parameters as needed.

For Newton: after enabling the "Externally managed Ceph" storage environment, return to the main Deployment page, click on the pencil for the Controller card and navigate to the Services Tab. The CephClientKey and other settings are located under OS::TripleO::Services::CephExternal.

I'm not sure yet about the real fix to make sure the correct values are surfaced and override the default passwords correctly for UI deploys. I suspect bug 1635409 and other similar bugs around aligning the CLI and UI templates processing will help, but it's uncertain whether the relevant parts of that work will be backportable. I'm removing these tags for now based on this.

tags: removed: newton-backport-potential ocata-backport-potential
Revision history for this message
Dougal Matthews (d0ugal) wrote :

I had another look at this, and think I actually understood it this time. This works in tripleoclient because it merges the templates and environments and adds them to the Mistral environment parameter_defaults.

Currently the order in which parameters are used is this:

1. Mistral environment parameter_defaults (so the CLI wins by setting parameters here)
2. Mistral environment passwords
3. Environment files
4. Template defaults

1 and 2 are merged and added here: https://github.com/openstack/tripleo-common/blob/master/tripleo_common/actions/templates.py#L273-L288

3. is added here: https://github.com/openstack/tripleo-common/blob/master/tripleo_common/actions/templates.py#L264-L271

So, this means that a CephClientKey in parameter_defaults (user set), and after that the one in passwords (generated), always wins over environment files. Only manual edits in the GUI would overwrite the generated passwords. This isn't specific to CephClientKey and will be the same for all generated passwords.
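The precedence described above can be modelled with plain dictionaries. This is a simplified sketch, not the tripleo-common code linked above; `effective_parameters` and the values are made up for illustration.

```python
def effective_parameters(template_defaults, environment_files, passwords, parameter_defaults):
    """Merge sources from lowest to highest precedence; later updates win."""
    merged = dict(template_defaults)       # 4. template defaults
    for env in environment_files:          # 3. environment files
        merged.update(env)
    merged.update(passwords)               # 2. Mistral environment passwords
    merged.update(parameter_defaults)      # 1. Mistral environment parameter_defaults
    return merged


env_file = {"CephClientKey": "USER-KEY", "CephExternalMonHost": "192.0.2.10"}
generated = {"CephClientKey": "GENERATED-KEY"}

# GUI: the user's key only lives in the environment file, so the generated one wins.
gui = effective_parameters({}, [env_file], generated, {})
# CLI: tripleoclient also puts the merged values into parameter_defaults.
cli = effective_parameters({}, [env_file], generated, dict(env_file))

print(gui["CephClientKey"])        # GENERATED-KEY
print(cli["CephClientKey"])        # USER-KEY
print(gui["CephExternalMonHost"])  # 192.0.2.10
```

Parameters that have no generated counterpart, such as CephExternalMonHost, pass through from the environment file untouched, which matches what was observed earlier in this report.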

There are a couple of options that I see.

1. Users can disable password generation if they want to set them manually like this (but this is all or nothing, you can't stop generation for specific passwords only).

2. We change the way passwords are generated so they are only generated when no password is given in the environment file or template (we can't tell the difference after merging). I think this would mean moving password generation to the point when a user deploys, since at the moment passwords are generated before environments are enabled.

The first is likely a good workaround for now and the second option is a possible longer term fix, but it will require a fair bit of work.
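As a rough illustration of the second option, generation could skip any name the user has already supplied. This is a sketch under assumptions: `generate_missing_passwords` is hypothetical, and `secrets.token_urlsafe` merely stands in for the real generators (Ceph keys, for instance, have their own format).

```python
import secrets


def generate_missing_passwords(names, user_supplied):
    """Generate a value only for names the user did not set in templates or environment files."""
    return {
        name: secrets.token_urlsafe(18)  # stand-in for the real per-type generator
        for name in names
        if name not in user_supplied
    }


# Values collected from enabled environment files at deploy time (placeholder data).
user_supplied = {"CephClientKey": "USER-KEY"}
generated = generate_missing_passwords(["CephClientKey", "NovaPassword"], user_supplied)

print(sorted(generated))  # ['NovaPassword']
```

This only works if the user-supplied values are known before generation runs, which is why the option implies generating at deploy time rather than at plan creation.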

Changed in tripleo:
milestone: pike-1 → pike-2
Julie Pichon (jpichon)
Changed in tripleo:
assignee: Julie Pichon (jpichon) → nobody
Changed in tripleo:
milestone: pike-2 → pike-3
Changed in tripleo:
milestone: pike-3 → pike-rc1
Changed in tripleo:
milestone: pike-rc1 → pike-rc2
Changed in tripleo:
milestone: pike-rc2 → queens-1
Changed in tripleo:
milestone: queens-1 → queens-2
Changed in tripleo:
milestone: queens-2 → queens-3
Changed in tripleo:
assignee: nobody → Adriano Petrich (apetrich)
Changed in tripleo:
milestone: queens-3 → queens-rc1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (master)

Fix proposed to branch: master
Review: https://review.openstack.org/542143

Changed in tripleo:
status: Triaged → In Progress
Changed in tripleo:
milestone: queens-rc1 → rocky-1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-common (master)

Change abandoned by Adriano Petrich (<email address hidden>) on branch: master
Review: https://review.openstack.org/542143
Reason: this is not the way to go. We need to wait until we have https://bugs.launchpad.net/tripleo/+bug/1635409 fixed to find a solution

Revision history for this message
Adriano Petrich (apetrich) wrote :

I don't think we can move forward until we have https://bugs.launchpad.net/tripleo/+bug/1635409 fixed.

Revision history for this message
Adriano Petrich (apetrich) wrote :

More precisely, this needs to be fixed: https://review.openstack.org/#/c/448209

Changed in tripleo:
milestone: rocky-1 → rocky-2
Changed in tripleo:
milestone: rocky-2 → rocky-3
Changed in tripleo:
assignee: Adriano Petrich (apetrich) → nobody
status: In Progress → Triaged
Changed in tripleo:
milestone: rocky-3 → rocky-rc1
Changed in tripleo:
milestone: rocky-rc1 → stein-1
Changed in tripleo:
milestone: stein-1 → stein-2
Revision history for this message
Emilien Macchi (emilienm) wrote : Cleanup EOL bug report

This is an automated cleanup. This bug report has been closed because it
is older than 18 months and there is no open code change to fix this.
After this time it is unlikely that the circumstances which lead to
the observed issue can be reproduced.

If you can reproduce the bug, please:
* reopen the bug report (set to status "New")
* AND add the detailed steps to reproduce the issue (if applicable)
* AND leave a comment "CONFIRMED FOR: <RELEASE_NAME>"
  Only still supported release names are valid (FUTURE, PIKE, QUEENS, ROCKY, STEIN).
  Valid example: CONFIRMED FOR: FUTURE

Changed in tripleo:
importance: High → Undecided
status: Triaged → Expired