os-*-hostname change renders cloud unusable

Bug #1902264 reported by Facundo Ciccioli
This bug affects 3 people
Affects: OpenStack Keystone Charm
Status: Fix Released
Importance: Low
Assigned to: Chris MacNaughton
Milestone: (not set)

Bug Description

We tried changing the URL of a whole cloud, and immediately after the change we lost the ability to log in to Horizon (it gave an authentication error) or to use the CLI.

The way we made the change was the following:

* export the bundle
* replace FQDNs (s/*.old-domain/*.new-domain/)
* replace ssl_cert option in any charm that uses it
* (if the cert is signed by a new CA) update the ssl_ca option on any charm that uses it (NOTE: openstack-service-checks uses trusted_ssl_ca)
* deploy the new bundle
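
The FQDN replacement step above could be scripted roughly as follows. This is only a sketch: `rewrite_domain` is a hypothetical helper, not part of Juju or the charms, and it assumes the exported bundle is handled as plain text in which hostnames appear as `<host>.<old-domain>`.

```python
import re


def rewrite_domain(bundle_text: str, old_domain: str, new_domain: str) -> str:
    """Move every hostname under old_domain to the same host under new_domain.

    Hypothetical helper for illustration; it treats the bundle as plain text.
    """
    pattern = re.compile(
        # One or more host labels (a leading "*" is allowed for wildcards) ...
        r"(?P<host>(?:[A-Za-z0-9*-]+\.)+)"
        # ... followed by the old domain ...
        + re.escape(old_domain)
        # ... which must not continue into a longer, unrelated hostname.
        + r"(?![A-Za-z0-9.-])"
    )
    return pattern.sub(lambda m: m.group("host") + new_domain, bundle_text)
```

Certificates are not covered by this: the ssl_cert and ssl_ca options still have to be replaced separately, as in the remaining steps.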

The cloud is running xenial-queens, and the keystone charm is built from commit b7b4e43 (revision 314).

Unfortunately, this was done quite some time ago and we don't have all the logs available.

However, I did find a set of bugs that might be related:

LP 1826382
LP 1867305
LP 1663696

description: updated
Dorina Timbur (dorina-t) wrote :

To clarify further on the impact previously observed: after the FQDN was changed, the customer could no longer access the cloud from outside. Logging in via Horizon failed with "An error occurred authenticating. Please try again later.", and they couldn't authenticate with either a local or an LDAP account. Swift was still pointing to the previous URL.
It would be great if someone from product could replicate an FQDN change in a lab and provide documentation on how best to do it in a production environment.

Changed in charm-keystone:
assignee: nobody → Chris MacNaughton (chris.macnaughton)
Chris MacNaughton (chris.macnaughton) wrote :

Looking at the nova-cloud-controller charm, it looks like endpoint evaluation is only ever done in the relation-joined hook with keystone (https://github.com/openstack/charm-nova-cloud-controller/blob/25da3180b53abc9843cba37b12e08258de8644bf/hooks/nova_cc_hooks.py#L443). I haven't yet evaluated if that's the case across the rest of the charms but I suspect it is.

What that means is that whatever hostnames are configured when the initial relation to keystone is made are the hostnames that will be put into the keystone catalog, and the config option will then never be re-evaluated in the context of the keystone relation. The charms consuming the keystone identity-service relation need to be updated to ensure that they propagate URL updates to Keystone.

no longer affects: charm-nova-cloud-controller
Chris MacNaughton (chris.macnaughton) wrote :

Looking more thoroughly, nova-cloud-controller does update the hostnames in the config-changed hook and in many of the relation hooks.

Chris MacNaughton (chris.macnaughton) wrote :

I have validated, with the latest charms in ~openstack-charmers-next, that the hostnames in the keystone catalog change correctly when the config is updated, although it does take a few minutes. Additionally, I've confirmed that using a wildcard certificate (*.first-domain) and then updating that certificate (*.second.domain) at the same time as the hostname change also works as expected, and that radosgw and the dashboard continue to work. I am able to access the (updated) Swift endpoint in the catalog as well as log in to the dashboard.

I'm going to close this bug as incomplete. If it can be reproduced, please include more information about how to reproduce it and update it to New!

Changed in charm-keystone:
status: New → Incomplete
James Troup (elmo) wrote :

Hi, it'd be *really* helpful if your validation could be done with released stable charms, because that's what we run on customer clouds. Proving it's broken with *next* charms is... not particularly helpful to us.

James Troup (elmo) wrote :

s/it's broken/it's NOT broken/

Chris MacNaughton (chris.macnaughton) wrote :

Sure. The longest part of my test was getting everything set up so that I was using hostnames and could update everything; I will re-run the same check on the stable charms.

Chris MacNaughton (chris.macnaughton) wrote :

As with the -next charms, the stable charms have successfully updated all endpoints in the keystone catalog:

+-----------+--------------+------------------------------------------------------------+
| Name      | Type         | Endpoints                                                  |
+-----------+--------------+------------------------------------------------------------+
| nova      | compute      | RegionOne                                                  |
|           |              | public: https://ncc.new-os.test.alph.ac:8774/v2.1          |
|           |              | RegionOne                                                  |
|           |              | internal: https://ncc.new-os.test.alph.ac:8774/v2.1        |
|           |              | RegionOne                                                  |
|           |              | admin: https://ncc.new-os.test.alph.ac:8774/v2.1           |
|           |              |                                                            |
| s3        | s3           | RegionOne                                                  |
|           |              | internal: https://swift.new-os.test.alph.ac:444/           |
|           |              | RegionOne                                                  |
|           |              | public: https://swift.new-os.test.alph.ac:444/             |
|           |              | RegionOne                                                  |
|           |              | admin: https://swift.new-os.test.alph.ac:444/              |
|           |              |                                                            |
| neutron   | network      | RegionOne                                                  |
|           |              | admin: https://neutron.new-os.test.alph.ac:9696            |
|           |              | RegionOne                                                  |
|           |              | internal: https://neutron.new-os.test.alph.ac:9696         |
|           |              | RegionOne                                                  |
|           |              | public: https://neutron.new-os.test.alph.ac:9696           |
|           |              |                                                            |
| placement | placement    | RegionOne                                                  |
|           |              | internal: https://placement.new-os.test.alph.ac:8778       |
|           |              | RegionOne                                                  |
|           |              | admin: https://placement.new-os.test.alph.ac:8778          |
|           |              | RegionOne                                                  |
|           |              | public: https://placement.new-os.test.alph.ac:8778         |
|           |              |                                                            |
| keystone  | identity     | RegionOne                                                  |
|           |              | internal: https://keystone.new-os.test.alph.ac:5000/v3     |
|           |              | RegionOne                                                  |
|           |              | ...
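
A catalog listing like the one above can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the table has been saved as text; `stale_endpoints` is a hypothetical helper, and the only thing it relies on is the `interface: URL` cell layout visible in the output above.

```python
from typing import List
from urllib.parse import urlsplit


def stale_endpoints(table_text: str, new_domain: str) -> List[str]:
    """Return endpoint URLs in the table whose host is not under new_domain."""
    stale = []
    for line in table_text.splitlines():
        # Split the row into cells; the endpoint is always the last cell.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        cell = cells[-1]
        # Endpoint cells look like "public: https://host:port/path".
        interface, sep, url = cell.partition(": ")
        if not sep or interface not in ("public", "internal", "admin"):
            continue  # border, header, region, blank, or truncated row
        host = urlsplit(url).hostname or ""
        if host != new_domain and not host.endswith("." + new_domain):
            stale.append(url)
    return stale
```

An empty result for `new_domain="new-os.test.alph.ac"` would mean every endpoint in the saved listing already uses the new name.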


Facundo Ciccioli (fandanbango) wrote :

Hi there. I'm attaching the extract of the juju status with the charms' versions where the issue was observed.

Facundo Ciccioli (fandanbango) wrote :
Xav Paice (xavpaice) wrote :

Hi,

Using charm cs:keystone-319, when we changed os-admin-hostname, os-internal-hostname, and os-public-hostname, we found that on the leader unit the relation data provided over the identity-service relation was updated correctly, but on the two non-leader keystone units the same data was not updated (i.e. it still held the old names). This caused some services to fail to update their config, causing issues.

By manually updating the relation data as follows, we were able to work around the problem:

relation-set -r identity-service:$i auth_host=$newhost service_host=$newhost
relation-set -r identity-credentials:$i credentials_host=$newhost auth_host=$newhost

Note that this was seen on both the identity-service and identity-credentials relations.
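
To decide which units need the manual relation-set above, the per-unit relation data can be compared against the leader's. The sketch below assumes the data has already been collected into a dictionary keyed by unit name (for example by running relation-get on each unit); `units_with_stale_hosts` and the dictionary layout are hypothetical, and the default keys are the ones used in the identity-service workaround.

```python
def units_with_stale_hosts(relation_data, leader, keys=("auth_host", "service_host")):
    """Map each non-leader unit to the keys whose values differ from the leader's.

    relation_data: {unit_name: {key: value}}, collected by hand beforehand.
    """
    reference = relation_data[leader]
    stale = {}
    for unit, data in relation_data.items():
        if unit == leader:
            continue
        # A key counts as stale if it is missing or differs from the leader.
        differing = [k for k in keys if data.get(k) != reference.get(k)]
        if differing:
            stale[unit] = differing
    return stale
```

For the identity-credentials relation, the same check could be run with `keys=("credentials_host", "auth_host")`.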

Changed in charm-keystone:
status: Incomplete → New
Changed in charm-keystone:
status: New → Triaged
importance: Undecided → Low
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-keystone (master)
Changed in charm-keystone:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-keystone (master)

Reviewed: https://review.opendev.org/c/openstack/charm-keystone/+/804802
Committed: https://opendev.org/openstack/charm-keystone/commit/9b8b81a0bc8406f03b2de884eeec91b8e8f2d442
Submitter: "Zuul (22348)"
Branch: master

commit 9b8b81a0bc8406f03b2de884eeec91b8e8f2d442
Author: Chris MacNaughton <email address hidden>
Date: Mon Aug 16 16:55:14 2021 -0500

    Use the application data bag to set id and id_service notifications

    When purely using relation-set from a leader, updates after
    the leader has changed can lead to old data being persisted
    on a relation in addition to newer data being set by the new
    leader. When this happens, there can be issues with services
    using old data to talk to other related services.

    This change introduces the use of the application data bag
    to ensure that all units related to keystone get the same
    data from the leader, regardless of leadership changes.
    While this change enables the application data bag for these
    relations, it still sends the per-unit relation data as well
    to maintain backwards compatibility. Charms that consume the
    identity-service and identity-notification relations will
    need an update to use the application data bag to complete
    this change.

    Partial-Bug: #1902264
    Change-Id: Iadd795fec605e7704e5a6673906452279bbecb34
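
The consumer-side behaviour the commit message asks for can be pictured with a small model. This is an illustration under stated assumptions, not the charm's actual code: the two dictionaries stand in for the application data bag and the per-unit data bags, and `effective_relation_data` is a hypothetical name.

```python
def effective_relation_data(app_data, unit_data_by_unit):
    """Prefer the application data bag; fall back to per-unit data if it is empty.

    app_data: {key: value} as set by the current leader.
    unit_data_by_unit: {unit_name: {key: value}} as set by each unit.
    """
    if app_data:
        # A single source of truth: unaffected by what old leaders left behind.
        return dict(app_data)
    # Backwards-compatible path: merge per-unit data in unit-name order.
    # Whichever unit is merged last wins, so stale data from a former
    # leader can override newer values here.
    merged = {}
    for unit in sorted(unit_data_by_unit):
        merged.update(unit_data_by_unit[unit])
    return merged
```

In the fallback branch the result depends on which unit happens to be merged last, which mirrors the stale-data problem described in the commit message; the application data bag avoids that ambiguity.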

Changed in charm-keystone:
status: In Progress → Fix Released
no longer affects: charm-keystone/ussuri