Redis THT templates contain malformed metadata_settings

Bug #1838679 reported by Harry Rybacki
Affects   Status         Importance   Assigned to     Milestone
tripleo   Fix Released   High         Harry Rybacki
Queens    Fix Released   High         Harry Rybacki
Rocky     Fix Released   High         Harry Rybacki

Bug Description

Description
===========
Malformed THT templates (metadata_settings specifically) for Redis result in service principals not being created by the novajoin service. As a result, the `getcert` request fails with a permission error during Step2 of the deployment.
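
For context, here is a rough sketch of what a well-formed metadata_settings output for Redis might look like. It is illustrative only and not copied from the merged fix. It assumes a `RedisNetwork` key in `ServiceNetMap`, the `internal_tls_enabled` condition named in the fix's commit message, and that the "both service principals" mentioned there correspond to a `vip` entry and a `node` entry.

```yaml
# Sketch under the assumptions stated above; not the actual template.
outputs:
  role_data:
    value:
      service_name: redis
      metadata_settings:
        if:
          - internal_tls_enabled
          # A single flat list of entries, one per service principal.
          - - service: redis
              network: {get_param: [ServiceNetMap, RedisNetwork]}
              type: vip
            - service: redis
              network: {get_param: [ServiceNetMap, RedisNetwork]}
              type: node
          # No metadata when internal TLS is disabled.
          - null
```

The point of the sketch is the shape: a flat list whose entries name the redis service and its own network, so that novajoin has the information it needs to create the Redis principals.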

Steps to reproduce
==================
1. Deploy non-HA undercloud with queens or rocky bits using FreeIPA as your CA.
2. Attempt to deploy overcloud with internal TLS via TripleO e.g.:

openstack overcloud deploy \
    --templates \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-everywhere-endpoints-dns.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/enable-internal-tls.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/haproxy-public-tls-certmonger.yaml \
    -e /home/stack/cloud-names.yaml \
    -e /home/stack/misc-bits.yaml

Expected result
===============
Novajoin adds service principal for Redis to FreeIPA. Overcloud deploys successfully.

Actual result
=============
Deployment blows up during Step2 when `getcert request` is invoked to fetch a certificate for Redis, because the request lacks permissions (the service principal for Redis was not added to IdM).

Environment
===========
1. Found the bug in Queens and verified it also exists in Rocky. The issue was resolved during an architectural shift between Rocky and Stein, so it does not affect releases beyond Rocky.

2. Which storage type did you use?
   Default storage

3. I used FreeIPA as my CA but this should reproduce with other CAs.

Logs and Configs
================

## Overcloud deploy invocation ##

openstack overcloud deploy \
    --templates \
    --ntp-server clock1.rdu2.redhat.com \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-everywhere-endpoints-dns.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/enable-internal-tls.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/haproxy-public-tls-certmonger.yaml \
    -e /home/stack/cloud-names.yaml \
    -e /home/stack/misc-bits.yaml

## cloud-names.yaml ##

parameter_defaults:
  CloudDomain: ooo.test
  CloudName: overcloud.ooo.test
  CloudNameInternal: overcloud.internalapi.ooo.test
  CloudNameStorage: overcloud.storage.ooo.test
  CloudNameStorageManagement: overcloud.storagemgmt.ooo.test
  CloudNameCtlplane: overcloud.ctlplane.ooo.tes

## misc-bits.yaml ##

parameter_defaults:
  DnsServers: ["192.168.1.12"] # <-- FreeIPA server

## Deployment log ##

2019-08-01 18:11:32Z [overcloud-AllNodesDeploySteps-yrw4c7uy3r3v-ControllerDeployment_Step1-2mx22mczn24y.0]: SIGNAL_IN_PROGRESS Signal: deployment 0fc2d36c-fa62-4565-92fb-cf43295675ce failed (2)
2019-08-01 18:11:33Z [overcloud-AllNodesDeploySteps-yrw4c7uy3r3v-ControllerDeployment_Step1-2mx22mczn24y.0]: UPDATE_FAILED Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2019-08-01 18:11:33Z [overcloud-AllNodesDeploySteps-yrw4c7uy3r3v-ControllerDeployment_Step1-2mx22mczn24y]: UPDATE_FAILED Resource UPDATE failed: Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2019-08-01 18:11:33Z [overcloud-AllNodesDeploySteps-yrw4c7uy3r3v.ControllerDeployment_Step1]: UPDATE_FAILED Error: resources.ControllerDeployment_Step1.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2019-08-01 18:11:33Z [overcloud-AllNodesDeploySteps-yrw4c7uy3r3v]: UPDATE_FAILED Resource UPDATE failed: Error: resources.ControllerDeployment_Step1.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2019-08-01 18:11:33Z [AllNodesDeploySteps]: UPDATE_FAILED Error: resources.AllNodesDeploySteps.resources.ControllerDeployment_Step1.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2019-08-01 18:11:34Z [overcloud]: UPDATE_FAILED Resource UPDATE failed: Error: resources.AllNodesDeploySteps.resources.ControllerDeployment_Step1.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2

 Stack overcloud UPDATE_FAILED

## Controller journalctl ##

Jul 29 17:41:23 overcloud-controller-0.ooo.test certmonger[19146]: 2019-07-29 17:41:23 [19146] Server at https://ipa.ooo.test/ipa/xml denied our request, giving up: 2100 (RPC failed at server. Insufficient acces
Jul 29 17:41:23 overcloud-controller-0.ooo.test certmonger[19318]: Request for certificate to be stored in file "/etc/pki/tls/certs/redis.crt" rejected by CA.
Jul 29 17:41:23 overcloud-controller-0.ooo.test puppet-user[18574]: Could not get certificate: Execution of '/usr/bin/getcert request -I redis -f /etc/pki/tls/certs/redis.crt -c IPA -N CN=overcloud-controller-0.i
Jul 29 17:41:23 overcloud-controller-0.ooo.test puppet-user[18574]: (/Stage[main]/Tripleo::Certmonger::Redis/Certmonger_certificate[redis]) Could not evaluate: Could not get certificate: Server at https://ipa.ooo
Jul 29 17:41:23 overcloud-controller-0.ooo.test puppet-user[18574]: (/Stage[main]/Tripleo::Certmonger::Redis/File[/etc/pki/tls/certs/redis.crt]) Dependency Certmonger_certificate[redis] has failures: true
Jul 29 17:41:23 overcloud-controller-0.ooo.test puppet-user[18574]: (/Stage[main]/Tripleo::Certmonger::Redis/File[/etc/pki/tls/certs/redis.crt]) Skipping because of failed dependencies
Jul 29 17:41:23 overcloud-controller-0.ooo.test puppet-user[18574]: (/Stage[main]/Tripleo::Certmonger::Redis/File[/etc/pki/tls/private/redis.key]) Dependency Certmonger_certificate[redis] has failures: true
Jul 29 17:41:23 overcloud-controller-0.ooo.test puppet-user[18574]: (/Stage[main]/Tripleo::Certmonger::Redis/File[/etc/pki/tls/private/redis.key]) Skipping because of failed dependencies

Harry Rybacki (hrybacki-h) wrote :

Assigning self -- working on fix for Queens/Rocky presently. As noted in the description the issue was fixed during a big code shift from Rocky-->Stein.

Changed in tripleo:
importance: Undecided → High
milestone: none → train-3
status: New → Triaged
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/674106

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/rocky)

Reviewed: https://review.opendev.org/674106
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=b96b049f983662ea0badbca5d4f7b0e95b880338
Submitter: Zuul
Branch: stable/rocky

commit b96b049f983662ea0badbca5d4f7b0e95b880338
Author: Harry Rybacki <email address hidden>
Date: Thu Aug 1 14:40:19 2019 -0400

    Fix broken metadata_settings for redis templates

    metadata_settings in docker/services/redis.yaml was returning a list
    of two items rather than one as expected. As a result, the compact/
    managedby service principals were not being created by novajoin service.
    This results in a permission issue during overcloud deploy as the
    `getcert` request will hit a permissions issue during Step2.

    Note that this only affects Rocky and earlier branches. The issue was
    resolved in Stein when redis service was flattened[1,2].

    - Push tls logic into redis-base and consume in child templates.
    - Move away from use_tls_proxy to more accurate internal_tls_enabled
    - Ensure redis service has both service principals created if internal
      tls is enabled
    [1] - https://review.opendev.org/#/c/635930/
    [2] - https://review.opendev.org/640944

    Change-Id: Ic781905b63a0635b7bd0c7079fa84ca1e7f93989
    Partial-bug: #1838679

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/674369

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/queens)

Reviewed: https://review.opendev.org/674369
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=e0f50b4b3aa1fc0c2dc4da5c596b776e1c34c6f2
Submitter: Zuul
Branch: stable/queens

commit e0f50b4b3aa1fc0c2dc4da5c596b776e1c34c6f2
Author: Harry Rybacki <email address hidden>
Date: Thu Aug 1 14:40:19 2019 -0400

    Fix broken metadata_settings for redis templates

    metadata_settings in docker/services/redis.yaml was returning a list
    of two items rather than one as expected. As a result, the compact/
    managedby service principals were not being created by novajoin service.
    This results in a permission issue during overcloud deploy as the
    `getcert` request will hit a permissions issue during Step2.

    Note that this only affects Rocky and earlier branches. The issue was
    resolved in Stein when redis service was flattened[1,2].

    - Push tls logic into redis-base and consume in child templates.
    - Move away from use_tls_proxy to more accurate internal_tls_enabled
    - Ensure redis service has both service principals created if internal
      tls is enabled
    [1] - https://review.opendev.org/#/c/635930/
    [2] - https://review.opendev.org/640944

    Change-Id: Ic781905b63a0635b7bd0c7079fa84ca1e7f93989
    Partial-bug: #1838679
    (cherry picked from commit b96b049f983662ea0badbca5d4f7b0e95b880338)

tags: added: in-stable-queens
Changed in tripleo:
status: Triaged → Fix Released
Changed in tripleo:
status: Fix Released → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/678218

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/rocky)

Reviewed: https://review.opendev.org/678218
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=6a514fb14371c088240bcb83615a15dd78f1a9fb
Submitter: Zuul
Branch: stable/rocky

commit 6a514fb14371c088240bcb83615a15dd78f1a9fb
Author: Harry Rybacki <email address hidden>
Date: Fri Aug 23 08:58:02 2019 -0400

    Redis metadata using incorrect network/service

    Metadata was set to use mysql network and service. This resulted
    in Redis service principals never being generated for TLS-E
    deployments.

    Change-Id: I55fecf3c7d02342b28570c1ad75edc927afa19ce
    Partial-bug: #1838679

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/678539

Harry Rybacki (hrybacki-h) wrote :

stable/rocky fixes have landed. Proposing cherry-pick back to stable/queens.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/queens)

Reviewed: https://review.opendev.org/678539
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=6006f14005c0c52b45ad0e5cc91b2f3a06db28d2
Submitter: Zuul
Branch: stable/queens

commit 6006f14005c0c52b45ad0e5cc91b2f3a06db28d2
Author: Harry Rybacki <email address hidden>
Date: Fri Aug 23 08:58:02 2019 -0400

    Redis metadata using incorrect network/service

    Metadata was set to use mysql network and service[1]. This resulted
    in Redis service principals never being generated for TLS-E
    deployments.

    [1] - https://review.opendev.org/#/c/674369/

    Change-Id: I55fecf3c7d02342b28570c1ad75edc927afa19ce
    Partial-bug: #1838679
    (cherry picked from commit 6a514fb14371c088240bcb83615a15dd78f1a9fb)

Changed in tripleo:
milestone: train-3 → ussuri-1
Changed in tripleo:
milestone: ussuri-1 → ussuri-2
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-2 → ussuri-3
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-3 → ussuri-rc3
Changed in tripleo:
status: In Progress → Fix Released