Legacy "volume" keystone service causes endpoint validation to fail

Bug #1897761 reported by Alan Bishop
This bug affects 2 people
Affects: tripleo
Status: Fix Released
Importance: Medium
Assigned to: Alan Bishop
Milestone: (none listed)

Bug Description

Cinder's v1 API was removed in queens, but it was necessary to retain its associated "volume" service in order to work around a bug in the queens-era tempest code. See [1] for details.

[1] https://review.opendev.org/649084

The problem is that this legacy "volume" service causes tripleo's keystone endpoint validation to fail after a fast-forward upgrade (FFU) from queens to train. The FFU itself succeeds, but subsequent attempts to update the stack fail like this:

TASK [tripleo-keystone-resources : Check Keystone public endpoint status] ******
Sunday 13 September 2020 11:14:39 +0000 (0:00:05.000) 0:25:13.826 ******
...
failed: [undercloud] (item={'started': 1, 'finished': 0, 'ansible_job_id': '668760418547.976504', 'results_file': '/root/.ansible_async/668760418547.976504', 'changed': True, 'failed': False, 'tripleo_keystone_resources_data': {'key': 'cinderv3', 'value': {'endpoints': {'admin': 'http://172.17.1.x:8776/v3/%(tenant_id)s', 'internal': 'http://172.17.1.x:8776/v3/%(tenant_id)s', 'public': 'http://10.0.0.x:8776/v3/%(tenant_id)s'}, 'region': 'regionOne', 'service': 'volumev3', 'users': {'cinderv3': {'password': 'blah', 'roles': ['admin', 'service']}}}}, 'ansible_loop_var': 'tripleo_keystone_resources_data'}) => {"ansible_job_id": "668760418547.976504", "ansible_loop_var": "tripleo_keystone_resources_endpoint_async_result_item", "attempts": 1, "changed": false, "finished": 1, "msg": "Multiple matches found for cinderv3", "tripleo_keystone_resources_endpoint_async_result_item": {"ansible_job_id": "668760418547.976504", "ansible_loop_var": "tripleo_keystone_resources_data", "changed": true, "failed": false, "finished": 0, "results_file": "/root/.ansible_async/668760418547.976504", "started": 1, "tripleo_keystone_resources_data": {"key": "cinderv3", "value": {"endpoints": {"admin": "http://172.17.1.x:8776/v3/%(tenant_id)s", "internal": "http://172.17.1.x:8776/v3/%(tenant_id)s", "public": "http://10.0.0.x:8776/v3/%(tenant_id)s"}, "region": "regionOne", "service": "volumev3", "users": {"cinderv3": {"password": "blah", "roles": ["admin", "service"]}}}}}}

The problem can be worked around by deleting the "volume" service, which also removes its associated endpoints.
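
The failure and the workaround can be illustrated with a small, self-contained Python sketch. This is a toy model only, not tripleo's, Ansible's, or keystone's actual code: the `Catalog` class, its methods, and the matching rule (look up an endpoint by interface and URL and expect exactly one hit) are assumptions made for illustration. The URLs and the "cinderv3" key are taken from the log above.

```python
from dataclasses import dataclass, field

# v3 endpoint URLs as they appear in the failing task output.
V3_URLS = {
    "admin": "http://172.17.1.x:8776/v3/%(tenant_id)s",
    "internal": "http://172.17.1.x:8776/v3/%(tenant_id)s",
    "public": "http://10.0.0.x:8776/v3/%(tenant_id)s",
}


@dataclass
class Catalog:
    """Toy stand-in for the keystone catalog (hypothetical, not a real API)."""
    services: set = field(default_factory=set)     # service types
    endpoints: list = field(default_factory=list)  # (service type, interface, url)

    def add_service(self, service_type, urls):
        self.services.add(service_type)
        for interface, url in urls.items():
            self.endpoints.append((service_type, interface, url))

    def delete_service(self, service_type):
        # Deleting a service also removes its associated endpoints.
        self.services.discard(service_type)
        self.endpoints = [e for e in self.endpoints if e[0] != service_type]

    def find_endpoint(self, key, interface, url):
        # Assumed matching rule: exactly one endpoint per interface and URL.
        matches = [e for e in self.endpoints if e[1] == interface and e[2] == url]
        if len(matches) > 1:
            raise LookupError(f"Multiple matches found for {key}")
        return matches[0] if matches else None


def check_public_endpoint(catalog):
    """Return "ok" or the error message of the failed lookup."""
    try:
        catalog.find_endpoint("cinderv3", "public", V3_URLS["public"])
        return "ok"
    except LookupError as exc:
        return str(exc)


# Post-FFU state: the legacy "volume" service shares the v3 endpoints.
catalog = Catalog()
catalog.add_service("volumev3", V3_URLS)
catalog.add_service("volume", V3_URLS)

before = check_public_endpoint(catalog)
catalog.delete_service("volume")  # the workaround
after = check_public_endpoint(catalog)

print(before)
print(after)
```

In this model the lookup is ambiguous only while both services carry the same endpoint URLs. Once the legacy service and its endpoints are removed, the same check finds a single match.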

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.opendev.org/755070

Changed in tripleo:
status: Triaged → In Progress

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/755070
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=0fdef46ea0ca9045f85a0a7857884a246a424169
Submitter: Zuul
Branch: master

commit 0fdef46ea0ca9045f85a0a7857884a246a424169
Author: Alan Bishop <email address hidden>
Date: Mon Sep 28 16:21:19 2020 -0700

    [FFU] Remove cinder's v1 keystone service

    Remove cinder's "volume" (API v1) service from the keystone catalog.
    This fixes a post-FFU bug that causes keystone endpoint validation to
    fail. Cinder stopped supporting its v1 API in queens, but tripleo
    retained the "volume" service (with API v3 endpoints) to work around
    a bug in the version of tempest used in queens (see [1] for details).
    The endpoint validation fails because the "volume" and "volumev3"
    services share the same v3 endpoints.

    [1] https://review.opendev.org/#/q/If1ef8b1ad60151c0dfd0a7804ba7e697fc4ede28

    The patch was tested locally:
    - Confirm a fresh deployment (with patch) succeeds
    - Manually create "volume" service with "cinderv3" endpoints. This
      replicates the post-FFU scenario
    - Perform a stack update (succeeds), and confirm the "volume" service
      has been deleted

    Final note: The ansible task that removes the "volume" service is a
    deployment (not upgrade) task. This ensures the service is removed from
    overcloud deployments that already performed the FFU.

    Closes-Bug: #1897761
    Change-Id: Ic0eb72f78e2a19e2f40ab12631a872d828bab46a
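
The final note in the commit message above implies that the removal has to be safe to run on every deployment, whether or not the legacy service exists. A minimal Python sketch of that "ensure absent" behaviour follows. It is an illustration only, not the ansible task from the patch: `ensure_service_absent` and the sets of service types are hypothetical.

```python
def ensure_service_absent(services, service_type):
    """Remove service_type from a set of catalog service types if present.

    Returns True when something was removed ("changed") and False when
    the service was already absent, so repeated runs are harmless.
    """
    if service_type in services:
        services.discard(service_type)
        return True
    return False


# Fresh deployment: the legacy service was never created, so nothing changes.
fresh = {"volumev3"}
fresh_changed = ensure_service_absent(fresh, "volume")

# Overcloud that already performed the FFU: the legacy service is still there.
post_ffu = {"volume", "volumev3"}
first_run_changed = ensure_service_absent(post_ffu, "volume")
second_run_changed = ensure_service_absent(post_ffu, "volume")

print(fresh_changed, first_run_changed, second_run_changed)
```

Because the check runs as part of a normal deployment rather than only during an upgrade, a stack update on an already-upgraded overcloud removes the service on the first run and does nothing afterwards. This mirrors the three local test steps listed in the commit message.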

Changed in tripleo:
status: In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/755846

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (stable/ussuri)

Change abandoned by wes hayutin (<email address hidden>) on branch: stable/ussuri
Review: https://review.opendev.org/755846
Reason: hitting infra related retries

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/ussuri)

Reviewed: https://review.opendev.org/755846
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=99220e0ca656ddd1dcada5e025e226bc86409998
Submitter: Zuul
Branch: stable/ussuri

commit 99220e0ca656ddd1dcada5e025e226bc86409998
Author: Alan Bishop <email address hidden>
Date: Mon Sep 28 16:21:19 2020 -0700

    [FFU] Remove cinder's v1 keystone service

    Remove cinder's "volume" (API v1) service from the keystone catalog.
    This fixes a post-FFU bug that causes keystone endpoint validation to
    fail. Cinder stopped supporting its v1 API in queens, but tripleo
    retained the "volume" service (with API v3 endpoints) to work around
    a bug in the version of tempest used in queens (see [1] for details).
    The endpoint validation fails because the "volume" and "volumev3"
    services share the same v3 endpoints.

    [1] https://review.opendev.org/#/q/If1ef8b1ad60151c0dfd0a7804ba7e697fc4ede28

    The patch was tested locally:
    - Confirm a fresh deployment (with patch) succeeds
    - Manually create "volume" service with "cinderv3" endpoints. This
      replicates the post-FFU scenario
    - Perform a stack update (succeeds), and confirm the "volume" service
      has been deleted

    Final note: The ansible task that removes the "volume" service is a
    deployment (not upgrade) task. This ensures the service is removed from
    overcloud deployments that already performed the FFU.

    Closes-Bug: #1897761
    Change-Id: Ic0eb72f78e2a19e2f40ab12631a872d828bab46a
    (cherry picked from commit 0fdef46ea0ca9045f85a0a7857884a246a424169)

tags: added: in-stable-ussuri

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/757414

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/train)

Reviewed: https://review.opendev.org/757414
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=74a6ac6ebe2924411affd81b7739d960ed61941a
Submitter: Zuul
Branch: stable/train

commit 74a6ac6ebe2924411affd81b7739d960ed61941a
Author: Alan Bishop <email address hidden>
Date: Mon Sep 28 16:21:19 2020 -0700

    [FFU] Remove cinder's v1 keystone service

    Remove cinder's "volume" (API v1) service from the keystone catalog.
    This fixes a post-FFU bug that causes keystone endpoint validation to
    fail. Cinder stopped supporting its v1 API in queens, but tripleo
    retained the "volume" service (with API v3 endpoints) to work around
    a bug in the version of tempest used in queens (see [1] for details).
    The endpoint validation fails because the "volume" and "volumev3"
    services share the same v3 endpoints.

    [1] https://review.opendev.org/#/q/If1ef8b1ad60151c0dfd0a7804ba7e697fc4ede28

    The patch was tested locally:
    - Confirm a fresh deployment (with patch) succeeds
    - Manually create "volume" service with "cinderv3" endpoints. This
      replicates the post-FFU scenario
    - Perform a stack update (succeeds), and confirm the "volume" service
      has been deleted

    Final note: The ansible task that removes the "volume" service is a
    deployment (not upgrade) task. This ensures the service is removed from
    overcloud deployments that already performed the FFU.

    Closes-Bug: #1897761
    Change-Id: Ic0eb72f78e2a19e2f40ab12631a872d828bab46a
    (cherry picked from commit 0fdef46ea0ca9045f85a0a7857884a246a424169)
    (cherry picked from commit 99220e0ca656ddd1dcada5e025e226bc86409998)

tags: added: in-stable-train

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 12.4.2

This issue was fixed in the openstack/tripleo-heat-templates 12.4.2 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 11.4.0

This issue was fixed in the openstack/tripleo-heat-templates 11.4.0 release.
