[heat] Installation fails during update Q->R

Bug #1807346 reported by Christian Zunker
This bug affects 11 people
Affects: OpenStack-Ansible
Status: Won't Fix
Importance: Undecided
Assigned to: Guilherme Steinmuller Pimentel

Bug Description

During an update from Queens to Rocky, the heat installation fails at this step:
TASK [os_heat : Add service/heat user] ****************************************************************************************************************************************************************************************************************************************************************************************
changed: [ctr003_heat_api_container-b7577d8d] => (item=None)
FAILED - RETRYING: Add service/heat user (5 retries left).
FAILED - RETRYING: Add service/heat user (4 retries left).
FAILED - RETRYING: Add service/heat user (3 retries left).
FAILED - RETRYING: Add service/heat user (2 retries left).
FAILED - RETRYING: Add service/heat user (1 retries left).
failed: [ctr003_heat_api_container-b7577d8d] (item=None) => {"attempts": 5, "censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false}
fatal: [ctr003_heat_api_container-b7577d8d]: FAILED! => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": true}
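
The result above is censored because the task sets 'no_log: true'. As a minimal, self-contained sketch of how such output can be revealed temporarily, the playbook below gates no_log behind a toggle. The variable reveal_output and the failing demo command are hypothetical and are not part of the os_heat role; in the role itself the equivalent would be to set no_log to false on the affected task for one run. The items of the real task contain passwords, so any such change should be reverted afterwards.

```yaml
# Sketch only: reveal_output is a hypothetical toggle, not an os_heat variable.
- hosts: localhost
  gather_facts: false
  vars:
    reveal_output: false
  tasks:
    - name: Failing task whose result is censored unless reveal_output is true
      command: "false"
      # no_log hides the task result; here it is switched off on request.
      no_log: "{{ not (reveal_output | bool) }}"
```

Running the sketch with -e reveal_output=true should print the full result of the failing task instead of the "censored" placeholder.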

Enabling logging reveals this:
TASK [os_heat : Add service/heat user] *********************************************************************************************************************************************************************************************************************************************************************************************************************
changed: [ctr003_heat_api_container-b7577d8d -> localhost] => (item={u'domain': u'default', u'password': u'***', u'name': u'heat', u'default_project': u'service'})
FAILED - RETRYING: Add service/heat user (5 retries left).
FAILED - RETRYING: Add service/heat user (4 retries left).
FAILED - RETRYING: Add service/heat user (3 retries left).
FAILED - RETRYING: Add service/heat user (2 retries left).
FAILED - RETRYING: Add service/heat user (1 retries left).
failed: [ctr003_heat_api_container-b7577d8d -> localhost] (item={u'domain': u'heat', u'password': u'***', u'name': u'stack_domain_admin', u'default_project': u'admin'}) => {"attempts": 5, "changed": false, "extra_data": null, "item": {"default_project": "admin", "domain": "heat", "name": "stack_domain_admin", "password": "***"}, "msg": "Error in creating user stack_domain_admin: Client Error for url: https://172.29.236.254:5000/v3/users, {\"error\": {\"message\": \"Conflict occurred attempting to store user - Duplicate entry found with name stack_domain_admin at domain ID 20cd88c520474bb3a061fd0d36867ec1.\", \"code\": 409, \"title\": \"Conflict\"}}"}
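
The 409 Conflict suggests that stack_domain_admin already exists in the heat domain while the module's own lookup did not find it. One way to confirm this independently of os_user is sketched below as an Ansible playbook that calls the openstack CLI. It assumes that the CLI is installed on the host running the play and that a clouds.yaml entry named default is available there, as in the commands used elsewhere in this report; the registered variable name is hypothetical.

```yaml
# Sketch only: checks for the user inside the heat domain via the CLI.
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Look up stack_domain_admin inside the heat domain
      command: >-
        openstack --os-cloud default user show
        --domain heat stack_domain_admin -f value -c id
      register: heat_domain_user   # hypothetical variable name
      changed_when: false
      failed_when: false           # a missing user is an answer, not an error

    - name: Report whether the user already exists
      debug:
        msg: >-
          {{ 'exists with id ' ~ heat_domain_user.stdout
             if heat_domain_user.rc == 0
             else 'not found in domain heat' }}
```

If the user is reported as existing, the failure is in the lookup rather than in the creation.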

This is a bug in os_user:
https://github.com/ansible/ansible/issues/42901

Applying the change for get_user fixed the above step:
https://github.com/ansible/ansible/pull/42899/files

But it leads to this problem:
TASK [os_heat : Add service user to roles] ************************************************************************************************************************************************************************************************************************************************************************************
ok: [ctr003_heat_api_container-b7577d8d -> localhost] => (item={u'project': u'service', u'role': u'admin', u'user': u'heat'})
changed: [ctr003_heat_api_container-b7577d8d -> localhost] => (item={u'project': u'service', u'role': u'heat_stack_owner', u'user': u'heat'})
changed: [ctr003_heat_api_container-b7577d8d -> localhost] => (item={u'project': u'service', u'role': u'heat_stack_owner', u'user': u'admin'})
FAILED - RETRYING: Add service user to roles (5 retries left).
FAILED - RETRYING: Add service user to roles (4 retries left).
FAILED - RETRYING: Add service user to roles (3 retries left).
FAILED - RETRYING: Add service user to roles (2 retries left).
FAILED - RETRYING: Add service user to roles (1 retries left).
failed: [ctr003_heat_api_container-b7577d8d -> localhost] (item={u'project': u'admin', u'role': u'admin', u'user': u'stack_domain_admin'}) => {"attempts": 5, "changed": false, "item": {"project": "admin", "role": "admin", "user": "stack_domain_admin"}, "msg": "User stack_domain_admin is not valid"}

Next Ansible bug:
https://github.com/ansible/ansible/issues/42911

Fix:
https://github.com/ansible/ansible/pull/42913/files
+ a change to the heat task Add service user to roles:
    - name: Add service user to roles
      os_user_role:
        cloud: default
        state: present
        user: "{{ item.user }}"
        role: "{{ item.role }}"
        project: "{{ item.project }}"
        domain: "{{ item.domain }}"
        endpoint_type: admin
        verify: "{{ not keystone_service_adminuri_insecure }}"
      register: add_service
      when: not heat_service_in_ldap | bool
      until: add_service is success
      retries: 5
      delay: 10
      with_items:
        - user: "{{ heat_service_user_name }}"
          role: "{{ heat_service_role_name }}"
          project: "{{ heat_service_project_name }}"
          domain: default
        # We add the keystone role used by heat to delegate to the heat service user
        # for performing deferred operations via trusts.
        - user: "{{ heat_service_user_name }}"
          role: "{{ heat_stack_owner_name }}"
          project: "{{ heat_service_project_name }}"
          domain: default
        # Any user creating stacks needs to have the 'heat_stack_owner' role assigned.
        # We add to admin user here for testing purposes.
        - user: "{{ keystone_admin_user_name }}"
          role: "{{ heat_stack_owner_name }}"
          project: "{{ heat_service_project_name }}"
          domain: default
        # os_user_role needs an id
        - user: "{{ heat_stack_domain_admin }}"
          role: "{{ keystone_role_name | default('admin') }}"
          project:
          domain: "{{ add_stack_user_domain.id }}"

According to the heat docs, no project is specified for stack_domain_admin:
https://docs.openstack.org/heat/rocky/install/install-ubuntu.html

The result can be verified with:
$ openstack --os-cloud default role assignment list --user-domain heat --domain heat --user stack_domain_admin
+----------------------------------+----------------------------------+-------+---------+----------------------------------+--------+-----------+
| Role                             | User                             | Group | Project | Domain                           | System | Inherited |
+----------------------------------+----------------------------------+-------+---------+----------------------------------+--------+-----------+
| ffe96c912b304986a90f653678a34765 | d3c07637f90a445c8c942ada536d21ef |       |         | 20cd88c520474bb3a061fd0d36867ec1 |        | False     |
+----------------------------------+----------------------------------+-------+---------+----------------------------------+--------+-----------+
$ openstack --os-cloud default role show <uuid from previous command>
+-----------+----------------------------------+
| Field     | Value                            |
+-----------+----------------------------------+
| domain_id | None                             |
| id        | ffe96c912b304986a90f653678a34765 |
| name      | admin                            |
+-----------+----------------------------------+

Mohammed Naser (mnaser)
Changed in openstack-ansible:
assignee: nobody → Guilherme Steinmuller Pimentel (guilhermesp)
YG Kumar (ygk-kmr) wrote :

This is affecting our important installations. Please prioritize resolving it.

panic! (thomas-schend) wrote :

I can reproduce the problem after a fresh deploy of Rocky 18.1.4, then adding multidomain with LDAP and running a minor upgrade again. Any solution for this?

vignesh (vickysubam) wrote :

I am facing the same issue during a fresh install of Rocky 18.1.4; any permanent fix would be great.

Gilles Mocellin (gilles-mocellin) wrote :

Hello,

Should we upgrade Ansible to >= 2.7 to get domain support in the os_user* modules?
Or will the heat role go back to not using these Ansible modules?

This is a blocker for new installs and also breaks upgrades to Rocky.

roly (kabols) wrote :

I am facing the same issue during an upgrade from Queens to Rocky using openstack-ansible. Is there any workaround for this?

vignesh (vickysubam) wrote :

This issue also affects Magnum install.

panic! (thomas-schend) wrote :

Just an update on this one. As soon as you enable multidomain, even running the nova playbook fails.

This is really critical for us. Does anybody have a workaround?

Adam Vinsh (adam-vinsh) wrote :

Hey all, I hit this in Stein as well with multi-domain enabled. Workaround: check whether the role is already assigned:

openstack role assignment list --domain heat --names
+-------+-------------------------+-------+---------+--------+--------+-----------+
| Role  | User                    | Group | Project | Domain | System | Inherited |
+-------+-------------------------+-------+---------+--------+--------+-----------+
| admin | stack_domain_admin@heat |       |         | heat   |        | False     |
+-------+-------------------------+-------+---------+--------+--------+-----------+

If not, add the role manually:
openstack role add --user-domain heat --domain heat --user stack_domain_admin admin

Then just comment out these lines in the role's tasks/heat_service_setup.yml and move on:
# - user: "{{ heat_stack_domain_admin }}"
# role: "{{ keystone_role_name | default('admin') }}"
# domain: "{{ add_stack_user_domain.id }}"
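
The manual check and role assignment above can also be sketched as a small Ansible playbook, so that the workaround is repeatable. This is an illustration, not part of the os_heat role: it assumes the openstack CLI and a clouds.yaml entry named default on the host running the play (the comment above omits --os-cloud and presumably relies on environment credentials instead), and the registered variable name is hypothetical.

```yaml
# Sketch only: check for the domain-scoped admin role, add it if missing.
- hosts: localhost
  gather_facts: false
  tasks:
    - name: List role names assigned to stack_domain_admin on the heat domain
      command: >-
        openstack --os-cloud default role assignment list
        --user-domain heat --domain heat --user stack_domain_admin
        --names -f value -c Role
      register: heat_domain_roles   # hypothetical variable name
      changed_when: false

    - name: Add the admin role only if it is not assigned yet
      command: >-
        openstack --os-cloud default role add
        --user-domain heat --domain heat
        --user stack_domain_admin admin
      when: "'admin' not in heat_domain_roles.stdout_lines"
```

Checking first avoids depending on how the CLI behaves when the assignment already exists.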

Marc Gariépy (mgariepy) wrote :

For Stein, you need to patch Ansible. The issue is that the os_user_role module in Ansible doesn't pass the domain along when looking up stack_domain_admin. The following patch addresses that:

https://github.com/ansible/ansible/pull/42913/commits/405c5698ebae8de3fdeee34620e2b9581c1aeb7d

Dmitriy Rabotyagov (noonedeadpunk) wrote :

Sorry, at this point I must say it won't be fixed, since both Q and R have reached EOL.

I'm pretty sure though, that there are no issues with Heat/Magnum upgrades in currently supported versions, and this should work nicely at least since Train.

Changed in openstack-ansible:
status: New → Won't Fix