Reconfigure failing on creating magnum_trustee_domain_admin when domain_specific_drivers_enabled = True is set in keystone.conf

Bug #1714011 reported by GrzegorzKoper
This bug affects 11 people
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| kolla-ansible | Fix Released | Medium | Michal Nasiadka | |
| Train | Fix Released | Medium | Michal Nasiadka | |

Bug Description

After a successful deployment of stable/ocata in a multinode topology, I enabled:

[identity]
domain_specific_drivers_enabled = True
in keystone.conf (to facilitate LDAP auth for one of our domains).

After this change, reconfigure fails when trying to create magnum_trustee_domain_admin in the magnum domain:

fatal: [ucsubuxxxx.domain.name]: FAILED! => {"attempts": 10, "changed": false, "extra_data": null, "failed": true, "msg": "Error in creating user magnum_trustee_domain_******** (Inner Exception: Conflict occurred attempting to store user - Duplicate entry found with name magnum_trustee_domain_******** at domain ID 36c1697e156f403e9afb9d4606a95d93. (HTTP 409) (Request-ID: req-17b4251f-85ab-4a03-bdeb-edd1fe9784b1))"}

Changed in kolla:
status: New → Confirmed
description: updated
Eduardo Gonzalez (egonzalez90) wrote :

Hi, could you share logs from a run in -vvv verbose mode, so we can see the output and the parameters provided?
This looks like a bug in the os_user module or python-shade not being idempotent when the user/domain already exists.

Regards

affects: kolla → kolla-ansible
GrzegorzKoper (grzegorz-koper) wrote :

That was exactly the problem. The funny thing is that when you delete the user and let Ansible create it, reconfigure fails at the next step (adding the admin role for the created user).
So either the os_user and os_user_role modules are both not idempotent or, more likely, there is something wrong with shade!
I've done some testing with a little Ansible code of my own, and it looks like both modules have problems when listing users in the magnum domain. Running the same thing using API calls or the CLI works like a charm.
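
For comparison, a direct API check might look like the following minimal Python sketch. It assumes the Keystone v3 `GET /v3/users` call with its `name` and `domain_id` query filters; the helper names, the example URL, and the token handling are hypothetical and not taken from the tests mentioned above.

```python
import json
import urllib.parse
import urllib.request


def build_user_query_url(keystone_url, name, domain_id):
    """Build a Keystone v3 user-list URL filtered by name and domain."""
    query = urllib.parse.urlencode({"name": name, "domain_id": domain_id})
    return f"{keystone_url.rstrip('/')}/v3/users?{query}"


def find_user(keystone_url, token, name, domain_id):
    """Return the first matching user dict, or None if no user matches.

    `token` is assumed to be a valid admin-scoped Keystone token.
    """
    request = urllib.request.Request(
        build_user_query_url(keystone_url, name, domain_id),
        headers={"X-Auth-Token": token},
    )
    with urllib.request.urlopen(request) as response:
        users = json.load(response)["users"]
    return users[0] if users else None
```

If `find_user` returns the trustee user for the magnum domain ID while the Ansible module reports that it does not exist, the lookup inside the module is the likely culprit rather than Keystone itself.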

GrzegorzKoper (grzegorz-koper) wrote :

Added verbose logs from 2 reconfigure runs (1 when the user exists and 1 when it does not).

yuqian (roger-yu)
Changed in kolla-ansible:
assignee: nobody → yuqian (roger-yu)
yuqian (roger-yu) wrote :

This parameter in keystone.conf is enabled:
```
[identity]
domain_specific_drivers_enabled = True
```
**Appearance:**
1> `openstack user list` displays only the users in the
default domain, not the users of all domains.

**Reason:**
1> In kolla-ansible/ansible/roles/magnum/tasks/register.yml,
the task "Creating Magnum trustee user" uses the Ansible module
os_user to manage the user;

2> In ansible/lib/ansible/modules/cloud/openstack/os_user.py:
```
try:
    # The user is looked up by name only; no domain is passed here.
    user = cloud.get_user(name)
    domain_id = None
    if domain:
        domain_id = _get_domain_id(cloud, domain)
    if state == 'present':
        ...
        if user is None:
            # <...to create user...>
```
**The `cloud.get_user(name)` call cannot find any magnum user, so
the module goes on to create the user, resulting in the 409 error.**

**Solution:**
I will contact the kolla PTL to confirm whether the fix should
be raised as an issue against Ansible or made directly in
kolla-ansible.
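
To illustrate the failure mode described above, here is a minimal, self-contained Python sketch. `FakeCloud`, `Conflict`, and both `ensure_user_*` helpers are hypothetical stand-ins written for this example; they are not the shade or os_user code. The sketch assumes that, with domain-specific drivers enabled, a lookup without a domain only sees the default domain.

```python
class Conflict(Exception):
    """Stands in for Keystone's HTTP 409 response."""


class FakeCloud:
    """Hypothetical in-memory stand-in for the cloud client."""

    def __init__(self):
        self.users = {}  # (domain_id, name) -> user dict

    def get_user(self, name, domain_id=None):
        # Assumption: an unscoped lookup only searches the default domain.
        scope = domain_id if domain_id is not None else "default"
        return self.users.get((scope, name))

    def create_user(self, name, domain_id):
        if (domain_id, name) in self.users:
            raise Conflict(f"Duplicate entry found with name {name}")
        user = {"name": name, "domain_id": domain_id}
        self.users[(domain_id, name)] = user
        return user


def ensure_user_unscoped(cloud, name, domain_id):
    """Mirrors the excerpt: look up by name only, then create if missing."""
    user = cloud.get_user(name)
    if user is None:
        user = cloud.create_user(name, domain_id)
    return user


def ensure_user_scoped(cloud, name, domain_id):
    """Looks up within the target domain, so a second run finds the user."""
    user = cloud.get_user(name, domain_id=domain_id)
    if user is None:
        user = cloud.create_user(name, domain_id)
    return user
```

Calling `ensure_user_unscoped` twice for a user in the magnum domain raises `Conflict` on the second call, which matches the reconfigure failure; `ensure_user_scoped` returns the existing user instead.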

Eduardo Gonzalez (egonzalez90) wrote :

Hi, any update on this issue?

Changed in kolla-ansible:
status: Confirmed → Incomplete
Matt Faraday (lordxenu) wrote :

I am also hitting this issue, on Queens. I notice it's marked as Incomplete, with more info needed.
Is there anything I can add, or any info required? My issue is identical to the original bug reporter's: a Kolla deployment with domain-specific drivers and one domain using LDAP. We now cannot redeploy because of this error. We have to disable Magnum, which is currently preventing us from deploying Kolla OpenStack in production, as we need container services.

The other problem I have is that it fails to create a template or a cluster with no helpful errors, but that's another bug.

Please let me know if I can help by providing any logs, output, experiments, etc. Since I can't release this to production, I'm free to experiment with it. Thanks.

Christian Zunker (christian-zunker) wrote :

I had a similar problem, using openstack-ansible for Rocky.
Ansible already has a bug report for this: https://github.com/ansible/ansible/issues/42901

Florian Faltermeier (florianfa) wrote :

Hello,

one workaround that works for me:

Kolla:

Edit docker/kolla-toolbox/Dockerfile.j2 and change the ansible version from ansible==2.2.0.0 to ansible==2.6.13.0

Additionally, add no_log=True to the password argument in the kolla_keystone_user.py and kolla_sanity.py files:

kolla_keystone_user.py: password=dict(required=True, type='str', no_log=True),
kolla_sanity.py: password=dict(required=True, type='str', no_log=True),

Cleanup old kolla-toolbox images and containers from Openstack hosts.

Rebuild the kolla-toolbox container
Rerun kolla-ansible deployment

Note:
https://docs.ansible.com/ansible/2.4/release_and_maintenance.html (Ansible 2.2 receives only security fixes, not bug fixes)
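
The no_log change above amounts to one extra key on the password entry of each module's argument spec. As a minimal sketch, the Python below compares a spec before and after the change. The `name` argument and the `password_args_missing_no_log` helper are hypothetical and only for illustration; in a real module the dict would be passed to `AnsibleModule(argument_spec=...)`.

```python
def password_args_missing_no_log(argument_spec):
    """Return the names of password-like arguments that do not set no_log=True."""
    return sorted(
        name
        for name, options in argument_spec.items()
        if "password" in name and not options.get("no_log", False)
    )


# Argument spec as it might look before the change (illustrative only).
spec_before = dict(
    name=dict(required=True, type='str'),
    password=dict(required=True, type='str'),
)

# Argument spec with the change applied to the password entry.
spec_after = dict(
    name=dict(required=True, type='str'),
    password=dict(required=True, type='str', no_log=True),
)
```

Running the helper on `spec_before` flags the `password` argument, while `spec_after` yields an empty list.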

Florian Faltermeier (florianfa) wrote :

Can someone please verify it?

Scott Beck (scottbeck) wrote :

This exact issue appeared for me today using ubuntu source stein with kolla from git master. Running deploy with multi domain support enabled causes magnum to fail trying to create magnum_trustee_domain_admin.

One suggested workaround is to deploy in single mode and reconfigure with multi mode. This did not work for me; the error on reconfigure was:

 FAILED! => {"changed": false, "extra_data": null, "msg": "Error in creating user magnum_trustee_domain_admin. (409) Client Error for url: http://10.1.0.100:35357/v3/users Conflict occurred attempting to store user - Duplicate entry found with name magnum_trustee_domain_admin at domain ID 1069bcfb7bfa469d91c9d2def6ee9cad."}

Mark Goddard (mgoddard) wrote :

Scott, do you have a fix for this?

I would guess that we need some additional arguments to the 'Creating Magnum trustee domain' task in ansible/roles/magnum/tasks/register.yml. Perhaps specify the region as {{ openstack_region_name }}? Module docs are here: https://docs.ansible.com/ansible/latest/modules/os_keystone_domain_module.html.

Joseph M (noxoid) wrote :

Confirmed what Florian found, it seems to be an Ansible version issue. It may have been related to Ansible migrating from shade to openstacksdk around that time. Luckily for us this was accidentally resolved via the following commit:

https://opendev.org/openstack/kolla/commit/e3108d93c69dbad0c575da719f842d36c195a673

Those still running stable/stein can pull in the fixed master toolbox image by adding the following to globals.yml:

kolla_toolbox_tag: master

Magnum deploys fine for me using stable/stein kolla-ansible after the above one-line change. I would say this can be closed as resolved in master, unless the devs want to backport it to stable, as Magnum is definitely broken when domain_specific_drivers_enabled is true.

Mark Goddard (mgoddard)
Changed in kolla-ansible:
status: Incomplete → Confirmed
importance: Undecided → High
importance: High → Medium
Andrei Nistor (codertux) wrote :

I can also confirm that kolla_toolbox from master fixes the issue.

Mark Goddard (mgoddard) wrote :

As stated by Joseph M, and confirmed in IRC by andrein, this issue appears to have been fixed on master by updating packages in kolla-toolbox, commit https://opendev.org/openstack/kolla/commit/e3108d93c69dbad0c575da719f842d36c195a673. It's not clear exactly which part of that change fixed the issue.

Mark Goddard (mgoddard)
Changed in kolla-ansible:
status: Fix Committed → Fix Released
Rowan Potgieter (rowan-potgieter) wrote :

Hi All

I am struggling with this same issue on kolla stable/stein. I have pulled the latest stable/stein from kolla-ansible and have also built the source containers locally.

# Container Details
Kolla containers built from stable/stein
   * SHA: 4b77682911616a6cfac96df3695f0d7896f89b50
   * base: ubuntu
   * type: source

# Error

  fatal: [dc1-controller-02]: FAILED! => {
      "changed": false,
      "extra_data": null,
      "invocation": {
          "module_args": {
              "api_version": "auto",
              "module_args": {
                  "auth": {
                      "auth_url": "http://192.168.248.20:35357",
                      "domain_name": "default",
                      "password": "....",
                      "project_name": "admin",
                      "user_domain_name": "default",
                      "username": "admin"
                  },
                  "domain": "magnum",
                  "endpoint_type": "admin",
                  "name": "magnum_trustee_domain_admin",
                  "password": "...."
              },
              "module_extra_vars": null,
              "module_name": "os_user",
              "timeout": 180
          }
      },
      "msg": "Error in creating user magnum_trustee_domain_admin. (409) Client Error for url: http://192.168.248.20:35357/v3/users Conflict occurred attempting to store user - Duplicate entry found with name magnum_trustee_domain_admin at domain ID 722e2456edad46efb51b5e6a108f8c36."
  }

I have tried using the latest master version of the kolla-toolbox container but this has not helped at all.

I also tried handling the 409 error but then the deploy fails on the "Creating Magnum trustee user role" step.

Is there anything else I can try? I've been stuck mid-deploy for a bit too long now.

Rowan Potgieter (rowan-potgieter) wrote :

Apologies - seems like I had missed the suggestion of "Cleanup old kolla-toolbox images and containers from Openstack hosts."

After removing the kolla_toolbox and pulling master again I managed to get past the magnum deploy.

Rowan Potgieter (rowan-potgieter) wrote :

Ok I'm back on this issue - it turns out using the kolla-toolbox from `master` is not a good idea if you are on the stein release. This is because the kolla-toolbox in master no longer uses custom ansible modules for creating keystone users.

I have opened a bug which documents all the issues I hit when using the master container version:
https://bugs.launchpad.net/kolla-ansible/+bug/1884529

The good news is that the changes from Florian do resolve the issue, and I would suggest they are merged into stable/stein.

Mark Goddard (mgoddard) wrote :

I'm afraid we can't really bump the ansible version in the Stein kolla-toolbox image. If that works for you, I would suggest using the train image (if it works, although mixing versions is not really supported), or building your own image locally with an ansible version bump.
