Bug #1846789 “Creating the Keystone service and endpoint Failed:...” (kolla-ansible)

Revision history for this message

Radosław Piliszek (yoctozepto) wrote on 2019-10-04:

#1

Please attach (from the controller nodes):
* full keystone logs (those under /var/log/kolla/keystone)
* docker logs for the keystone container
* /var/log/kolla/ansible.log
(you can archive that together for a cleaner upload)

And also:
* globals.yml

ArgsAlreadyParsedError is a weird error to get from that.

Changed in kolla:
status:	New → Incomplete

Larry Lile (llile) wrote on 2019-10-04:

#2

1846789.tar.gz (2.8 MiB, application/x-tar)

The requested logs have been provided, with slight redactions to domain names.

md5sum cabcf4456e6077ed20035e288abcb226 1846789.tar.gz
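
The collect-archive-checksum routine requested above can be sketched in Python. This is a minimal illustration, not a kolla-ansible tool: the helper names `bundle_logs` and `md5_of` are hypothetical, and the paths passed in would be the ones listed in comment #1.

```python
import hashlib
import tarfile
from pathlib import Path


def bundle_logs(paths, archive_path):
    """Pack the given files/directories into a gzip-compressed tarball."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for p in paths:
            p = Path(p)
            if p.exists():  # silently skip anything missing on this node
                tar.add(p, arcname=p.name)
    return archive_path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `bundle_logs(["/var/log/kolla/keystone", "/var/log/kolla/ansible.log"], "1846789.tar.gz")` followed by `md5_of("1846789.tar.gz")` would give a digest comparable to the `md5sum` line above.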

Larry Lile (llile) wrote on 2019-10-04:

#3

Just to add some data points here for reference:

1. Train using stable/train exhibits this problem

2. Stein using stable/stein exhibits this problem

3. Rocky using stable/rocky works as expected
   [control] on network 1
   [monitoring] on network 1
   [compute,network] on network 2
   [haproxy:children]: control (on network 1)

The scenario described in 3 is our target: control and haproxy are co-located on the same hosts on network 1, monitoring is also on network 1, and compute, network, and storage are on network 2.

The reasoning behind this split is that control, haproxy, and monitoring are in a protected data center, while compute, network, and storage are in a utility data center without power protection.

Radosław Piliszek (yoctozepto) on 2019-10-05

Changed in kolla:
status:	Incomplete → New

Radosław Piliszek (yoctozepto) wrote on 2019-10-05:

#4

Was there no ansible.log in the logs dir?

Train is on master. stable/train is not there yet.

Are you using kolla-ansible from branch stable/stein when deploying those stein images?

Also, your def of haproxy:children differs between report (network) and comment (control) - that changes behavior.

Changed in kolla:
status:	New → Incomplete

Larry Lile (llile) wrote on 2019-10-07:

#5

Hi,

Yes, to clarify: on Train we are using master, and on Rocky we are using stable/rocky. Train and Stein exhibit the same behavior; Rocky works successfully.

With regard to the logs and current bug report:

Yes, we are using kolla-ansible stable/stein deploying the official stein images.

Yes, currently we are deploying haproxy:children to (network) not (control).

Once we get past the keystone issue we intend to change haproxy:children over to (control) but that seems to cause a different failure. However, I assumed it would be easier to figure out the keystone issue if I simplified the configuration.

Sorry for any confusion.

Larry Lile (llile) wrote on 2019-10-07:

#6

md5sum 1846789_1.tar.gz 8cfa8fea77f77c95d1ceeab4e13d51c2 (1.6 KiB, application/x-tar)

The ansible.log files are attached as requested.

Radosław Piliszek (yoctozepto) on 2019-10-10

Changed in kolla:
status:	Incomplete → New

Radosław Piliszek (yoctozepto) on 2019-10-10

affects:	kolla → kolla-ansible
Changed in kolla-ansible:
importance:	Undecided → High

Radosław Piliszek (yoctozepto) wrote on 2019-10-10:

#7

I'm puzzled. Could you also include the generated configs on the deployed nodes? The /etc/kolla stuff.

Changed in kolla-ansible:
status:	New → Incomplete

Larry Lile (llile) wrote on 2019-10-10:

#8

md5sum 1846789_2.tar.gz 7d42283787ed94df9425c37a29566d2c (4.1 MiB, application/x-tar)

The cluster has been redeployed since my last update, however the same issue persists.

I have gathered all deployment and configuration details along with logs for all subsystems.

Please let me know what else I can do to assist with debugging this issue.

Larry Lile (llile) wrote on 2019-10-10:

#9

With regard to the data uploaded in 1846789_2.tar.gz, this is with haproxy running on the control nodes.

Larry Lile (llile) wrote on 2019-10-11:

#10

md5sum 1846789_3.tar.gz 3cfc70f0566f8a531c926afd82497b4f (306.8 KiB, application/x-tar)

The logs attached in 1846789_3.tar.gz show the same failure, but in this iteration I have haproxy running on network. So it appears that the location of haproxy does not affect the issue.

Larry Lile (llile) wrote on 2019-10-14:

#11

I don't know what has changed but I just had a successful deployment with kolla-ansible 8.0.2.dev24 using the stein docker images. Mon Oct 14 17:03:22 EDT 2019

Mark Goddard (mgoddard) wrote on 2019-10-15:

#12

Is it working reliably for you now?

Larry Lile (llile) wrote on 2019-10-15:

#13

I tried moving haproxy to control and it failed. I am retrying with haproxy on network to see if I can reproduce the original result.

Larry Lile (llile) wrote on 2019-10-15:

#14

I was able to re-deploy with haproxy on the network nodes. I did hit a heat versioning API failure during bootstrap, which seems to be an intermittent problem. Performing an additional deploy got past the issue.

TASK [heat : Running Heat bootstrap container] ***********************************************************************************************************************************
fatal: [odt-rsd-ost-ctr-a1.example.com -> odt-rsd-ost-ctr-a1.example.com]: FAILED! => {"changed": true, "msg": "Container exited with non-zero return code 1", "rc": 1, "stderr": "+ sudo -E kolla_set_configs\nINFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json\nINFO:__main__:Validating config file\nINFO:__main__:Kolla config strategy set to: COPY_ALWAYS\nINFO:__main__:Copying service configuration files\nINFO:__main__:Deleting /etc/heat/heat.conf\nINFO:__main__:Copying /var/lib/kolla/config_files/heat.conf to /etc/heat/heat.conf\nINFO:__main__:Setting permission for /etc/heat/heat.conf\nINFO:__main__:Writing out command to execute\n++ cat /run_command\n+ CMD=heat-api\n+ ARGS=\n+ [[ ! -n '' ]]\n+ . kolla_extend_start\n++ [[ ! -d /var/log/kolla/heat ]]\n++ mkdir -p /var/log/kolla/heat\n+++ stat -c %a /var/log/kolla/heat\n++ [[ 2755 != \\7\\5\\5 ]]\n++ chmod 755 /var/log/kolla/heat\n+++ whoami\n++ [[ heat == \\r\\o\\o\\t ]]\n++ . /usr/local/bin/kolla_heat_extend_start\n+++ [[ -n 0 ]]\n+++ heat-manage db_sync\n++++ openstack domain list\n++++ awk '{print $4}'\n++++ grep heat\nFailed to contact the endpoint at http://10.239.195.200:5000 for discovery. 
Fallback to using that endpoint as the base url.\nNot Found (HTTP 404) (Request-ID: req-1e9bea57-85a9-4a84-ac4c-b1d17dde6c2e)\n+++ CURRENT_HEAT_DOMAIN_NAME=\n+++ [[ heat_user_domain != '' ]]\n+++ openstack domain create heat_user_domain\n+++ openstack user create --domain heat_user_domain heat_domain_admin --password Op3nSt@ck\nget() takes exactly 1 argument (2 given)\n", "stderr_lines": ["+ sudo -E kolla_set_configs", "INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json", "INFO:__main__:Validating config file", "INFO:__main__:Kolla config strategy set to: COPY_ALWAYS", "INFO:__main__:Copying service configuration files", "INFO:__main__:Deleting /etc/heat/heat.conf", "INFO:__main__:Copying /var/lib/kolla/config_files/heat.conf to /etc/heat/heat.conf", "INFO:__main__:Setting permission for /etc/heat/heat.conf", "INFO:__main__:Writing out command to execute", "++ cat /run_command", "+ CMD=heat-api", "+ ARGS=", "+ [[ ! -n '' ]]", "+ . kolla_extend_start", "++ [[ ! -d /var/log/kolla/heat ]]", "++ mkdir -p /var/log/kolla/heat", "+++ stat -c %a /var/log/kolla/heat", "++ [[ 2755 != \\7\\5\\5 ]]", "++ chmod 755 /var/log/kolla/heat", "+++ whoami", "++ [[ heat == \\r\\o\\o\\t ]]", "++ . /usr/local/bin/kolla_heat_extend_start", "+++ [[ -n 0 ]]", "+++ heat-manage db_sync", "++++ openstack domain list", "++++ awk '{print $4}'", "++++ grep heat", "Failed to contact the endpoint at http://10.239.195.200:5000 for discovery. Fallback to using that endpoint as the base url.", "Not Found (HTTP 404) (Request-ID: req-1e9bea57-85a9-4a84-ac4c-b1d17dde6c2e)", "+++ CURRENT_HEAT_DOMAIN_NAME=", "+++ [[ heat_user_domain != '' ]]", "+++ openstack domain create heat_user_domain", "+++ openstack user create --domain heat_user_domain heat_domain_admin --password Op3nSt@ck", "get() takes exactly 1 argument (2 given)"], "stdout": "2019-10-15 10:02:27.641 16 INFO migrate.versioning.api [-] 70 -> 71... 
\u001b[00m\n2019-10-15 10:02:27.920 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:27.921 16 INFO migrate.versioning.api [-] 71 -> 72... \u001b[00m\n2019-10-15 10:02:27.974 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:27.975 16 INFO migrate.versioning.api [-] 72 -> 73... \u001b[00m\n2019-10-15 10:02:28.042 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.042 16 INFO migrate.versioning.api [-] 73 -> 74... \u001b[00m\n2019-10-15 10:02:28.053 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.053 16 INFO migrate.versioning.api [-] 74 -> 75... \u001b[00m\n2019-10-15 10:02:28.064 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.065 16 INFO migrate.versioning.api [-] 75 -> 76... \u001b[00m\n2019-10-15 10:02:28.076 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.076 16 INFO migrate.versioning.api [-] 76 -> 77... \u001b[00m\n2019-10-15 10:02:28.087 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.087 16 INFO migrate.versioning.api [-] 77 -> 78... \u001b[00m\n2019-10-15 10:02:28.108 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.108 16 INFO migrate.versioning.api [-] 78 -> 79... \u001b[00m\n2019-10-15 10:02:28.224 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.224 16 INFO migrate.versioning.api [-] 79 -> 80... \u001b[00m\n2019-10-15 10:02:28.299 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.299 16 INFO migrate.versioning.api [-] 80 -> 81... \u001b[00m\n2019-10-15 10:02:28.310 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.311 16 INFO migrate.versioning.api [-] 81 -> 82... \u001b[00m\n2019-10-15 10:02:28.322 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.322 16 INFO migrate.versioning.api [-] 82 -> 83... 
\u001b[00m\n2019-10-15 10:02:28.335 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.335 16 INFO migrate.versioning.api [-] 83 -> 84... \u001b[00m\n2019-10-15 10:02:28.346 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.346 16 INFO migrate.versioning.api [-] 84 -> 85... \u001b[00m\n2019-10-15 10:02:28.367 16 INFO migrate.versioning.api [-] done\u001b[00m\n2019-10-15 10:02:28.367 16 INFO migrate.versioning.api [-] 85 -> 86... \u001b[00m\n2019-10-15 10:02:28.456 16 INFO migrate.versioning.api [-] done\u001b[00m\n+-------------+----------------------------------+\n| Field       | Value                            |\n+-------------+----------------------------------+\n| description |                                  |\n| enabled     | True                             |\n| id          | da687ea738f04a8480ccd1a86848b3d9 |\n| name        | heat_user_domain                 |\n| tags        | []                               |\n+-------------+----------------------------------+\n", "stdout_lines": ["2019-10-15 10:02:27.641 16 INFO migrate.versioning.api [-] 70 -> 71... \u001b[00m", "2019-10-15 10:02:27.920 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:27.921 16 INFO migrate.versioning.api [-] 71 -> 72... \u001b[00m", "2019-10-15 10:02:27.974 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:27.975 16 INFO migrate.versioning.api [-] 72 -> 73... \u001b[00m", "2019-10-15 10:02:28.042 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.042 16 INFO migrate.versioning.api [-] 73 -> 74... \u001b[00m", "2019-10-15 10:02:28.053 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.053 16 INFO migrate.versioning.api [-] 74 -> 75... \u001b[00m", "2019-10-15 10:02:28.064 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.065 16 INFO migrate.versioning.api [-] 75 -> 76... 
\u001b[00m", "2019-10-15 10:02:28.076 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.076 16 INFO migrate.versioning.api [-] 76 -> 77... \u001b[00m", "2019-10-15 10:02:28.087 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.087 16 INFO migrate.versioning.api [-] 77 -> 78... \u001b[00m", "2019-10-15 10:02:28.108 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.108 16 INFO migrate.versioning.api [-] 78 -> 79... \u001b[00m", "2019-10-15 10:02:28.224 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.224 16 INFO migrate.versioning.api [-] 79 -> 80... \u001b[00m", "2019-10-15 10:02:28.299 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.299 16 INFO migrate.versioning.api [-] 80 -> 81... \u001b[00m", "2019-10-15 10:02:28.310 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.311 16 INFO migrate.versioning.api [-] 81 -> 82... \u001b[00m", "2019-10-15 10:02:28.322 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.322 16 INFO migrate.versioning.api [-] 82 -> 83... \u001b[00m", "2019-10-15 10:02:28.335 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.335 16 INFO migrate.versioning.api [-] 83 -> 84... \u001b[00m", "2019-10-15 10:02:28.346 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.346 16 INFO migrate.versioning.api [-] 84 -> 85... \u001b[00m", "2019-10-15 10:02:28.367 16 INFO migrate.versioning.api [-] done\u001b[00m", "2019-10-15 10:02:28.367 16 INFO migrate.versioning.api [-] 85 -> 86... 
\u001b[00m", "2019-10-15 10:02:28.456 16 INFO migrate.versioning.api [-] done\u001b[00m", "+-------------+----------------------------------+", "| Field       | Value                            |", "+-------------+----------------------------------+", "| description |                                  |", "| enabled     | True                             |", "| id          | da687ea738f04a8480ccd1a86848b3d9 |", "| name        | heat_user_domain                 |", "| tags        | []                               |", "+-------------+----------------------------------+"]}

Larry Lile (llile) wrote on 2019-10-17:

#15

This is currently working on Stein.

However, I have tried this on master (Train) with kolla-ansible 8.1.0.dev589 and it is still broken, though less so. The kolla-ansible deploy completes successfully, but horizon or keystone seems to be broken. Attempting to log in through the horizon UI as admin results in "Unable to retrieve authorized projects."

Does that give us any insight into what fixed Stein?

The keystone errors on master seem to be the same:

2019-10-14 11:13:24.616538 mod_wsgi (pid=24): Target WSGI script '/usr/bin/keystone-wsgi-public' cannot be loaded as Python module.
2019-10-14 11:13:24.616648 mod_wsgi (pid=24): Exception occurred processing WSGI script '/usr/bin/keystone-wsgi-public'.
2019-10-14 11:13:24.616710 Traceback (most recent call last):
2019-10-14 11:13:24.616766   File "/usr/bin/keystone-wsgi-public", line 52, in <module>
2019-10-14 11:13:24.617004     application = initialize_public_application()
2019-10-14 11:13:24.617040   File "/usr/lib/python2.7/site-packages/keystone/server/wsgi.py", line 24, in initialize_public_application
2019-10-14 11:13:24.617102     name='public', config_files=flask_core._get_config_files())
2019-10-14 11:13:24.617133   File "/usr/lib/python2.7/site-packages/keystone/server/flask/core.py", line 151, in initialize_application
2019-10-14 11:13:24.617239     keystone.server.configure(config_files=config_files)
2019-10-14 11:13:24.617274   File "/usr/lib/python2.7/site-packages/keystone/server/__init__.py", line 28, in configure
2019-10-14 11:13:24.617368     keystone.conf.configure()
2019-10-14 11:13:24.617406   File "/usr/lib/python2.7/site-packages/keystone/conf/__init__.py", line 139, in configure
2019-10-14 11:13:24.617470     deprecated_since=versionutils.deprecated.STEIN))
2019-10-14 11:13:24.617500   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2045, in __inner
2019-10-14 11:13:24.617557     result = f(self, *args, **kwargs)
2019-10-14 11:13:24.617587   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2323, in register_cli_opt
2019-10-14 11:13:24.617667     raise ArgsAlreadyParsedError("cannot register CLI option")
2019-10-14 11:13:24.617735 ArgsAlreadyParsedError: arguments already parsed: cannot register CLI option

Larry Lile (llile) wrote on 2019-10-18:

#16

My string of successful deployments seems to have ended abruptly. This is failing again.

Larry Lile (llile) wrote on 2019-10-22:

#17

I'm still seeing frequent failures, with the occasional successful deployment. However, even when the deploy appears to be successful, keystone is still producing the error. Any ideas?

2019-10-22 10:51:13.797434 mod_wsgi (pid=18): Target WSGI script '/usr/bin/keystone-wsgi-public' cannot be loaded as Python module.
2019-10-22 10:51:13.797531 mod_wsgi (pid=18): Exception occurred processing WSGI script '/usr/bin/keystone-wsgi-public'.
2019-10-22 10:51:13.797596 Traceback (most recent call last):
2019-10-22 10:51:13.797652   File "/usr/bin/keystone-wsgi-public", line 52, in <module>
2019-10-22 10:51:13.797741     application = initialize_public_application()
2019-10-22 10:51:13.797776   File "/usr/lib/python2.7/site-packages/keystone/server/wsgi.py", line 24, in initialize_public_application
2019-10-22 10:51:13.797840     name='public', config_files=flask_core._get_config_files())
2019-10-22 10:51:13.797873   File "/usr/lib/python2.7/site-packages/keystone/server/flask/core.py", line 151, in initialize_application
2019-10-22 10:51:13.797933     keystone.server.configure(config_files=config_files)
2019-10-22 10:51:13.797966   File "/usr/lib/python2.7/site-packages/keystone/server/__init__.py", line 28, in configure
2019-10-22 10:51:13.798024     keystone.conf.configure()
2019-10-22 10:51:13.798057   File "/usr/lib/python2.7/site-packages/keystone/conf/__init__.py", line 139, in configure
2019-10-22 10:51:13.798174     deprecated_since=versionutils.deprecated.STEIN))
2019-10-22 10:51:13.798273   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2045, in __inner
2019-10-22 10:51:13.798344     result = f(self, *args, **kwargs)
2019-10-22 10:51:13.798377   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2323, in register_cli_opt
2019-10-22 10:51:13.798434     raise ArgsAlreadyParsedError("cannot register CLI option")
2019-10-22 10:51:13.798490 ArgsAlreadyParsedError: arguments already parsed: cannot register CLI option
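
To check how often a given exception shows up in a keystone Apache error log like the one above, the final line of each traceback can be tallied. The following is a rough sketch that assumes the timestamp prefix seen in this thread; `count_exceptions` is a hypothetical helper, not part of keystone or kolla.

```python
import re
from collections import Counter

# Error-log lines in this thread start with "YYYY-MM-DD HH:MM:SS.ffffff ".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ ")
# Final traceback line: optionally dotted exception class, a colon, a message.
EXCEPTION = re.compile(r"^(?P<name>[A-Za-z_][\w.]*(?:Error|Exception)): (?P<msg>.*)$")


def count_exceptions(lines):
    """Tally exception class names found on final traceback lines."""
    counts = Counter()
    for line in lines:
        body = TIMESTAMP.sub("", line.rstrip("\n"))
        match = EXCEPTION.match(body)
        if match:
            # Drop any module path, e.g. "oslo_config.cfg.".
            counts[match.group("name").rsplit(".", 1)[-1]] += 1
    return counts
```

Passing an open log file to `count_exceptions` gives a quick view of whether `ArgsAlreadyParsedError` is the only exception logged or whether other errors appear alongside it.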

Mark Goddard (mgoddard) wrote on 2019-10-23:

#18

That suggests that this error might not be the problematic one.

Klemen Pogacnik (kemopq) wrote on 2019-12-02:

#19

I'm working on the Rocky release and have the same problem. It happens on every deploy on a three-node cluster. I haven't noticed this problem on single-node deployments.
After deploying keystone, its endpoint (port 5000) is quite unstable. It works five or six times and then fails with "500 Internal Server Error". Then it works again for some time and fails again. When the Internal Server Error is returned, the keystone error logs are the same as Larry's:

mod_wsgi (pid=20): Target WSGI script '/var/lib/kolla/venv/bin/keystone-wsgi-admin' cannot be loaded as Python module.
mod_wsgi (pid=20): Exception occurred processing WSGI script '/var/lib/kolla/venv/bin/keystone-wsgi-admin'.

I checked the keystone container on each node and there is no pattern: sometimes only one has this problem, sometimes two.
The cloud deployment then fails when some other module needs the keystone endpoint.

I found that restarting the keystone containers and redeploying the cloud resolves this situation.

One more note: I didn't have these problems with the Rocky version from Nov 7th, and my colleagues and I deployed single and multicloud many times.
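
The intermittent behavior described above (several good responses, then a 500) can be measured by polling the endpoint and tallying status codes. This is a minimal sketch using only the standard library; `http_status` and `probe` are hypothetical names, and the URL is whatever address serves keystone on port 5000 in the deployment.

```python
import urllib.error
import urllib.request
from collections import Counter


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET on url.

    Connection-level failures (urllib.error.URLError) are not caught here
    and propagate to the caller.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # 4xx/5xx responses arrive as HTTPError


def probe(url, attempts=20, fetch=http_status):
    """Call fetch(url) repeatedly and tally the status codes seen."""
    return Counter(fetch(url) for _ in range(attempts))
```

A mix of 200 and 500 entries in the result would match the symptom reported here. Since requests are spread across the keystone backends, the tally does not show which node is failing; that still requires checking each container's logs.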

Michal Nasiadka (mnasiadka) wrote on 2019-12-12:

#20

Probably found the cause - it seems we have some race-ish condition, and sometimes public keystone wsgi comes up without fernet keys in /etc/keystone/fernet-keys/ - visible in this logfile:
https://88763c12f13d1aeca43c-63681721353a54dab1064b012b97b3cb.ssl.cf1.rackcdn.com/698452/2/check/kolla-ansible-ubuntu-source-ceph/09da3d0/primary/logs/kolla/keystone/keystone-apache-public-error.txt.gz

It then keeps failing to come up properly.

Working on patches in Kolla (fix fernet bootstrap so it properly sets changed flag) and Kolla-Ansible (restart keystone containers after fernet bootstrap or a similar solution).
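
Whether a node is in the state described here can be checked by looking for key files in the fernet key directory. The sketch below is not the proposed patch. It assumes keystone's convention of integer-named key files, and `fernet_key_indices` and `keys_ready` are hypothetical helpers.

```python
from pathlib import Path


def fernet_key_indices(key_dir="/etc/keystone/fernet-keys"):
    """Return the sorted numeric names of non-empty key files in key_dir."""
    path = Path(key_dir)
    if not path.is_dir():
        return []
    return sorted(
        int(entry.name)
        for entry in path.iterdir()
        # Assumption: key files are named by integer index (0, 1, 2, ...).
        if entry.is_file() and entry.name.isdecimal() and entry.stat().st_size > 0
    )


def keys_ready(key_dir="/etc/keystone/fernet-keys"):
    """True when at least one key file is present in key_dir."""
    return bool(fernet_key_indices(key_dir))
```

Running `keys_ready()` inside each keystone container at the time the WSGI process starts would show whether that node came up before the keys were pushed.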

Changed in kolla-ansible:
status:	Incomplete → Triaged
assignee:	nobody → Michal Nasiadka (mnasiadka)

OpenStack Infra (hudson-openstack) wrote on 2019-12-12: Fix proposed to kolla-ansible (master)

#21

Fix proposed to branch: master
Review: https://review.opendev.org/698710

Changed in kolla-ansible:
status:	Triaged → In Progress

Larry Lile (llile) wrote on 2020-01-24:

#22

Hi Michal,

I have been tracking the updates to change 698710 on OpenDev. We are using one of the intermediate patches you supplied earlier with great success.

I wanted to thank you, and the reviewers, for all your hard work on this issue so far!

I'm guessing your final patch will address not only this presentation of the problem but also many others that have yet to be reported or identified.

Thank you!

OpenStack Infra (hudson-openstack) wrote on 2020-02-05: Fix merged to kolla-ansible (master)

#23

Reviewed: https://review.opendev.org/698710
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=0799782ce83d1057f262b44c979a15f9a1b05c72
Submitter: Zuul
Branch: master

commit 0799782ce83d1057f262b44c979a15f9a1b05c72
Author: Michal Nasiadka <email address hidden>
Date: Thu Dec 12 13:19:48 2019 +0100

    Fix keystone fernet bootstrap

    There are cases when a multinode deployment ends up in unusable
    keystone public wsgi on some nodes.

    The root cause is that keystone public wsgi doesn't find fernet
    keys on startup - and then persists on sending 500 errors to any
    requests - due to a race condition between
    fernet_setup/fernet-push.sh and keystone startup.

    Depends-On: https://review.opendev.org/703742/
    Change-Id: I63709c2e3f6a893db82a05640da78f492bf8440f
    Closes-Bug: #1846789

Changed in kolla-ansible:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2020-02-05: Fix proposed to kolla-ansible (stable/train)

#24

Fix proposed to branch: stable/train
Review: https://review.opendev.org/705949

Doug Szumski (dszumski) wrote on 2020-03-11:

#25

I believe I have just hit this on Train also.

Radosław Piliszek (yoctozepto) wrote on 2020-03-11:

#26

Very likely, this is Train-affecting.

Radosław Piliszek (yoctozepto) wrote on 2020-05-11:

#27

Neither Train nor Stein had any related fixes. The fix by Michał is awaiting followup and backporting.

Doug Szumski (dszumski) wrote on 2020-06-29:

#28

I'm seeing this on master in CI too now:

https://zuul.opendev.org/t/openstack/build/78feeaaa93014e1ab7ef8b7fafa70eb7/log/primary/logs/kolla/keystone/keystone-apache-public-error.txt#14944

```
2020-06-24 15:21:10.668579 mod_wsgi (pid=34, process='keystone-public', application=''): Loading Python script file '/var/lib/kolla/venv/bin/keystone-wsgi-public'.
2020-06-24 15:21:10.669624 mod_wsgi (pid=34): Failed to exec Python script file '/var/lib/kolla/venv/bin/keystone-wsgi-public'.
2020-06-24 15:21:10.669662 mod_wsgi (pid=34): Exception occurred processing WSGI script '/var/lib/kolla/venv/bin/keystone-wsgi-public'.
2020-06-24 15:21:10.669837 Traceback (most recent call last):
2020-06-24 15:21:10.669879   File "/var/lib/kolla/venv/bin/keystone-wsgi-public", line 52, in <module>
2020-06-24 15:21:10.669882     application = initialize_public_application()
2020-06-24 15:21:10.669889   File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/server/wsgi.py", line 24, in initialize_public_application
2020-06-24 15:21:10.669892     name='public', config_files=flask_core._get_config_files())
2020-06-24 15:21:10.669898   File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/server/flask/core.py", line 157, in initialize_application
2020-06-24 15:21:10.669901     keystone.server.configure(config_files=config_files)
2020-06-24 15:21:10.670073   File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/server/__init__.py", line 28, in configure
2020-06-24 15:21:10.670076     keystone.conf.configure()
2020-06-24 15:21:10.670081   File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/conf/__init__.py", line 137, in configure
2020-06-24 15:21:10.670084     deprecated_since=versionutils.deprecated.STEIN))
2020-06-24 15:21:10.670089   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_config/cfg.py", line 2055, in __inner
2020-06-24 15:21:10.670092     result = f(self, *args, **kwargs)
2020-06-24 15:21:10.670098   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_config/cfg.py", line 2333, in register_cli_opt
2020-06-24 15:21:10.670100     raise ArgsAlreadyParsedError("cannot register CLI option")
2020-06-24 15:21:10.670115 oslo_config.cfg.ArgsAlreadyParsedError: arguments already parsed: cannot register CLI option
```

Revision history for this message

Doug Szumski (dszumski) wrote on 2020-06-29:

#29

I think this goes beyond the Fernet issues: on a single-node Train (CentOS 7) deploy I see the same issue. It is sporadic; sometimes, for example, a keystone endpoint list works and other times it does not. As soon as I restart Keystone the issue is gone.

It appears to be related to two keystone calls in close proximity. A keystone token issue never fails. An endpoint list always gets a token, but then the GET /v3/endpoints fails. An example:

```
====  # openstack endpoint list (FAILS) ==================

==> keystone-apache-admin-access.log <==
192.168.33.2 - - [29/Jun/2020:16:29:15 +0000] "GET /v3 HTTP/1.1" 200 253 3251 "-" "openstacksdk/0.36.3 keystoneauth1/3.17.2 python-requests/2.22.0 CPython/2.7.5"
192.168.33.2 - - [29/Jun/2020:16:29:15 +0000] "POST /v3/auth/tokens HTTP/1.1" 201 2347 331455 "-" "openstacksdk/0.36.3 keystoneauth1/3.17.2 python-requests/2.22.0 CPython/2.7.5"
192.168.33.2 - - [29/Jun/2020:16:29:16 +0000] "POST /v3/auth/tokens HTTP/1.1" 201 2347 341851 "-" "openstacksdk/0.36.3 keystoneauth1/3.17.2 python-requests/2.22.0 CPython/2.7.5"

==> keystone-apache-public-access.log <==
192.168.33.2 - - [29/Jun/2020:16:29:16 +0000] "GET / HTTP/1.1" 300 267 2879 "-" "openstacksdk/0.36.3 keystoneauth1/3.17.2 python-requests/2.22.0 CPython/2.7.5"
192.168.33.2 - - [29/Jun/2020:16:29:16 +0000] "GET /v3/endpoints HTTP/1.1" 200 2926 37411 "-" "python-keystoneclient"

==> keystone-apache-public-error.log <==
2020-06-29 16:29:16.734104 mod_wsgi (pid=23): Target WSGI script '/usr/bin/keystone-wsgi-public' cannot be loaded as Python module.
2020-06-29 16:29:16.734150 mod_wsgi (pid=23): Exception occurred processing WSGI script '/usr/bin/keystone-wsgi-public'.
2020-06-29 16:29:16.734195 Traceback (most recent call last):
2020-06-29 16:29:16.734240   File "/usr/bin/keystone-wsgi-public", line 52, in <module>
2020-06-29 16:29:16.734312     application = initialize_public_application()
2020-06-29 16:29:16.734336   File "/usr/lib/python2.7/site-packages/keystone/server/wsgi.py", line 24, in initialize_public_application
2020-06-29 16:29:16.734364     name='public', config_files=flask_core._get_config_files())
2020-06-29 16:29:16.734382   File "/usr/lib/python2.7/site-packages/keystone/server/flask/core.py", line 157, in initialize_application
2020-06-29 16:29:16.734414     keystone.server.configure(config_files=config_files)
2020-06-29 16:29:16.734434   File "/usr/lib/python2.7/site-packages/keystone/server/__init__.py", line 28, in configure
2020-06-29 16:29:16.734471     keystone.conf.configure()
2020-06-29 16:29:16.734489   File "/usr/lib/python2.7/site-packages/keystone/conf/__init__.py", line 137, in configure
2020-06-29 16:29:16.734512     deprecated_since=versionutils.deprecated.STEIN))
2020-06-29 16:29:16.734530   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2055, in __inner
2020-06-29 16:29:16.734553     result = f(self, *args, **kwargs)
2020-06-29 16:29:16.734570   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2333, in register_cli_opt
2020-06-29 16:29:16.734602     raise ArgsAlreadyParsedError("cannot register CLI option")
2020-06-29 16:29:16.734637 ArgsAlreadyParsedError: arguments already parsed: cannot register CLI option

==> keystone-apache-public-access.log <==
192.168.33.2 - - [29/Jun/2020:16:29:16 +0000] "GET /v3/services HTTP/1.1" 500 527 1924 "-" "python-keystoneclient"

====  # openstack endpoint list (WORKS) ==================

==> keystone-apache-admin-access.log <==
192.168.33.2 - - [29/Jun/2020:16:29:21 +0000] "GET /v3 HTTP/1.1" 200 253 2769 "-" "openstacksdk/0.36.3 keystoneauth1/3.17.2 python-requests/2.22.0 CPython/2.7.5"
192.168.33.2 - - [29/Jun/2020:16:29:21 +0000] "POST /v3/auth/tokens HTTP/1.1" 201 2347 330306 "-" "openstacksdk/0.36.3 keystoneauth1/3.17.2 python-requests/2.22.0 CPython/2.7.5"
192.168.33.2 - - [29/Jun/2020:16:29:21 +0000] "POST /v3/auth/tokens HTTP/1.1" 201 2347 361409 "-" "openstacksdk/0.36.3 keystoneauth1/3.17.2 python-requests/2.22.0 CPython/2.7.5"

==> keystone-apache-public-access.log <==
192.168.33.2 - - [29/Jun/2020:16:29:22 +0000] "GET / HTTP/1.1" 300 267 3450 "-" "openstacksdk/0.36.3 keystoneauth1/3.17.2 python-requests/2.22.0 CPython/2.7.5"
192.168.33.2 - - [29/Jun/2020:16:29:22 +0000] "GET /v3/endpoints HTTP/1.1" 200 2926 49522 "-" "python-keystoneclient"

==> keystone.log <==
2020-06-29 16:29:22.144 24 WARNING py.warnings [req-258fde27-f856-48af-b666-76782cf29da0 eb8fe17826af43bcba56b16f942364df b55a100fe0e548ca9ae39a5a2c2e908c - default default] /usr/lib/python2.7/site-packages/oslo_policy/policy.py:970: UserWarning: Policy identity:list_services failed scope check. The token used to make the request was project scoped but the policy requires ['system'] scope. This behavior may change in the future where using the intended scope is required
  warnings.warn(msg)

==> keystone-apache-public-error.log <==
2020-06-29 16:29:22.145436 2020-06-29 16:29:22.144 24 WARNING py.warnings [req-258fde27-f856-48af-b666-76782cf29da0 eb8fe17826af43bcba56b16f942364df b55a100fe0e548ca9ae39a5a2c2e908c - default default] /usr/lib/python2.7/site-packages/oslo_policy/policy.py:970: UserWarning: Policy identity:list_services failed scope check. The token used to make the request was project scoped but the policy requires ['system'] scope. This behavior may change in the future where using the intended scope is required
2020-06-29 16:29:22.145476   warnings.warn(msg)
2020-06-29 16:29:22.145507 \x1b[00m

==> keystone-apache-public-access.log <==
192.168.33.2 - - [29/Jun/2020:16:29:22 +0000] "GET /v3/services HTTP/1.1" 200 793 36592 "-" "python-keystoneclient"
```
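
When comparing a failing and a working run like the two above, it can help to pull only the 5xx responses out of the Apache access logs. The snippet below is a rough sketch for that purpose; the function name `server_errors` and the regular expression are illustrative assumptions, and the two sample lines are copied from the logs in this comment.

```python
import re

# Matches the quoted request and the status code of a combined-format access log line.
ACCESS_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def server_errors(lines):
    """Return (method, path, status) for every line with a 5xx status."""
    hits = []
    for line in lines:
        match = ACCESS_LINE.search(line)
        if match and int(match.group("status")) >= 500:
            hits.append(
                (match.group("method"), match.group("path"), int(match.group("status")))
            )
    return hits


SAMPLE = [
    '192.168.33.2 - - [29/Jun/2020:16:29:16 +0000] "GET /v3/endpoints HTTP/1.1" 200 2926 37411 "-" "python-keystoneclient"',
    '192.168.33.2 - - [29/Jun/2020:16:29:16 +0000] "GET /v3/services HTTP/1.1" 500 527 1924 "-" "python-keystoneclient"',
]

print(server_errors(SAMPLE))  # [('GET', '/v3/services', 500)]
```

On the sample lines only the `GET /v3/services` request is reported, which is the request that coincides with the traceback in the public error log.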

Revision history for this message

Mark Goddard (mgoddard) wrote on 2020-06-30:

#30

I do see a fernet error in the link you posted though Doug.

SystemExit: /etc/keystone/fernet-keys/ does not contain keys, use keystone-manage fernet_setup to create Fernet keys.
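
One plausible reading of why a Fernet failure later surfaces as ArgsAlreadyParsedError is sketched below as a toy model. It does not use oslo.config or mod_wsgi; `ToyConf`, `load_wsgi_script`, and `attempt` are made-up names, and the assumption is that the WSGI script is re-executed inside the same long-lived process after a failed first load, while the process-wide configuration object has already parsed its arguments.

```python
class ArgsAlreadyParsedError(Exception):
    """Toy stand-in for the oslo.config exception of the same name."""


class ToyConf:
    """Process-wide config object that refuses CLI option registration after parsing."""

    def __init__(self):
        self._args = None
        self._cli_opts = set()

    def register_cli_opt(self, name):
        if self._args is not None:
            raise ArgsAlreadyParsedError(
                "arguments already parsed: cannot register CLI option"
            )
        self._cli_opts.add(name)

    def parse(self, args):
        self._args = list(args)


def load_wsgi_script(conf, keys_present):
    """Model of one attempt to load the WSGI script in an existing process."""
    conf.register_cli_opt("example-opt")  # hypothetical option name
    conf.parse([])
    if not keys_present:
        raise SystemExit("fernet key repository is empty")
    return "application"


def attempt(conf, keys_present):
    try:
        return load_wsgi_script(conf, keys_present)
    except (SystemExit, ArgsAlreadyParsedError) as exc:
        return type(exc).__name__


conf = ToyConf()  # lives as long as the worker process
outcomes = [attempt(conf, False), attempt(conf, True), attempt(conf, True)]
print(outcomes)  # ['SystemExit', 'ArgsAlreadyParsedError', 'ArgsAlreadyParsedError']
```

In this model the keys arriving later does not help, because every retry trips over the already-parsed configuration; only a fresh process (a fresh `ToyConf()`) loads cleanly, which would be consistent with the observation that restarting Keystone clears the problem.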

Revision history for this message

Mark Goddard (mgoddard) wrote on 2020-06-30:

#31

Could you try again with a depends-on to this patch? https://review.opendev.org/#/c/707080/

Revision history for this message

Doug Szumski (dszumski) wrote on 2020-06-30:

#32

Thanks, I missed that. Trying with the above patch.

On my single node Train deploy I also see:

```
keystone/keystone-apache-public-error.log:2020-06-29 17:11:21.562830   File "/usr/lib/python2.7/site-packages/keystone/receipt/providers/fernet/core.py", line 45, in __init__
keystone/keystone-apache-public-error.log:2020-06-29 17:11:21.562879     'Fernet keys.') % subs)
keystone/keystone-apache-public-error.log:2020-06-29 17:11:21.562902 SystemExit: /etc/keystone/fernet-keys/ does not contain keys, use keystone-manage fernet_setup to create Fernet keys.
keystone/keystone-apache-public-error.log:2020-06-29 17:11:25.810353   File "/usr/lib/python2.7/site-packages/keystone/receipt/providers/fernet/core.py", line 45, in __init__
keystone/keystone-apache-public-error.log:2020-06-29 17:11:25.810403     'Fernet keys.') % subs)
keystone/keystone-apache-public-error.log:2020-06-29 17:11:25.810427 SystemExit: /etc/keystone/fernet-keys/ does not contain keys, use keystone-manage fernet_setup to create Fernet keys.
```

With the following fernet-keys folder in the container (but perhaps not when Keystone started):

(keystone)[root@controller1 fernet-keys]# ls -la
total 8
drwxrwx--- 2 keystone keystone 24 Jun 29 17:11 .
drwxr-x--- 1 root     keystone 46 Jun 30 08:30 ..
-rw------- 1 keystone keystone 44 Jun 29 17:11 0
-rw------- 1 keystone keystone 44 Jun 29 17:11 1
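
The "perhaps not when Keystone started" suspicion can be checked by comparing key file modification times with the time the server process started. This is a small sketch under assumptions: `keys_written_after` is a hypothetical helper, `started_at` is an epoch timestamp that the caller has to obtain separately, and only numerically named files are treated as keys, as in the listing above.

```python
from pathlib import Path


def keys_written_after(key_dir, started_at):
    """Return names of key files whose mtime is later than started_at (epoch seconds)."""
    late = []
    for entry in sorted(Path(key_dir).iterdir()):
        if (
            entry.is_file()
            and entry.name.isdigit()
            and entry.stat().st_mtime > started_at
        ):
            late.append(entry.name)
    return late
```

A non-empty result would mean that at least some keys appeared only after the process was already running, which fits the race described in this bug.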

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2020-08-28: Fix merged to kolla-ansible (stable/train)

#33

Reviewed: https://review.opendev.org/705949
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=ac38a48754de8484c2ee83534ba2df530caa2db3
Submitter: Zuul
Branch: stable/train

commit ac38a48754de8484c2ee83534ba2df530caa2db3
Author: Michal Nasiadka <email address hidden>
Date: Thu Dec 12 13:19:48 2019 +0100

    Fix keystone fernet bootstrap

    There are cases when a multinode deployment ends up in unusable
    keystone public wsgi on some nodes.

    The root cause is that keystone public wsgi doesn't find fernet
    keys on startup - and then persists on sending 500 errors to any
    requests - due to a race condition between
    fernet_setup/fernet-push.sh and keystone startup.

    Depends-On: https://review.opendev.org/705958
    Change-Id: I63709c2e3f6a893db82a05640da78f492bf8440f
    Closes-Bug: #1846789
    (cherry picked from commit 0799782ce83d1057f262b44c979a15f9a1b05c72)

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2021-01-07: Fix included in openstack/kolla-ansible 9.3.0

#34

This issue was fixed in the openstack/kolla-ansible 9.3.0 release.

kolla-ansible

Creating the Keystone service and endpoint Failed: Internal Server Error (HTTP 500)

Affects	Status	Importance	Assigned to	Milestone
kolla-ansible	Fix Released	High	Michal Nasiadka	kolla-ansible 10.0.0 "ussuri"
Stein	Triaged	High	Unassigned
Train	Fix Released	High	Mark Goddard
Ussuri	Fix Released	High	Michal Nasiadka	kolla-ansible 10.0.0 "ussuri"