keystone role failure on deploy

Bug #1866008 reported by Radosław Piliszek
This bug affects 1 person
Affects                  Status        Importance  Assigned to  Milestone
kolla-ansible            Invalid       Undecided   Unassigned
kolla-ansible (Ussuri)   Invalid       Undecided   Unassigned
oslo.cache               Fix Released  Undecided   Unassigned

Bug Description

At least Ussuri CI is red now.

Task:
keystone : Creating admin project, user, role, service, and endpoint

Error:
The conditional check '(keystone_bootstrap.stdout | from_json).changed' failed. The error was: Expecting ',' delimiter: line 1 column 9325 (char 9324)

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Nothing relevant changed in keystone or kolla(-ansible) to cause this.

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

2020-03-04 08:26:48.528 269 CRITICAL keystone [req-862984ff-cc4c-4c39-a4d9-9576eee6c2c5 - - - - -] Unhandled error: TypeError: __init__() got an unexpected keyword argument 'dead_retry'
2020-03-04 08:26:48.528 269 ERROR keystone Traceback (most recent call last):
2020-03-04 08:26:48.528 269 ERROR keystone File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_cache/_memcache_pool.py", line 150, in _get
2020-03-04 08:26:48.528 269 ERROR keystone conn = self.queue.pop().connection
2020-03-04 08:26:48.528 269 ERROR keystone IndexError: pop from an empty deque
2020-03-04 08:26:48.528 269 ERROR keystone
2020-03-04 08:26:48.528 269 ERROR keystone During handling of the above exception, another exception occurred:
2020-03-04 08:26:48.528 269 ERROR keystone
2020-03-04 08:26:48.528 269 ERROR keystone Traceback (most recent call last):
2020-03-04 08:26:48.528 269 ERROR keystone File "/var/lib/kolla/venv/bin/keystone-manage", line 8, in <module>
2020-03-04 08:26:48.528 269 ERROR keystone sys.exit(main())
2020-03-04 08:26:48.528 269 ERROR keystone File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/cmd/manage.py", line 41, in main
2020-03-04 08:26:48.528 269 ERROR keystone cli.main(argv=sys.argv, developer_config_file=developer_config)
2020-03-04 08:26:48.528 269 ERROR keystone File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/cmd/cli.py", line 1379, in main
2020-03-04 08:26:48.528 269 ERROR keystone CONF.command.cmd_class.main()
2020-03-04 08:26:48.528 269 ERROR keystone File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/cmd/cli.py", line 198, in main
2020-03-04 08:26:48.528 269 ERROR keystone klass.do_bootstrap()
2020-03-04 08:26:48.528 269 ERROR keystone File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/cmd/cli.py", line 189, in do_bootstrap
2020-03-04 08:26:48.528 269 ERROR keystone self.bootstrapper.bootstrap()
2020-03-04 08:26:48.528 269 ERROR keystone File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/cmd/bootstrap.py", line 63, in bootstrap
2020-03-04 08:26:48.528 269 ERROR keystone self._bootstrap_default_domain()
2020-03-04 08:26:48.528 269 ERROR keystone File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/cmd/bootstrap.py", line 84, in _bootstrap_default_domain
2020-03-04 08:26:48.528 269 ERROR keystone domain=default_domain)
2020-03-04 08:26:48.528 269 ERROR keystone File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/common/manager.py", line 115, in wrapped
2020-03-04 08:26:48.528 269 ERROR keystone __ret_val = __f(*args, **kwargs)
2020-03-04 08:26:48.528 269 ERROR keystone File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/resource/core.py", line 766, in create_domain
2020-03-04 08:26:48.528 269 ERROR keystone domain_id, project_from_domain, initiator)
2020-03-04 08:26:48.528 269 ERROR keystone File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone/common/manager.py", line 115, in wrapped
2020-03-04 08:26:48.528 269 ERROR keystone __ret_val = __f(*args, **kwargs)
2020-03-04 08:26:48.528 269 ERROR keystone File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystone...


Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

We hide the real issue behind a JSON parser error.
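
For illustration, a minimal Python sketch (not the actual kolla-ansible output) of how traceback text mixed into what should be pure JSON produces exactly this kind of "Expecting ',' delimiter" error, hiding the underlying keystone failure:

```
import json

good = '{"changed": true, "failed": false}'
print(json.loads(good)["changed"])  # parses fine -> True

# When the task output also carries a Python traceback, the string is no
# longer valid JSON and json.loads raises the error class seen above.
polluted = '{"changed": true Traceback (most recent call last): ...}'
try:
    json.loads(polluted)
except json.JSONDecodeError as exc:
    print(exc)  # e.g. Expecting ',' delimiter: line 1 column 18 (char 17)
```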

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

This could seemingly be the oslo.cache change https://review.opendev.org/#/c/634457/ but then it should have happened earlier.

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Mark found: https://review.opendev.org/711027 [update constraint for oslo.cache to new release 2.1.0], merged today

Changed in oslo.cache:
status: New → Confirmed
Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

So it's not happening in devstack simply because devstack uses:
  [cache]
  backend = dogpile.cache.memcached
while we use:
  [cache]
  backend = oslo_cache.memcache_pool
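
For context, a minimal sketch of how that second configuration is driven through oslo.cache's own pooled backend (assuming a local memcached at 127.0.0.1:11211; this is illustrative, not the keystone code path itself):

```
from oslo_config import cfg
from oslo_cache import core as cache

CONF = cfg.CONF
cache.configure(CONF)   # registers the [cache] options
CONF([])                # no CLI args; rely on defaults and overrides

# Equivalent of the keystone.conf snippet above.
CONF.set_override('enabled', True, group='cache')
CONF.set_override('backend', 'oslo_cache.memcache_pool', group='cache')
CONF.set_override('memcache_servers', ['127.0.0.1:11211'], group='cache')

region = cache.create_region()
cache.configure_cache_region(CONF, region)  # loads the memcache_pool backend

region.set('demo-key', 'demo-value')  # needs a reachable memcached
print(region.get('demo-key'))
```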

Revision history for this message
Mark Goddard (mgoddard) wrote :

Also, this doesn't affect CentOS 7 jobs since the upper-constraints bump was for Python 3 only.

Revision history for this message
Herve Beraud (herveberaud) wrote :

Indeed, oslo.cache could be a lead to follow in this context.

As discussed on IRC with Mark, he will start by removing the oslo.cache changes (going back to before 2.1.0) which moved from python-memcached to pymemcache.

Also, it could be worth grepping for oslo.cache/pymemcache/memcache errors to see where exactly the issue is.

I used the serde module [1] to keep serialization/deserialization compatible with the data previously stored [2]; maybe something went wrong on this side and your data are not compatible for some reason, which could be a clue.

[1] https://github.com/openstack/oslo.cache/blob/8a8248d764bbb1db6c0089a58745803c03e38fdb/oslo_cache/_memcache_pool.py#L41,L42
[2] https://pymemcache.readthedocs.io/en/latest/apidoc/pymemcache.serde.html#module-pymemcache.serde
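
For reference, this is roughly how that serde hookup looks from the pymemcache side; a minimal sketch assuming pymemcache >= 3.0 and a local memcached, not the actual oslo.cache wiring:

```
# Sketch only: wiring pymemcache's pickle-based serde so values round-trip
# through pickle, as the serde module referenced in [2] is meant to do.
from pymemcache.client.base import Client
from pymemcache import serde

client = Client(('127.0.0.1', 11211), serde=serde.pickle_serde)
client.set('greeting', {'hello': 'world'})   # serialized via pickle
print(client.get('greeting'))                # -> {'hello': 'world'}
```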

Revision history for this message
Herve Beraud (herveberaud) wrote :

Following your previous email on the ML, here is the traceback excerpt you pointed out:

```
The error is TypeError: __init__() got an unexpected keyword argument
'dead_retry'
```

I think our issue is due to the fact that python-memcached accepts a parameter named `dead_retry` [1] which is not defined in pymemcache.

We just need to map it in our oslo.cache layer [2]. During testing we faced the same kind of issue with the connection timeout.

[1] https://github.com/linsomniac/python-memcached/blob/bad41222379102e3f18f6f2f7be3ee608de6fbff/memcache.py#L183
[2] https://github.com/openstack/oslo.cache/blob/8a8248d764bbb1db6c0089a58745803c03e38fdb/oslo_cache/_memcache_pool.py#L193,L201
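
A quick way to see the mismatch (a hypothetical snippet, not the oslo.cache code itself): python-memcached accepts `dead_retry`, while pymemcache's `HashClient` rejects it at construction time with exactly the TypeError from the traceback. Neither client needs a live memcached just to be constructed, since connections are made lazily.

```
import memcache                                 # python-memcached
from pymemcache.client.hash import HashClient   # pymemcache

# python-memcached: dead_retry is a supported keyword argument.
old_client = memcache.Client(['127.0.0.1:11211'], dead_retry=30)

# pymemcache: no such keyword, so construction fails immediately.
try:
    new_client = HashClient([('127.0.0.1', 11211)], dead_retry=30)
except TypeError as exc:
    print(exc)  # __init__() got an unexpected keyword argument 'dead_retry'
```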

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Makes sense, Herve. Thanks.
Note we are not actively setting this parameter.

Revision history for this message
Herve Beraud (herveberaud) wrote :

`dead_timeout` [1] looks more appropriate in this case.

[1] https://github.com/pinterest/pymemcache/blob/master/pymemcache/client/hash.py#L58
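
A hypothetical sketch of the kind of translation such a mapping needs (the dict and names here are made up for illustration, not the proposed oslo.cache fix):

```
from pymemcache.client.hash import HashClient

legacy_opts = {'dead_retry': 30}   # python-memcached style option

# dead_retry (python-memcached) roughly corresponds to dead_timeout
# (pymemcache): how long a failed server is kept out of rotation.
translated = {'dead_timeout': legacy_opts['dead_retry']}

client = HashClient([('127.0.0.1', 11211)], **translated)  # accepted
```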

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Right, because the other one cares about pre-failure state. Good catch.

Revision history for this message
Herve Beraud (herveberaud) wrote :

I proposed a fix https://review.opendev.org/#/c/711220/

Please try it in your env and tell us if it fixes your issue.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.cache (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/711422

Revision history for this message
Herve Beraud (herveberaud) wrote :

oslo.cache 2.1.0 blacklist => https://review.opendev.org/711427

Revision history for this message
Mark Goddard (mgoddard) wrote :

Worked around the issue by pinning requirements: https://review.opendev.org/711470

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.cache (master)

Reviewed: https://review.opendev.org/711439
Committed: https://git.openstack.org/cgit/openstack/oslo.cache/commit/?id=cec9b8801c41f132c5466974ccec1e0d7a8b7c3a
Submitter: Zuul
Branch: master

commit cec9b8801c41f132c5466974ccec1e0d7a8b7c3a
Author: Radosław Piliszek <email address hidden>
Date: Tue Mar 10 12:02:00 2020 +0100

    Revert "Switch from python-memcached to pymemcache."

    pymemcache turned out to be far from drop-in replacement.
    It will be reintroduced as another backend. See [1].

    Closes-bug: #1866008
    Related-bug: #1812935

    This reverts commit 8a8248d764bbb1db6c0089a58745803c03e38fdb.

    [1] https://review.opendev.org/711220

    Co-Authored-By: Hervé Beraud <email address hidden>
    Change-Id: Ia3b8d42435b0eac1c87b866449b08a1d11818986

Changed in oslo.cache:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to oslo.cache (master)

Reviewed: https://review.opendev.org/711422
Committed: https://git.openstack.org/cgit/openstack/oslo.cache/commit/?id=460211d86cf5aff8cf04564c44bc6a8a26e8d06d
Submitter: Zuul
Branch: master

commit 460211d86cf5aff8cf04564c44bc6a8a26e8d06d
Author: Radosław Piliszek <email address hidden>
Date: Thu Mar 5 11:15:36 2020 +0100

    Add memcache_pool backend job

    This is to be able to validate in CI that it actually works.

    Change-Id: Iadad6d2d4643412e4d846a05c3e3eb4a3deecb84
    Related-bug: #1866008

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.cache 2.2.0

This issue was fixed in the openstack/oslo.cache 2.2.0 release.
