When keystone is slow to respond: getting user fails

Bug #1597357 reported by Sofer Athlan-Guyot on 2016-06-29
This bug affects 2 people
Affects          Status        Importance  Assigned to         Milestone
puppet-keystone  Fix Released  High        Sofer Athlan-Guyot
tripleo          Fix Released  Critical    Sofer Athlan-Guyot  newton-2

Bug Description

To test whether a user exists, we check the keystone db by using

    openstack user show 'foo' ...

If the user doesn't exist, we get an error. With the usual openstacklib
retry, we would wait the full request_timeout before getting that
error. This is currently ~170s, so 170s times the number of users
in the catalog!
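
To make that cost concrete, here is a minimal sketch of a retry loop that treats every error as transient. It is not the openstacklib code: `retry_until`, the simulated clock, the 10s retry interval and the error text are all assumptions for illustration. Only the ~170s request_timeout comes from this report.

```ruby
# Naive retry (hypothetical helper): any error is retried until
# request_timeout seconds have elapsed, then the last error is re-raised.
def retry_until(request_timeout:, interval:, now:, pause:)
  started = now.call
  begin
    yield
  rescue StandardError
    # Give up once another pause would exceed the timeout.
    raise if now.call - started + interval > request_timeout
    pause.call(interval)
    retry
  end
end

# Simulated clock so the sketch runs instantly instead of really sleeping.
simulated_time = 0
now = -> { simulated_time }
pause = ->(seconds) { simulated_time += seconds }

begin
  retry_until(request_timeout: 170, interval: 10, now: now, pause: pause) do
    # Illustrative "user not found" error; retrying can never fix it.
    raise 'No user with a name or ID of foo exists.'
  end
rescue StandardError
  # The lookup finally fails, but only after the whole timeout was spent.
end

elapsed_per_missing_user = simulated_time
total_for_five_missing_users = elapsed_per_missing_user * 5
```

Under these assumptions a single missing user costs the whole timeout, and the cost grows linearly with the number of missing users.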

To overcome this, the call is wrapped inside a no-retry outer
function[1].

The problem is that on very slow platforms legitimate timeouts can
occur; this is especially true for CI. Here is an example of such a failure:

    Error: /Stage[main]/Keystone::Roles::Admin/Keystone_user[admin]: Could not evaluate: Command: 'openstack ["user", "show", "--format", "shell", ["admin", "--domain", "default"]]' has been running for more then 20 seconds (tried 0, for a total of 0 seconds)

From http://logs.openstack.org/58/322858/11/check-tripleo/gate-tripleo-ci-centos-7-ha/7e5b0a6/logs/postci.txt.gz

[1] https://github.com/openstack/puppet-keystone/blob/master/lib/puppet/provider/keystone_user/openstack.rb#L81

Tags: ci
no longer affects: puppet-keystone
affects: keystone → puppet-keystone
Changed in puppet-keystone:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Sofer Athlan-Guyot (sofer-athlan-guyot)
summary: - When kestone is slow to respond getting user fails
+ When keystone is slow to respond: getting user fails
Changed in tripleo:
importance: Undecided → Critical
status: New → Confirmed
assignee: nobody → Sofer Athlan-Guyot (sofer-athlan-guyot)
tags: added: ci
Changed in tripleo:
milestone: none → newton-2

Fix proposed to branch: master
Review: https://review.openstack.org/335600

Changed in puppet-keystone:
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/335600
Committed: https://git.openstack.org/cgit/openstack/puppet-keystone/commit/?id=07cee48dfc63641e9390ea6fd4c7034fe06f28d2
Submitter: Jenkins
Branch: master

commit 07cee48dfc63641e9390ea6fd4c7034fe06f28d2
Author: Sofer Athlan-Guyot <email address hidden>
Date: Wed Jun 29 19:25:47 2016 +0200

    Add retry to keystone_user.exists?

    Put back exists? method in keystone_user in line with the usual
    openstacklib mechanism. This is done by adding the possibility for
    request call to pass regexp messages that shouldn't be retried.

    Now we can safely call fetch_user without worrying about having the call
    retried by openstacklib.

    Fetch_project has the same behavior, so I added it to the mix. It may
    be a performance killer somewhere.

    Change-Id: I368cf6a06d21d018337af3e6d09cdabee839a563
    Closes-Bug: 1597357
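
The idea described in the commit message can be sketched as follows. This is only an illustration under stated assumptions, not the code from the change: `request_with_retry`, its `max_tries` and `no_retry` parameters, and the error texts are hypothetical names chosen for the example.

```ruby
# Retry transient failures, but re-raise at once when the error message
# matches one of the no_retry patterns (for example "user not found").
def request_with_retry(max_tries:, no_retry: [])
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError => e
    # A matching message means retrying is pointless: fail fast.
    raise if no_retry.any? { |pattern| e.message =~ pattern }
    # Otherwise keep retrying until the attempt budget is used up.
    raise if tries >= max_tries
    retry
  end
end

not_found = [/No user with a name or ID/]

# A slow keystone: the first two calls time out, the third succeeds.
slow_result = request_with_retry(max_tries: 5, no_retry: not_found) do |attempt|
  raise 'command timed out' if attempt < 3
  { name: 'admin' }
end

# A missing user: the message matches a no_retry pattern, so only one
# call is made and the existence check can return false immediately.
calls = 0
missing_user_exists =
  begin
    request_with_retry(max_tries: 5, no_retry: not_found) do
      calls += 1
      raise 'No user with a name or ID of foo exists.'
    end
    true
  rescue StandardError
    false
  end
```

With this shape, a legitimate timeout on a slow platform is retried, while a "not found" answer still returns after a single call.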

Changed in puppet-keystone:
status: In Progress → Fix Released
Changed in tripleo:
status: Confirmed → Fix Released

This issue was fixed in the openstack/puppet-keystone 9.1.0 release.
