Expired Keystone tokens should be cleaned up regularly

Bug #1274756 reported by Dmitry Borodaenko
This bug affects 2 people
Affects: Fuel for OpenStack
Status: Fix Released
Importance: High
Assigned to: Meg McRoberts

Bug Description

Fuel documentation should include instructions on how to clean up expired Keystone tokens, and should explain that failing to clean them up regularly leads to rapid degradation of OpenStack performance.

One option is to use the keystone-manage token_flush command:
https://blueprints.launchpad.net/keystone/+spec/keystone-manage-token-flush

In larger high-load environments it would be safer to script an archival solution based on pt-archiver, so that the Keystone database is not blocked for a significant amount of time while the rows with expired tokens are being deleted:
http://www.percona.com/doc/percona-toolkit/2.1/pt-archiver.html

Bogdan Dobrelya (bogdando) wrote :

Please also consider the following change (memcached as the backend for tokens):
https://bugs.launchpad.net/fuel/+bug/1269819

Dmitry Pyzhov (dpyzhov)
Changed in fuel:
status: New → Confirmed
Changed in fuel:
assignee: nobody → Fuel Library Team (fuel-library)
Roman Alekseenkov (ralekseenkov) wrote :

As discussed with Dmitry, we need to put a safe code fix in place.

Customers usually run external queries against OpenStack (Scalr, monitoring, or something else). That creates new Keystone tokens every 5-10 seconds, which floods the database pretty quickly. This has to be a code fix, not docs.
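
For a sense of scale, here is a back-of-the-envelope sketch in Python. It assumes a single external client that requests a fresh token on every poll and that nothing purges expired rows. The function name is illustrative and not part of any OpenStack tool.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tokens_per_day(interval_seconds, clients=1):
    """Rows added to the token table per day if every client
    requests one new token each `interval_seconds` seconds."""
    return (SECONDS_PER_DAY // interval_seconds) * clients


if __name__ == "__main__":
    for interval in (5, 10):
        rows = tokens_per_day(interval)
        print(f"one token every {interval}s -> {rows} rows/day")
```

Under those assumptions one client alone adds between 8640 and 17280 rows per day, and every additional poller multiplies that.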

tags: added: customer-found
Changed in fuel:
status: Confirmed → Triaged
tags: removed: docs
Dmitry Borodaenko (angdraug) wrote :
Matthew Mosesohn (raytrac3r) wrote :

We can't solve this problem in 4.1 easily for the following reasons:
1 - We need to build (or mirror) all of the Percona tools to get pt-archiver. This will put more strain on OSCI, as they're pretty heavily loaded right now.
2 - The Percona tools require Galera server, which will conflict with our existing packages. We'll need extra engineering time to split them up and make pt-archiver available separately.
3 - Sergey V started work on implementing memcached, but ran into some issues with HA memcached and synchronization. Since he is loaded with many other bugs, this won't come soon.

The alternative is to log into MySQL and run the following command:
USE keystone ; DELETE FROM token WHERE NOT DATE_SUB(CURDATE(),INTERVAL 2 DAY) <= expires;
Or something similar. The problem with this is that many records are locked while the query runs, so you'll eventually hit a deadlock and the query fails. You need to limit each query to about 1000 records and run it between 1 and 5000 times to purge old tokens.
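
The batching described above could be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the two-column token table (id, expires) is a simplified assumption rather than the real Keystone schema. On MySQL the same effect is normally obtained by appending LIMIT 1000 to the DELETE statement and repeating it until no rows are affected.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"


def purge_expired_tokens(conn, cutoff, batch_size=1000, max_batches=5000):
    """Delete rows with expires < cutoff in small batches.

    Committing after every batch keeps each transaction short, so
    locks are held only briefly. Returns the number of rows deleted.
    """
    total = 0
    for _ in range(max_batches):
        cur = conn.execute(
            "DELETE FROM token WHERE id IN "
            "(SELECT id FROM token WHERE expires < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            break  # nothing left to purge
        total += cur.rowcount
    return total


def demo():
    """Build a throwaway table, purge it, and return (deleted, remaining)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE token (id TEXT PRIMARY KEY, expires TEXT)")
    now = datetime(2014, 2, 20, 12, 0, 0)
    old = (now - timedelta(days=3)).strftime(FMT)
    live = (now + timedelta(hours=1)).strftime(FMT)
    rows = [(f"old-{i}", old) for i in range(2500)]
    rows += [(f"live-{i}", live) for i in range(10)]
    conn.executemany("INSERT INTO token VALUES (?, ?)", rows)
    conn.commit()
    cutoff = (now - timedelta(days=2)).strftime(FMT)
    deleted = purge_expired_tokens(conn, cutoff)
    remaining = conn.execute("SELECT COUNT(*) FROM token").fetchone()[0]
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

The max_batches cap mirrors the "between 1 and 5000 times" upper bound, so a runaway table cannot keep the job looping indefinitely.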

I agree pt-archiver is the way to go, but we need a way to get this added in without too much burden on OSCI. I'm taking this bug and I will try to split out pt-archiver into its own package just for the short term.

Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Matthew Mosesohn (raytrac3r)
Dmitry Borodaenko (angdraug) wrote :

At least in Debian, pt-archiver is part of the percona-toolkit package, which has trivial build and binary dependencies:

Build-Depends: debhelper (>= 9)
Build-Depends-Indep: perl
Depends: ${misc:Depends}, ${perl:Depends}, libdbi-perl, libdbd-mysql-perl, libterm-readkey-perl, gdb

http://anonscm.debian.org/gitweb/?p=collab-maint/percona-toolkit.git;a=blob;f=debian/control;h=85af048dad8bf55d08c8a6edf08a307ff4513cc2;hb=HEAD

For CentOS, we should be able to reuse the SRPM from OpenSUSE:

%if 0%{?suse_version} < 1140
Requires: perl = %{perl_version}
%else
%{perl_requires}
%endif
Requires: perl(DBD::mysql) >= 1.0
Requires: perl(DBI) >= 1.13
Requires: perl(IO::Socket::SSL)
Requires: perl(Term::ReadKey) >= 2.10
Requires: perl(Time::HiRes)

http://rpmfind.net//linux/RPM/opensuse/updates/13.1/noarch/percona-toolkit-2.2.6-2.4.1.noarch.html

Why do you think building these would be a strain on OSCI?

Matthew Mosesohn (raytrac3r) wrote :

Dmitry, I say it's a burden on OSCI because they're working to get 2013.2.2 and pacemaker/corosync/libqb bugs hammered out as they pop up. Those take priority over new features.

In any case, I managed to get hold of Dmitry Burmistrov. He pieced together an older percona-toolkit RPM source with updated sources, found a good Ubuntu candidate, and built both for me. I will have results tomorrow.

The SuSE package doesn't meet our needs as easily, because the SuSE perl RPM macros don't line up with CentOS.

Matthew Mosesohn (raytrac3r) wrote :

Update: puppet code is ready to test once we get pt-archiver (and its necessary perl deps) into the ISO.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-main (master)

Fix proposed to branch: master
Review: https://review.openstack.org/75043

Changed in fuel:
status: Triaged → In Progress
Mike Scherbakov (mihgen) wrote :

I consider this too risky to include in 4.1 now. Let's postpone it and see whether we need a 4.1.1 with this fix.

Changed in fuel:
milestone: 4.1 → 5.0
Dmitry Borodaenko (angdraug) wrote :

How about a half-way solution: include percona-toolkit in our repo so that it's readily available in 4.1 deployments, postpone the puppet side of the fix to 5.0, and add a note in the documentation?

Matthew Mosesohn (raytrac3r) wrote :

Dmitry B, looks like we missed the window. I didn't manage to get QA to +1 this.

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-main (master)

Reviewed: https://review.openstack.org/75043
Committed: https://git.openstack.org/cgit/stackforge/fuel-main/commit/?id=65abb3f1b7935833316c75a947c2d14602ca4e47
Submitter: Jenkins
Branch: master

commit 65abb3f1b7935833316c75a947c2d14602ca4e47
Author: Matthew Mosesohn <email address hidden>
Date: Thu Feb 20 18:08:11 2014 +0400

    Add percona-toolkit for cleanup of tokens

    Change-Id: I2ecd828f24da6039e81c5ff3290e541811e9e92a
    Partial-Bug: #1274756

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-main (stable/4.1)

Fix proposed to branch: stable/4.1
Review: https://review.openstack.org/76547

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-main (stable/4.1)

Reviewed: https://review.openstack.org/76547
Committed: https://git.openstack.org/cgit/stackforge/fuel-main/commit/?id=c1cb209b6872eee06fed936b1d365a4de31eccc4
Submitter: Jenkins
Branch: stable/4.1

commit c1cb209b6872eee06fed936b1d365a4de31eccc4
Author: Matthew Mosesohn <email address hidden>
Date: Thu Feb 20 18:08:11 2014 +0400

    Add percona-toolkit for cleanup of tokens

    Change-Id: I2ecd828f24da6039e81c5ff3290e541811e9e92a
    Partial-Bug: #1274756

Andrew Woodward (xarses)
tags: added: backports-4.1.1
Changed in fuel:
status: In Progress → Triaged
Changed in fuel:
milestone: 5.0 → 5.1
tags: added: docs
removed: backports-4.1.1
Changed in fuel:
milestone: 5.1 → 5.0
Dmitry Borodaenko (angdraug) wrote :

Now that we have pt-archiver in the repository, we can address this in the Operations Guide like this:

"Keystone Token Cleanup

Keystone does not automatically remove expired tokens from its database. To prevent the buildup of entries in the token table from slowing down Keystone and the rest of OpenStack with it, consider setting up a periodic cleanup job. We recommend using the pt-archiver command from percona-toolkit, as demonstrated in the "cleanup-keystone-tokens.sh" <https://github.com/openstack/tripleo-image-elements/blob/master/elements/keystone/cleanup-keystone-tokens.sh> script from TripleO:

pt-archiver --source h=$KEYSTONE_DB_HOST,u=$KEYSTONE_DB_USER,p=$KEYSTONE_DB_PASS,D=$KEYSTONE_DB_NAME,t=token \
    --charset utf8 \
    --where "expires < UTC_TIMESTAMP()" \
    --purge \
    --txn-size 500 \
    --run-time 59m \
    --statistics \
    --primary-key-only
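
If the cleanup job is driven from a script rather than typed by hand, the same invocation might be assembled as in the sketch below. The flags and the KEYSTONE_DB_* environment variable names are taken from the command above; the function names and the choice of Python are illustrative assumptions, not part of the TripleO script.

```python
import os
import subprocess


def build_pt_archiver_cmd(host, user, password, database,
                          txn_size=500, run_time="59m"):
    """Return the pt-archiver argument list for purging expired tokens."""
    dsn = f"h={host},u={user},p={password},D={database},t=token"
    return [
        "pt-archiver",
        "--source", dsn,
        "--charset", "utf8",
        "--where", "expires < UTC_TIMESTAMP()",
        "--purge",
        "--txn-size", str(txn_size),
        "--run-time", run_time,
        "--statistics",
        "--primary-key-only",
    ]


def main():
    # Connection settings come from the same variables as the shell version.
    cmd = build_pt_archiver_cmd(
        os.environ["KEYSTONE_DB_HOST"],
        os.environ["KEYSTONE_DB_USER"],
        os.environ["KEYSTONE_DB_PASS"],
        os.environ["KEYSTONE_DB_NAME"],
    )
    # An argument list (no shell) avoids quoting problems in the WHERE clause.
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main()
```

Running this from an hourly scheduler would fit the --run-time 59m limit, which stops each run before the next one starts.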

Changed in fuel:
assignee: Matthew Mosesohn (raytrac3r) → Meg McRoberts (dreidellhasa)
Changed in fuel:
milestone: 5.0 → 5.1
Dmitry Borodaenko (angdraug) wrote :

I think this bug is too important to postpone to 5.1, considering that it can easily be addressed in the documentation as described in comment #15.

Changed in fuel:
importance: Medium → High
milestone: 5.1 → 5.0
Dmitry Borodaenko (angdraug) wrote :
Changed in fuel:
status: Triaged → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-docs (master)

Fix proposed to branch: master
Review: https://review.openstack.org/101332

Changed in fuel:
status: Fix Committed → Fix Released