Fernet tokens fail tempest runs

Bug #1473567 reported by Eric Brown
16
This bug affects 3 people
Affects Status Importance Assigned to Milestone
OpenStack Dashboard (Horizon)
Fix Released
High
Timur Sufiev
OpenStack Identity (keystone)
Won't Fix
Undecided
Unassigned
tempest
Fix Released
Undecided
Lance Bragstad

Bug Description

It seems testing an OpenStack instance that was deployed with Fernet tokens fails on some of the tempest tests. In my case these tests failed:
http://paste.openstack.org/show/363017/

bknudson also found similar in a test patch:
   https://review.openstack.org/#/c/195780

Tags: fernet
Brant Knudson (blk-u)
tags: added: fernet
Revision history for this message
Dolph Mathews (dolph) wrote :
Revision history for this message
Roxana Gherle (roxana-gherle) wrote :

Is anyone working on a fix for this?
Dolph,
I saw you had a patch which adds some tests that would reveal this problem, but do we have a patch to fix the reason why tests are failing?
I can look into it if no one has already started on it, since I need this fix.

Thanks,
Roxana

Revision history for this message
Dolph Mathews (dolph) wrote :

Roxana: I had a fix for bug 1459790, but it needs to be reworked on top of a recent refactor in the fernet provider. If someone wants to tackle this before I get to it, please do.

Changed in keystone:
importance: Undecided → High
status: New → Triaged
Revision history for this message
Dolph Mathews (dolph) wrote :

There appear to be multiple issues with Tempest + Fernet, which are tracked in separate bug reports, but let's leave this one open until tempest runs work as expected.

On that note, another related bug: bug 1491926.

Revision history for this message
Lance Bragstad (lbragstad) wrote :

The following are failures in Tempest;

tempest.api.identity.admin.v2.test_roles_negative.RolesNegativeTestJSON.test_list_roles_request_without_token
tempest.api.identity.admin.v3.test_tokens.TokensV3TestJSON.test_rescope_token
tempest.api.identity.admin.v2.test_tokens.TokensTestJSON.test_create_get_delete_token (addressed by bug 1491926)

Revision history for this message
Lance Bragstad (lbragstad) wrote :

There are still several tempest tests that are failing after the following patches have merged:

https://review.openstack.org/#/c/220272/
https://review.openstack.org/#/c/220242/

Here is a list of tempest tests that are still failing;

http://cdn.pasteraw.com/olrwqjf96eci3tp5ammro0zpu251lxr

Revision history for this message
Dolph Mathews (dolph) wrote :

Bug 1496998 is possibly a dupe.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (master)

Fix proposed to branch: master
Review: https://review.openstack.org/227015

Changed in keystone:
assignee: nobody → Dolph Mathews (dolph)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to keystone (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/227030

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (master)

Fix proposed to branch: master
Review: https://review.openstack.org/227034

Dolph Mathews (dolph)
Changed in keystone:
status: In Progress → Triaged
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on keystone (master)

Change abandoned by Dolph Mathews (<email address hidden>) on branch: master
Review: https://review.openstack.org/227034
Reason: alrighty then

Changed in keystone:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to keystone (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/227995

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on keystone (master)

Change abandoned by Dolph Mathews (<email address hidden>) on branch: master
Review: https://review.openstack.org/227015
Reason: Patchset 1 and 2 both fail quite badly against https://review.openstack.org/#/c/227995/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tempest (master)

Fix proposed to branch: master
Review: https://review.openstack.org/231191

Changed in tempest:
assignee: nobody → Lance Bragstad (lbragstad)
status: New → In Progress
Revision history for this message
Dolph Mathews (dolph) wrote :

To provide a summary update on this bug, the problem is that Fernet token creation timestamps are only precise to 1-second. Tempest, reasonably, expects to be able to trigger a revocation event, create a new token, and expects that token to be valid. With fast request-response times and such imprecise timestamps, we're left with a race condition where it's possible for tokens to appear to be created before or at the same time as Fernet tokens.

This race condition is demonstrated here:

  https://review.openstack.org/#/c/227995/

This has been reported upstream:

  https://github.com/fernet/spec/issues/12

A change to the spec would mean a new version of Fernet that must be implemented by clients (such as pypi/cryptography) before we can consume the fix.

Without an upstream solution, we're left with two rather ugly solutions. The first option is to introduce a sleep in keystone to slow requests down which result in revocation events, such that those requests are not returned until the client can be guaranteed that it's "safe" to create new tokens):

  https://review.openstack.org/#/c/227030/

The second option is to introduce a similar sleep to all clients, starting with tempest:

  https://review.openstack.org/#/c/231191/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on keystone (master)

Change abandoned by Dolph Mathews (<email address hidden>) on branch: master
Review: https://review.openstack.org/227030
Reason: This won't ever have a chance of passing tests as-is, even if it is the only solution to this bug that is isolated to keystone.

Revision history for this message
Dolph Mathews (dolph) wrote :

At the Mitaka summit in Tokyo, the deployers in the room expressed apathy in improving this issue, because even if we did improve the timestamp precision in Fernet, clock skew would still be a larger, unresolved issue (and either way, you get the same sort of race conditions). The consensus was that 401s are normal (and you must re-authenticate as necessary) and accept that tokens created immediately following revocation events may not be useable.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tempest (master)

Reviewed: https://review.openstack.org/231191
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=a2c4ebc4fac75c0889489e4bed5a0aa89f8193f1
Submitter: Jenkins
Branch: master

commit a2c4ebc4fac75c0889489e4bed5a0aa89f8193f1
Author: Lance Bragstad <email address hidden>
Date: Mon Oct 5 20:34:39 2015 +0000

    Fix race condition when changing passwords

    This patch makes it so that there is a one second wait when changing a password
    with Keystone. This is done because when we lose sub-second precision with
    Fernet tokens there is the possibility of a token being issued and revoked
    within the same second. Keystone will err on the side of security and return a
    404 NotFound when validating a token that was issued in the same second as a
    revocation event.

    For example, it is possible for a revocation event to happen at .000001, but it
    will be stored in MySQL as .000000 because of sub-second truncation. A token can
    be created at .000002, but the creation time of that token, according to
    Fernet, will be .000000, because Fernet tokens don't have sub-second precision.
    When that token is validated, it will appear invalid even though it was created
    *after* the revocation event.

    Change-Id: Ied83448de8af1b0da9afdfe6ce9431438215bfe0
    Closes-Bug: 1473567

Changed in tempest:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tempest (master)

Fix proposed to branch: master
Review: https://review.openstack.org/247678

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tempest (master)

Reviewed: https://review.openstack.org/247678
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=144c2f4a3b1f7960a4d2f10b2d1757eb04273dd3
Submitter: Jenkins
Branch: master

commit 144c2f4a3b1f7960a4d2f10b2d1757eb04273dd3
Author: Lance Bragstad <email address hidden>
Date: Thu Nov 19 16:42:37 2015 +0000

    Move identity wait until after password is updated

    With Ied83448de8af1b0da9afdfe6ce9431438215bfe0 merging, we should be waiting
    after the password is updated and before we reauthenticate. This change fixes
    this to be correct. For some reason a transient failure didn't cause the tests
    to fail.

    Change-Id: I7286085325c3578958234176adb7588afdcac578
    Closes-Bug: 1473567

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tempest (master)

Fix proposed to branch: master
Review: https://review.openstack.org/249863

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tempest (master)

Reviewed: https://review.openstack.org/249863
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=f602644ce81356867470cf05432252f474bfedec
Submitter: Jenkins
Branch: master

commit f602644ce81356867470cf05432252f474bfedec
Author: Lance Bragstad <email address hidden>
Date: Wed Nov 25 15:29:50 2015 +0000

    Fix race condition when changing passwords for admin tests

    In a previous fix, Ied83448de8af1b0da9afdfe6ce9431438215bfe0, we applied a
    sleep in order to mitigate a race condition in keystone with Fernet tokens.
    This change moves that same concept to the tempest.api.identity.admin.v3 tests
    so those pass with Fernet as well.

    This change also separates the token from the v2 update user password response
    in test_user_update_own_password() for the same reason. By isolating the token
    from the update user call, we ensure we don't hit the same issue with the v2
    update user password tests.

    Change-Id: I34b7b7c34fa34551c0fdf9bdeb1ffaa432c10adc
    Closes-Bug: 1473567
    Related-Bug: 1517697

Changed in keystone:
milestone: none → mitaka-3
Dolph Mathews (dolph)
Changed in keystone:
assignee: Dolph Mathews (dolph) → Lance Bragstad (lbragstad)
Revision history for this message
Lance Bragstad (lbragstad) wrote :

Morgan has a patch up that fixes the comments that I made in tempest [0]. Just throwing this here so that the context is documented.

[0] https://review.openstack.org/#/c/275497/

Changed in keystone:
status: In Progress → Won't Fix
Changed in keystone:
milestone: mitaka-3 → none
assignee: Lance Bragstad (lbragstad) → nobody
importance: High → Undecided
Revision history for this message
Timur Sufiev (tsufiev-x) wrote :

Encountered a similar issue in Horizon integration tests recently: http://logs.openstack.org/42/236042/62/check/gate-horizon-dsvm-integration/d8d590e/logs/apache/keystone.txt.gz#_2016-05-10_18_10_43_036 - this happens in tests which change password and then login back with new password.

Changed in horizon:
assignee: nobody → Timur Sufiev (tsufiev-x)
importance: Undecided → High
milestone: none → newton-1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to horizon (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/315050

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to horizon (master)

Reviewed: https://review.openstack.org/315050
Committed: https://git.openstack.org/cgit/openstack/horizon/commit/?id=c6bfe2d13362cf266f88c4df0c1bcaf9281345f1
Submitter: Jenkins
Branch: master

commit c6bfe2d13362cf266f88c4df0c1bcaf9281345f1
Author: Timur Sufiev <email address hidden>
Date: Wed May 11 16:52:06 2016 +0300

    In integration tests wait 1 second after changing the password

    This should give us 1 second that's needed for Keystone to distinguish
    between the moment of token revocation and a login attempt, since
    Fernet tokens don't have subsecond precision.

    Change-Id: Iedfb9ea5efe11d05ba7b12fa590fdf9e01c3d734
    Related-Bug: #1578866
    Related-Bug: #1473567

Revision history for this message
Rob Cresswell (robcresswell-deactivatedaccount) wrote :

A better status for the Horizon patch would be "workaround released" but oh well.

Changed in horizon:
status: New → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on keystone (master)

Change abandoned by Dolph Mathews (<email address hidden>) on branch: master
Review: https://review.openstack.org/227995

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.