When polling a juju environment, MAAS frequently returns 401

Bug #1269640 reported by Adam Gandelman
This bug report is a duplicate of: Bug #1190986: [SRU] ERROR Nonce already used.
This bug affects 1 person
Affects  Status  Importance  Assigned to  Milestone
MAAS     New     Undecided   Unassigned

Bug Description

This is really breaking our ability to use juju-deployer to do continuous testing of juju/maas environments. deployer polls juju at various phases. This frequently fails with a 401 UNAUTHORIZED error being returned by the juju client. Using the following simple test, I am able to reproduce reliably:

#!/bin/bash
# Poll juju until a call fails, counting the successful calls.
i=1
while true; do
    if ! juju status >/dev/null; then
        echo "FAIL"
        exit 1
    fi
    echo "OK: $i"
    i=$((i + 1))
done

The test sometimes fails within the first ~10 requests, other times after ~75. The errors returned by Juju vary, but I have yet to complete 100 requests without a failure.

gomaasapi: got error back from server: 401 UNAUTHORIZED

could not access file 'e2fa8553-b864-44aa-861c-28c72b607029-provider-state': gomaasapi: got error back from server: 401 UNAUTHORIZED

could not access file 'e2fa8553-b864-44aa-861c-28c72b607029-bootstrap-verify': gomaasapi: got error back from server: 401 UNAUTHORIZED

In the maas.log I see OAuth errors:

ERROR 2014-01-15 18:18:28,633 maasserver ################################ Exception: ################################
ERROR 2014-01-15 18:18:28,633 maasserver Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/django/core/handlers/base.py", line 115, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib/python2.7/dist-packages/django/views/decorators/vary.py", line 19, in inner_func
    response = func(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/piston/resource.py", line 128, in __call__
    actor, anonymous = self.authenticate(request, rm)
  File "/usr/lib/python2.7/dist-packages/maasserver/api_support.py", line 47, in authenticate
    RestrictedResource, self).authenticate(request, rm)
  File "/usr/lib/python2.7/dist-packages/piston/resource.py", line 103, in authenticate
    if not authenticator.is_authenticated(request):
  File "/usr/lib/python2.7/dist-packages/maasserver/api_auth.py", line 57, in is_authenticated
    raise OAuthUnauthorized(error)
OAuthUnauthorized

This is reproducible across 2 different MAAS+juju environments, set up slightly differently:
  A) juju client connecting to a MAAS server running on localhost.
  B) juju client connecting to a MAAS server running externally.

It was mentioned that syncing the clocks should help. In both cases, I've ensured the clocks are synced between the MAAS API node and the environment's bootstrap node (machine 0). In the case of cluster B, I've ensured the clocks are synced between the juju client node, the MAAS API node and machine 0. The problem still persists.

Adam Gandelman (gandelman-a) wrote :

On both clusters:

maas 1.4+bzr1693+dfsg-0ubuntu2.2
juju-core 1.16.5-0ubuntu1~ubuntu13.04.1~juju1

Julian Edwards (julian-edwards) wrote :

I initially thought this was bug 1190986 (Nonce in use), but in that bug MAAS reports an actual error about nonces, which is not the case here.

Adam Gandelman (gandelman-a) wrote :

Looking a bit closer with some debug output in api_auth.py, the original oauth.OAuthError carries with it an error related to nonces, i.e.:

Nonce already used: 78485338

... but the MAAS exception class that wraps this isn't reporting it anywhere.
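
To illustrate the kind of fix this points at, here is a minimal sketch of a wrapper exception that keeps the underlying OAuth message visible. The classes below are stand-ins written for this example (they are not the real oauth or maasserver code), and the message format is an assumption:

```python
class OAuthError(Exception):
    """Stand-in for oauth.OAuthError; carries a human-readable message."""

    def __init__(self, message="OAuth error."):
        super().__init__(message)
        self.message = message


class OAuthUnauthorized(Exception):
    """Hypothetical wrapper that exposes the wrapped OAuth error's message."""

    def __init__(self, error):
        super().__init__(error.message)
        self.error = error

    def __str__(self):
        # Surface the real cause instead of an empty exception name.
        return "Authorization Error: %r" % self.error.message


def is_authenticated(validate):
    """Run a validation callable, re-raising OAuth failures with their message."""
    try:
        validate()
    except OAuthError as error:
        raise OAuthUnauthorized(error) from error
    return True


def reject_reused_nonce():
    """Simulate the server-side check failing on a repeated nonce."""
    raise OAuthError("Nonce already used: 78485338")


try:
    is_authenticated(reject_reused_nonce)
except OAuthUnauthorized as exc:
    print(exc)  # the log line now names the nonce problem
```

With something along these lines, the traceback in maas.log would end with the nonce message rather than a bare OAuthUnauthorized.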

Julian Edwards (julian-edwards) wrote :

OK, it turns out that it *is* a nonce re-use problem. However, to find this out, Adam had to add debugging statements to pull out the real error.

So the bug here is really, "show the real error". I will file a new bug.

Raphaël Badin (rvb) wrote :

If you're using an earlier version of MAAS that doesn't contain the fix yet, it's easy to reproduce the fix:
Save the script from http://paste.ubuntu.com/6762313/ into nonces_cleanup.py and then run it using cron, every 5 minutes:
cat nonces_cleanup.py | sudo maas shell

Note that the cleanup will not happen the first time you run the script, only the second time (and all the other times after that); read the script if you want to know why ;).
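
For anyone wondering why cleaning up old nonces helps at all, here is a toy sketch of a nonce registry. It is not the script from the paste above and not how MAAS stores nonces (MAAS keeps them in its database); the class name, the in-memory dict and the 300-second max_age are assumptions made for illustration only:

```python
import time


class NonceStore:
    """Toy in-memory nonce registry (hypothetical, for illustration)."""

    def __init__(self, max_age=300.0):
        self.max_age = max_age
        self._seen = {}  # (token_key, nonce) -> time the nonce was first seen

    def check_and_record(self, token_key, nonce, now=None):
        """Return True if the nonce is fresh, False if it was already used."""
        now = time.time() if now is None else now
        key = (token_key, nonce)
        if key in self._seen:
            return False
        self._seen[key] = now
        return True

    def cleanup(self, now=None):
        """Forget nonces older than max_age; return how many were dropped."""
        now = time.time() if now is None else now
        stale = [k for k, seen in self._seen.items() if now - seen > self.max_age]
        for k in stale:
            del self._seen[k]
        return len(stale)


store = NonceStore(max_age=300.0)
print(store.check_and_record("token", "78485338", now=0.0))     # first use is accepted
print(store.check_and_record("token", "78485338", now=10.0))    # repeat is rejected
store.cleanup(now=1000.0)
print(store.check_and_record("token", "78485338", now=1000.0))  # accepted again after cleanup
```

If nothing ever calls cleanup(), every nonce ever seen stays in the store, so a client that picks nonces at random becomes more and more likely to hit an old one. Dropping old nonces is only safe when the server also rejects requests whose timestamp is too old.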
