Authorization Error: 'Nonce already used' error when deploying machines

Bug #1851708 reported by Marcelo Subtil Marcal
This bug affects 2 people
Affects    Status     Importance    Assigned to    Milestone
MAAS       Expired    Undecided     Unassigned

Bug Description

When deploying 24 machines using MAAS 2.6.1, nine of them failed when trying to get /MAAS/metadata/curtin/2012-03-01/user-data.

The network analysis showed an "Authorization Error: 'Nonce already used'" response.

https://pastebin.canonical.com/p/FkXvvvK7Ch/

The error happens inconsistently, i.e., it doesn't affect the same nodes from one deployment to the next.
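
For context, a "Nonce already used" rejection is typical of OAuth 1.0-style request signing, where the server refuses any request that repeats a nonce it has already seen for the same credentials. The sketch below is a generic illustration of such a replay check, not MAAS's actual code; the class name NonceStore, its method, and the five-minute window are all hypothetical.

```python
import time


class NonceStore:
    """Hypothetical in-memory replay check for OAuth 1.0-style nonces."""

    def __init__(self, window_seconds=300):
        # Assumed retention window; real servers choose their own policy.
        self.window_seconds = window_seconds
        # Maps (consumer_key, token, nonce) to the time it was first seen.
        self._seen = {}

    def check_and_record(self, consumer_key, token, nonce, now=None):
        """Return True if the nonce is new, False if it was already used."""
        now = time.time() if now is None else now
        # Forget nonces that have aged out of the retention window.
        self._seen = {
            key: seen_at
            for key, seen_at in self._seen.items()
            if now - seen_at < self.window_seconds
        }
        key = (consumer_key, token, nonce)
        if key in self._seen:
            # A server would answer this case with an authorization error.
            return False
        self._seen[key] = now
        return True
```

Under this model, two requests that arrive with the same credentials and the same nonce inside the window are indistinguishable from a replay, so the second one is refused even if both were sent legitimately.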

Marcelo Subtil Marcal (msmarcal) wrote :

I also got some errors in regiond.log:
2019-11-06 21:25:28 regiond: [info] 10.243.165.2 GET /MAAS/metadata/curtin/2012-03-01/user-data HTTP/1.0 --> 401 UNAUTHORIZED (referrer: -; agent: python-requests/2.18.4)
2019-11-06 21:26:27 regiond: [info] 10.243.165.2 GET /MAAS/metadata/curtin/2012-03-01/user-data HTTP/1.0 --> 401 UNAUTHORIZED (referrer: -; agent: python-requests/2.18.4)
2019-11-07 12:37:31 regiond: [info] 10.243.165.2 GET /MAAS/metadata/curtin/2012-03-01/user-data HTTP/1.0 --> 401 UNAUTHORIZED (referrer: -; agent: python-requests/2.18.4)
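
To see whether the 401 responses cluster on particular clients or paths, log lines in the format above can be tallied with a short script. This is a minimal sketch that assumes every relevant line follows the same layout as the three lines shown; the regular expression and the function name count_unauthorized are hypothetical, not part of MAAS.

```python
import re
from collections import Counter

# Matches lines shaped like the regiond.log excerpts quoted above.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) regiond: \[info\] "
    r"(?P<client>\S+) (?P<method>[A-Z]+) (?P<path>\S+) HTTP/\S+ --> "
    r"(?P<status>\d{3}) "
)


def count_unauthorized(lines):
    """Count 401 responses per (client address, request path)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        # Lines in other formats are skipped rather than treated as errors.
        if match and match.group("status") == "401":
            counts[(match.group("client"), match.group("path"))] += 1
    return counts
```

It can be fed an open file object, for example `count_unauthorized(open(path_to_log))`, where `path_to_log` is wherever regiond.log lives on the region controller.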

Andre Ruiz (andre-ruiz) wrote :

Just to add more info: the PXE process goes well, the kernel and rootfs are downloaded OK, and boot starts. At some point in the commissioning process, cloud-init reports that it could not find the meta-data and gives up, so commissioning fails.

Around that moment, a packet capture analyzed in Wireshark shows the machine receiving this error from MAAS. None of the machines that succeed receive it.

Marcelo Subtil Marcal (msmarcal) wrote :

subscribed ~field-critical

Alberto Donato (ack) wrote :

@Marcelo, in the IRC log you linked part of the region log showing that a rack controller is failing to connect to the region. Is that happening concurrently with this issue?

Marcelo Subtil Marcal (msmarcal) wrote :

@Alberto, I think that was related to a service restart I did. I'm testing again to check whether a connection error shows up in the logs.

Marcelo Subtil Marcal (msmarcal) wrote :

After redeploying the three MAAS controllers, the error doesn't occur anymore.

Blake Rouse (blake-rouse) wrote :

Since you no longer have the original infrastructure that was causing this bug, I am going to mark the bug as Incomplete.

If it occurs again, or if you can provide steps to reproduce it, we can reopen it.

Changed in maas:
status: New → Incomplete
Changed in maas:
status: Incomplete → Invalid
status: Invalid → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for MAAS because there has been no activity for 60 days.]

Changed in maas:
status: Incomplete → Expired