maas provider, hwclock out of sync means juju will not work
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Fix Released | Critical | Gavin Panella |
cloud-init | Expired | Medium | Unassigned |
curtin | Triaged | Undecided | Unassigned |
falkor | Fix Released | High | Chris Glass |
juju-core | Invalid | Undecided | Unassigned |
Bug Description
MAAS provides no means to ensure the hardware clock is set, and juju relies on accurate clocks.
This leads to errors like the following when you bootstrap on machines that otherwise work fine:
"ERROR juju.cmd supercommand.go:430 gomaasapi: got error back from server:
401 OK (Authorization Error: \'Expired timestamp: given 1446087606 and now
1446094822 has a greater difference than threshold 300\')\nERROR failed to
bootstrap environment: subprocess encountered error code 1\n\')'), 1),
(u'waiting', 179), (u'succeeded', 10)]"
The only thing a user can do is touch each machine, sometimes booting them into an OS to fix their hwclock (which can still drift from that point, of course).
This error path is exposed when the stock 'ntpdate' from Ubuntu does not work, for instance if your lab is behind a proxy.
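For reference, the two timestamps quoted in the error message can be compared directly. The following is only a minimal sketch: the numbers are copied from the error text, and the function names (`skew_seconds`, `is_expired`) are hypothetical helpers written for illustration, not part of MAAS, juju, or gomaasapi. It assumes the server rejects a request when the absolute difference exceeds the threshold, which is how the message reads.

```python
# Values copied from the error message in this report.
GIVEN = 1446087606   # "given": timestamp carried by the request
NOW = 1446094822     # "now": the server's clock when it checked the request
THRESHOLD = 300      # allowed difference in seconds, per the error text


def skew_seconds(given: int, now: int) -> int:
    """Absolute difference between the two clocks, in seconds."""
    return abs(now - given)


def is_expired(given: int, now: int, threshold: int = THRESHOLD) -> bool:
    """Assumed check: reject when the difference is greater than the threshold."""
    return skew_seconds(given, now) > threshold


if __name__ == "__main__":
    skew = skew_seconds(GIVEN, NOW)
    print(f"skew: {skew} s (~{skew / 3600:.1f} h), rejected: {is_expired(GIVEN, NOW)}")
```

With the values above, the difference works out to 7216 seconds, roughly two hours, which is far beyond the 300-second threshold.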
description: | updated |
tags: | removed: kanban-cross-team |
description: | updated |
Changed in falkor: | |
status: | New → Triaged |
importance: | Undecided → High |
Changed in falkor: | |
assignee: | nobody → Chris Glass (tribaal) |
Changed in falkor: | |
status: | Triaged → In Progress |
Changed in falkor: | |
status: | In Progress → Fix Committed |
Changed in falkor: | |
milestone: | none → 0.15 |
status: | Fix Committed → Fix Released |
Changed in maas: | |
status: | Incomplete → Confirmed |
Changed in curtin: | |
status: | New → Triaged |
Changed in maas: | |
status: | Incomplete → Confirmed |
milestone: | none → 2.1.0 |
tags: | added: hs-arm64 |
Changed in maas: | |
milestone: | 2.0.1 → 2.1.0 |
importance: | Undecided → Critical |
status: | Confirmed → In Progress |
assignee: | nobody → Gavin Panella (allenap) |
Hi David,
I don't fully understand how MAAS is involved in Juju failing here. When MAAS deploys a machine, it ensures that the clock is the same across all machines; otherwise, machines wouldn't be able to access the metadata server during the deployment process (this also affects enlistment and commissioning).
Now, my question is whether the Juju client is being run on a different machine whose clock differs from that of the MAAS server, hence causing the maas provider to fail?
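One way to answer that question is to compare the local clock with the `Date` header in any HTTP response from the MAAS server. The sketch below is illustrative only: `skew_from_date_header` is a hypothetical helper, and it assumes the server sends a standard HTTP `Date` header. The sample header string is simply the "now" value from the error message rendered as an HTTP date.

```python
from email.utils import parsedate_to_datetime


def skew_from_date_header(date_header: str, local_epoch: float) -> float:
    """Return server time minus local time, in seconds.

    date_header: value of an HTTP Date header, e.g. "Thu, 29 Oct 2015 05:00:22 GMT"
    local_epoch: local clock reading as seconds since the Unix epoch
    """
    server_time = parsedate_to_datetime(date_header)
    return server_time.timestamp() - local_epoch


if __name__ == "__main__":
    # Sample values matching the error message in this report.
    header = "Thu, 29 Oct 2015 05:00:22 GMT"   # equals epoch 1446094822
    local = 1446087606                          # the "given" timestamp
    skew = skew_from_date_header(header, local)
    print(f"server is ahead of local clock by {skew:.0f} s")
```

In practice, `local_epoch` would come from `time.time()` on the machine running the Juju client and the header from a real response; a result larger than 300 seconds in either direction would match the failure reported here.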