api cannot connect, fills deployed machine log with spam

Bug #1199915 reported by William Reade
This bug affects 1 person
Affects: juju-core
Status: Fix Released
Importance: Critical
Assigned to: John A Meinel

Bug Description

The API client is apparently trying to connect as the correct entity, but is repeatedly rejected. Seen on trunk after upgrading from 1.10; not investigated further.

John A Meinel (jameinel) wrote :

Looking at the log, it is spinning, repeatedly trying to log in with "oldpassword". On a freshly started 1.11 trunk, oldpassword is set to "". So it seems to try a different password, and then calls SetPasswords.

Changed in juju-core:
milestone: 1.11.2 → 1.11.3
John A Meinel (jameinel) wrote :

Found the problem.

Specifically, 1.10 doesn't set entity.PasswordHash when you set the password for an agent. It just sets the Mongo password.

Which means that after an upgrade, an agent can connect to the DB and do stuff, but cannot connect over the API because it has no (valid) password.

So we need to change the agent runners so that after being upgraded, if they have a State connection, they go check if their entity has a valid password hash, and if not, give it one.
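
The check-and-repair step described above could look roughly like the sketch below. This is not the juju-core code: the entity type, hashPassword, passwordValid and ensurePasswordHash are all hypothetical names, and SHA-256 stands in for whatever hashing scheme juju actually uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// entity is a hypothetical stand-in for a state entity such as a machine.
type entity struct {
	tag          string
	passwordHash string // empty for entities whose password was set by 1.10
}

// hashPassword is an illustrative hash only, not juju's real scheme.
func hashPassword(password string) string {
	sum := sha256.Sum256([]byte(password))
	return hex.EncodeToString(sum[:])
}

// passwordValid reports whether password matches a non-empty stored hash.
func (e *entity) passwordValid(password string) bool {
	return e.passwordHash != "" && e.passwordHash == hashPassword(password)
}

// ensurePasswordHash stores a hash when the entity has no valid one.
// It reports whether it had to change anything.
func ensurePasswordHash(e *entity, password string) bool {
	if e.passwordValid(password) {
		return false
	}
	e.passwordHash = hashPassword(password)
	return true
}

func main() {
	// An entity carried over from 1.10: Mongo password set, no hash.
	upgraded := &entity{tag: "machine-1"}
	fmt.Println("repaired:", ensurePasswordHash(upgraded, "secret"))
	// A second pass finds a valid hash and leaves it alone.
	fmt.Println("repaired:", ensurePasswordHash(upgraded, "secret"))
}
```

The function is idempotent, so an agent could run it on every start with a State connection without rewriting a hash that is already valid.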

Go Bot (go-bot)
Changed in juju-core:
status: Triaged → Fix Committed
John A Meinel (jameinel)
Changed in juju-core:
status: Fix Committed → In Progress
John A Meinel (jameinel) wrote :

I just realized things are probably worse than we expected.

Specifically, on machine-0, being unable to connect to the API is bad, but it doesn't stop any of the actual workers.

However, the code in MachineAgent is:

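if a.MachineId == "0" {
  ...
  // Machine 0 starts its state worker directly.
  ensureStateWorker()
}
// On every other machine, the state worker is only started from inside APIWorker.
a.runner.StartWorker("api", func() (worker.Worker, error) {
  return a.APIWorker(ensureStateWorker)
})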

And the APIWorker call actually starts the state worker *when it sees we need a state connection after reading the API*.

Which means that upgrading juju 1.10 => 1.11 will spam the log file for machine-0, but fail to start any actual worker tasks for all the other machine agents.

Go Bot (go-bot)
Changed in juju-core:
status: In Progress → Fix Committed
Changed in juju-core:
status: Fix Committed → Fix Released