Comment 13 for bug 1577949

Revision history for this message
Cheryl Jennings (cherylj) wrote :

I was able to find that the machine agent, after the upgrade, thinks that the unit is not deployed and tries to redeploy it (which includes resetting the unit agent's password).

The determination of whether or not the unit is deployed is done by querying services that are running on the machine. Inspecting running services after the failed upgrade showed that the unit agent was running.

I added some extra logging and this time, and the upgrade test passed in CI.

The list of services that are running on the machine is populated when the deployer worker starts, and that list is queried when we get our initial event from the watcher. I suspect that this is just a timing issue where the unit agent isn't started yet when we do the initial query.

We can narrow the gap by re-querying the list of running services in the handler right before we attempt to deploy new services.

Long term (aka 2.0), there may be a better way to determine if we've already deployed a particular service.