Comment 8 for bug 1514874

Jorge Niedbalski (niedbalski) wrote : Re: Invalid entity name or password error, causes Juju to uninstall

Eric,

Thanks for your comments on this.

Please consider one important fact: we did not make any specific change to the 1.25.5 environment, including state or units. The only action the operator triggered was rebooting some of the units.

You can easily reproduce this behavior by altering the agent.conf file, but as I mentioned before, this was not the case in the 2 environments experiencing this issue; it only occurs on some units after rebooting them. So some condition is causing
the API authentication to fail, and this in turn triggers the agent termination.

What I would really like to understand in the first place is why the "unauthorized access" or "not provisioned" error is
returned at all when nobody has touched the environment.

- Could this error be wrapping or hiding some other error code, or a poorly handled exception?

The discussion about whether worker.ErrTerminateAgent should cause the agent to uninstall obviously requires tracking down
that code's history to understand the rationale behind this decision. In my opinion, this should never be a decision taken by a piece
of software unless the machine operator directly instructs it, so having a flag such as DO_NOT_UNINSTALL enabled by default makes a lot of sense to me.

Some of the suggestions make a lot of sense, especially option 2: retry the connection a few times
and, after that, disable the agent without uninstalling it. At least this gives us some possibility to recover or inspect the unit manually.

At the moment we are afraid that this behavior will repeat on other units and leave them inoperable by
Juju. Any mechanism that allows us to reclaim a specific unit after the uninstall would also be a great addition.

For the moment, we are looking for a way to prevent this from happening in 1.25.5 and to understand its root causes.