Comment 4 for bug 1453096

Cheryl Jennings (cherylj) wrote :

From Menno:

I remember seeing this when I was working on status just after I started with Canonical. The "down" status isn't set or decided within state and is never reflected in the database. Instead, the API server switches out the status when generating the result of the FullStatus API call. See the large comment I wrote to explain this towards the bottom of processAgent in apiserver/client/status.go. It even mentions that the down status won't be seen by clients using a watcher.
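To illustrate what "switching out the status" at read time looks like, here is a minimal Go sketch. It is not the code from processAgent: the names Status, agentRecord and reportedStatus are hypothetical, and the rule that only a stored "started" status gets replaced is an assumption made for the example.

```go
package main

// Status is a simplified stand-in for an agent status value (hypothetical type).
type Status string

const (
	StatusPending Status = "pending"
	StatusStarted Status = "started"
	StatusDown    Status = "down"
)

// agentRecord is a hypothetical view of one agent: the status stored in the
// database plus whether the presence machinery currently sees the agent alive.
type agentRecord struct {
	Stored Status
	Alive  bool
}

// reportedStatus returns the status a FullStatus-style call would show.
// The stored value is never modified; "down" exists only in the result.
// Assumption: only a stored "started" status is swapped for "down".
func reportedStatus(a agentRecord) Status {
	if a.Stored == StatusStarted && !a.Alive {
		return StatusDown
	}
	return a.Stored
}
```

Because the substitution happens only when the result is built, anything that reads the stored value directly (such as a watcher over database changes) would still see "started".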

So the "down" status is kind of synthetic when it really shouldn't be. To fix this ticket, "down" needs to be somehow reflected in the database. If that's done then the AllWatcher API will start reporting machine or unit change events when the agent goes down (I don't think anything needs to change with the watcher code at all).

This might not be easy, especially to do efficiently (this might require some changes in state/presence). Something running in the state server needs to notice when an agent's presence has changed and then update that agent's status in the database. It'll need to remember the previous status so that when the agent comes back the old status can be restored (the agent might not set the status back to "started" again itself).
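A rough Go sketch of the bookkeeping described above, assuming an in-memory map stands in for the database and that something else delivers presence changes. The names statusStore, newStatusStore and onPresenceChange are hypothetical; none of this reflects how state/presence actually works.

```go
package main

// Status is a simplified stand-in for an agent status value (hypothetical type).
type Status string

const StatusDown Status = "down"

// statusStore is a hypothetical in-memory stand-in for the database.
// current holds the status a watcher would see; previous holds the status
// that was in place before an agent was marked down.
type statusStore struct {
	current  map[string]Status
	previous map[string]Status
}

func newStatusStore() *statusStore {
	return &statusStore{
		current:  make(map[string]Status),
		previous: make(map[string]Status),
	}
}

// onPresenceChange records an agent going away or coming back.
// When the agent dies, its status is saved and replaced with "down".
// When it returns, the saved status is restored, since the agent may not
// set its own status again.
func (s *statusStore) onPresenceChange(agent string, alive bool) {
	cur, ok := s.current[agent]
	if !ok {
		return // unknown agent: nothing to update
	}
	if !alive {
		if cur != StatusDown {
			s.previous[agent] = cur
			s.current[agent] = StatusDown
		}
		return
	}
	if cur == StatusDown {
		if prev, saved := s.previous[agent]; saved {
			s.current[agent] = prev
			delete(s.previous, agent)
		}
	}
}
```

In this sketch a repeated "not alive" notification is ignored once the agent is already down, so the saved status is not overwritten with "down". A real implementation would also have to persist the saved status and deal with the agent writing a new status while it is marked down.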

This is the kind of thing that is worth talking to Will or John about as they will probably have some thoughts on how it should be done.