instance-status doesn't update on AllWatcher

Bug #1695335 reported by Cory Johns
This bug affects 2 people
Affects         Status        Importance  Assigned to         Milestone
Canonical Juju  Fix Released  High        Menno Finlay-Smits
2.2             Fix Released  High        Menno Finlay-Smits

Bug Description

When using the AllWatcher, we get change deltas for machine updates, but it doesn't seem that the instance-status updates after going to pending. The agent-status does get updated as expected, but it means that we can't watch for provisioning failures without polling FullStatus.

Here is an example delta where the instance-status is clearly out of date (since the application has already started installing on that machine): http://pastebin.ubuntu.com/24750344/
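For readers following along, here is a minimal sketch of how a client might spot this symptom in a stream of deltas. It assumes each delta is a three-element list of entity kind, action and payload, and that a machine payload carries "id", "agent-status" and "instance-status" keys with a "current" field; those shapes are assumptions for illustration, and the helper names `apply_deltas` and `suspicious_machines` are hypothetical, not part of Juju or libjuju.

```python
from typing import Any, Dict, Iterable, List

# Assumed delta shape: [entity_kind, action, payload], for example
# ["machine", "change", {"id": "0", "instance-status": {"current": "pending"}}]
Delta = List[Any]


def apply_deltas(deltas: Iterable[Delta]) -> Dict[str, Dict[str, str]]:
    """Fold machine deltas into {machine_id: {"agent": ..., "instance": ...}}."""
    machines: Dict[str, Dict[str, str]] = {}
    for kind, action, payload in deltas:
        if kind != "machine":
            continue  # ignore applications, units, and other entities
        machine_id = payload["id"]
        if action == "remove":
            machines.pop(machine_id, None)
            continue
        # A "change" delta replaces what we know about the machine.
        machines[machine_id] = {
            "agent": payload.get("agent-status", {}).get("current", ""),
            "instance": payload.get("instance-status", {}).get("current", ""),
        }
    return machines


def suspicious_machines(machines: Dict[str, Dict[str, str]]) -> List[str]:
    """Machines whose agent has started while the instance still reads pending."""
    return sorted(
        machine_id
        for machine_id, status in machines.items()
        if status["agent"] == "started" and status["instance"] == "pending"
    )
```

A machine that shows up in `suspicious_machines` matches the situation described here: the agent is clearly running, yet the last delta seen still reports the instance as pending.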

I'm also wondering why the agent version isn't populated at all in that delta.

Cory Johns (johnsca)
tags: added: conjure libjuju matrix
Adam Stokes (adam-stokes) wrote :

If possible, could we get this looked at for 2.2.1?

Cory Johns (johnsca) wrote :

There may be some overlap with lp:1453096

Cory Johns (johnsca) wrote :

Tested on both LXD and AWS. There is a test case in libjuju for this at https://github.com/juju/python-libjuju/blob/49fe19ff5754ae8ce9365cd7bddbcd33f565bd69/tests/integration/test_machine.py#L10 (libjuju currently has a workaround to make this test case pass, but if you comment out https://github.com/juju/python-libjuju/blob/49fe19ff5754ae8ce9365cd7bddbcd33f565bd69/juju/machine.py#L14 it will fail).

On a side note, there seems to be a difference in the info / message field for the instance-status between providers, with the LXD provider giving a message of "Ready" while the AWS provider gives a message of "ready", hence the need for normalizing it in the test (https://github.com/juju/python-libjuju/blob/49fe19ff5754ae8ce9365cd7bddbcd33f565bd69/tests/integration/test_machine.py#L35). However, since this is just an informational message, I don't think that really matters.
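As a rough illustration of the normalization mentioned above, a check along these lines would treat the two provider spellings the same. The function name `instance_is_ready` is hypothetical and this is not the code from the linked test, just a sketch of the idea.

```python
def instance_is_ready(message: str) -> bool:
    """Compare the instance-status message case-insensitively.

    LXD was seen reporting "Ready" and AWS "ready", so the message is
    lowercased (and stripped of surrounding whitespace) before comparing.
    """
    return message.strip().lower() == "ready"
```

With this, `instance_is_ready("Ready")` and `instance_is_ready("ready")` both return True, so the test does not depend on which provider produced the message.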

Anastasia (anastasia-macmood) wrote :

I am pretty sure that "Ready" vs "ready" is coming from the provider. It probably deserves a separate low-priority bug if it is an issue for you... We might be able to translate all provider statuses for consistency's sake.

However, AllWatcher part of the bug still needs to be addressed, I believe... Just going through the code to confirm...

Changed in juju:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Menno Finlay-Smits (menno.smits)
milestone: none → 2.3-alpha1
Menno Finlay-Smits (menno.smits) wrote :

I've tried to replicate this with the LXD provider and I see the instance-status changes.

See the test client I used and observed output here: https://gist.github.com/mjs/484dd863dd6bc593bb2690f7450d7cc6

This was tested with the latest "develop" branch (what will become 2.3) and 2.2.2.

The odd thing I do see is that the instance status is reported as changing from pending to running and then back to pending before going to running again. Checking the status history for the machine in the DB reveals that the instance status never actually made these transitions so it appears there is an allwatcher bug lurking there but not the one that was reported.
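To make the odd sequence above easier to spot in a client log, something like the following sketch could flag backwards transitions. The ordering of "pending" before "running" is an assumption made only for this example, and `find_regressions` is a hypothetical helper, not part of Juju.

```python
from typing import List, Sequence, Tuple

# Assumed forward ordering of the two instance statuses discussed here.
ORDER = {"pending": 0, "running": 1}


def find_regressions(observed: Sequence[str]) -> List[Tuple[int, str, str]]:
    """Return (index, previous, current) for each step that moves backwards."""
    regressions = []
    for i in range(1, len(observed)):
        prev, cur = observed[i - 1], observed[i]
        # Only compare statuses we know how to order; skip anything else.
        if prev in ORDER and cur in ORDER and ORDER[cur] < ORDER[prev]:
            regressions.append((i, prev, cur))
    return regressions
```

Feeding it the sequence pending, running, pending, running reports a single regression at index 2 (running back to pending), which is the transition that the status history in the DB does not contain.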

Changed in juju:
status: Triaged → In Progress
Menno Finlay-Smits (menno.smits) wrote :

Cory & Adam: Can you confirm the version of Juju you were seeing this with and report any further information that may be relevant?

Cory Johns (johnsca) wrote :

Menno: I just reproduced it on a Juju 2.3-alpha1.1 LXD controller using the test mentioned in comment #3 and captured all of the raw websocket frame data that the libjuju client saw: http://pastebin.ubuntu.com/25184849/

Note that `juju status` does show the right statuses, presumably since it's making a new API connection each time (which is also why we can work around this by issuing a FullStatus request), but no delta frame ever comes through with anything other than "pending" for "instance-status".
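The FullStatus workaround amounts to overwriting the delta-derived state with a fresh snapshot. A minimal sketch of that reconciliation step follows; it assumes the caller has already reduced both the cached deltas and the FullStatus response to plain dicts mapping machine id to instance status, which is a simplification for illustration, and `reconcile` is a hypothetical name rather than libjuju's actual workaround.

```python
from typing import Dict, Optional, Tuple


def reconcile(
    cache: Dict[str, str], snapshot: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], str]]:
    """Overwrite cached instance statuses with a fresh snapshot.

    Returns {machine_id: (stale_value, fresh_value)} for every entry that
    was corrected, so the caller can log or react to what the deltas missed.
    """
    corrected: Dict[str, Tuple[Optional[str], str]] = {}
    for machine_id, fresh in snapshot.items():
        stale = cache.get(machine_id)  # None if the machine was never seen
        if stale != fresh:
            corrected[machine_id] = (stale, fresh)
            cache[machine_id] = fresh
    return corrected
```

The returned mapping doubles as a diagnostic: any machine listed with a stale value of "pending" is one where no delta carried the later instance-status.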

Menno Finlay-Smits (menno.smits) wrote :

I didn't realise that libjuju had a workaround built in! This is why I didn't see the problem.

When using a simple Go based client I can reproduce the issue.

https://gist.github.com/mjs/43c83cc6e0e9d3b85d7f5f50f7e501c5

Digging now...

Menno Finlay-Smits (menno.smits) wrote :

(I should have read your earlier comment more carefully)

Menno Finlay-Smits (menno.smits) wrote :

Bug found and the fix is easy. I'll get a PR up soon (and address the gap in unit test coverage).

Menno Finlay-Smits (menno.smits) wrote :
Menno Finlay-Smits (menno.smits) wrote :
Changed in juju:
status: In Progress → Fix Committed
Cory Johns (johnsca) wrote :

Awesome! Glad it was an easy fix.

Changed in juju:
status: Fix Committed → Fix Released