MAAS should warn on version skew between controllers

Bug #1703035 reported by Mark Shuttleworth
This bug affects 1 person
Affects: MAAS
Status: Fix Released
Importance: High
Assigned to: Mike Pontillo
Milestone: 2.3.0alpha3

Bug Description

Each MAAS controller should report its version as part of its regular handshake or database interaction. This version should be displayed in the controllers listing, and if there is a difference between versions in the region then a global notification / warning should be displayed.

I just realised that my rack controllers have a different PPA set than my region controllers :)

Andres Rodriguez (andreserl) wrote :

Seems like I read your mind, I was just about to file this bug myself!

Changed in maas:
milestone: none → 2.3.0
importance: Undecided → High
status: New → Triaged
Mark Shuttleworth (sabdfl) wrote :

:)

Changed in maas:
assignee: nobody → Mike Pontillo (mpontillo)
Mike Pontillo (mpontillo) wrote :

For MAAS 2.3 alpha 2, MAAS will start sending, receiving, and logging version strings during rack registration. To finish this work item, we'll also need to:

 - decide what conditions require a notification
 - model the version strings in the database
 - ensure they are presented in the API and UI
 - write a service to monitor version update events, parse the rack/region versions, and create/update/delete notifications
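
The last work item (create/update/delete notifications) can be sketched as a small reconciliation step. This is only an illustration: the function name and the "create"/"update"/"delete"/"noop" labels are hypothetical, not part of MAAS, and the notification is modelled as a plain optional string.

```python
from typing import Optional


def plan_notification_change(existing: Optional[str], desired: Optional[str]) -> str:
    """Return the action that brings the stored notification in line with `desired`.

    `existing` is the message currently stored (None if there is none).
    `desired` is the message that should exist (None if no warning is needed).
    Both the function and its return labels are hypothetical.
    """
    if desired is None:
        # Versions agree: remove a stale warning if one is present.
        return "delete" if existing is not None else "noop"
    if existing is None:
        # A warning is needed and none exists yet.
        return "create"
    # A warning exists: rewrite it only if the text changed.
    return "update" if existing != desired else "noop"
```

Keeping the decision separate from the storage call makes the three cases easy to test without a database.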

I assume we'd notify on minor version differences. For example, if you have a 2.2.0 rack talking to a 2.2.2 region, that's a notification. But if you have 2.3.0-alpha1 installed and a 2.3.0-alpha2 rack connects, no need to notify.
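
A minimal sketch of the rule proposed in this comment, assuming version strings begin with three dot-separated numbers and that anything after them (such as "-alpha1") is ignored. The helper names are hypothetical. Note that the reply later in this thread asks for notification on any discrepancy instead.

```python
import re
from typing import Tuple

# Leading "major.minor.patch" part of a version string (assumed format).
_RELEASE = re.compile(r"^(\d+)\.(\d+)\.(\d+)")


def release_tuple(version: str) -> Tuple[int, int, int]:
    """Extract the numeric release, dropping any pre-release or packaging suffix."""
    match = _RELEASE.match(version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def differs_by_release(a: str, b: str) -> bool:
    """True when two versions differ in their numeric release part."""
    return release_tuple(a) != release_tuple(b)
```

Under this rule, "2.2.0" and "2.2.2" differ, while "2.3.0-alpha1" and "2.3.0-alpha2" compare as the same release.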

Mark Shuttleworth (sabdfl) wrote :

On 05/08/17 01:19, Mike Pontillo wrote:
> - decide what conditions require a notification

Any variance in versions warrants a notification that links to the list
of controllers, which should show their versions, where the
administrator can decide. Version skew may be briefly normal during an
upgrade, but it's not normal as a matter of course. No need to
complicate things - either they are all the same, or there is something
odd going on.

> - model the version strings in the database

They are just strings :)

> I assume we'd notify on minor version differences. For example, if you
> have a 2.2.0 rack talking to a 2.2.2 region, that's a notification. But
> if you have 2.3.0-alpha1 installed and a 2.3.0-alpha2 rack connects, no
> need to notify.

I disagree. We don't have an effective process in place to test scenarios
like the latter, so we should notify on any discrepancy.

Later, I think we might have a process to let the user say "yes, I am
aware of that particular oddity, don't notify me again" which would
support deliberate testing by users of odd-version rack controllers, for
example. But our starting position should be "all the same, or notify".

Mark
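
The "all the same, or notify" position amounts to a plain string comparison with no version parsing. A minimal sketch, assuming versions arrive as a mapping from hostname to version string; the function name and the message wording are hypothetical, not taken from MAAS.

```python
from typing import Mapping, Optional


def skew_warning(versions_by_host: Mapping[str, str]) -> Optional[str]:
    """Return a warning message if the version strings are not all identical.

    Versions are compared as opaque strings, so "2.3.0-alpha1" and
    "2.3.0-alpha2" count as a discrepancy. Returns None when they all match.
    """
    distinct = sorted(set(versions_by_host.values()))
    if len(distinct) <= 1:
        return None
    return "Controllers are running different versions: " + ", ".join(distinct)
```

Because the comparison is on raw strings, no decision is needed about which parts of a version are significant.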

Mike Pontillo (mpontillo) wrote :

Sounds good; thanks for clarifying the requirement.

The existing notification facility already supports dismissal, so I don't think we need a new process for that.

Changed in maas:
status: Triaged → In Progress
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
milestone: 2.3.0 → 2.3.0alpha3
Changed in maas:
status: Fix Committed → Fix Released
Mark Shuttleworth (sabdfl) wrote :

In testing this over the weekend, I saw 'Unknown' versions for a bunch
of the 2.3.0a3 rack/region controllers. Did we land this without a
mechanism to update the db as to the versions of the various controllers?

Mark

Andres Rodriguez (andreserl) wrote :

Hi Mark,

Since the mechanism to report the version on a running controller was only introduced in MAAS 2.3.0a3, controllers running older versions will be unable to report their running version.

For example, a rack controller running 2.2 doesn't know how to report its version back to the region, and hence the UI will show 'unknown' as the version of such a controller.

I hope this helps clarify the behavior. If you feel we should improve the wording, we can also find alternate ways of highlighting the lack of version.

Thanks!

Mike Pontillo (mpontillo) wrote :

No; as soon as a 2.3.0~alpha3 (or higher) controller registers and completes its first refresh cycle, its version should be recorded in the database, and should be seen on the controller details and listing page.

If you aren't seeing any notifications about a version mismatch, my guess is that it's a UI refresh issue. I would try reloading the UI (shift-reload for good measure) and see if the unknown versions change to known versions. If that doesn't solve the issue, we'll look closer at the code to see if it's possible for the version update to be missed.

Mike Pontillo (mpontillo) wrote :

I just looked at the code and verified that a trigger is in place; controller versions /should/ be updated in real time in the UI. We'll keep an eye out for this issue while testing; please file another bug if you narrow down the specifics. The following commands will determine what MAAS has recorded for each version, which would be helpful debug info if you see this again:

$ sudo maas-region dbshell
maasdb=# SELECT n.hostname, ci.version
    FROM maasserver_node n
    LEFT OUTER JOIN maasserver_controllerinfo ci ON n.id=ci.node_id
    WHERE n.node_type IN (2,3,4) ORDER BY version NULLS FIRST;
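
To see why a controller with no recorded version shows up with an empty version column, the same join can be tried against a throwaway in-memory SQLite database. This is only a sketch: the real tables live in PostgreSQL and have many more columns, the hostnames are made up, and the reading of node_type 2, 3 and 4 as controller types is an assumption based on the query above.

```python
import sqlite3

# Simplified stand-ins for the two tables named in the query (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE maasserver_node (id INTEGER PRIMARY KEY, hostname TEXT, node_type INTEGER);
    CREATE TABLE maasserver_controllerinfo (node_id INTEGER, version TEXT);
    INSERT INTO maasserver_node VALUES (1, 'region-1', 4), (2, 'rack-1', 2), (3, 'machine-1', 0);
    -- Only region-1 has reported a version; rack-1 has no controllerinfo row.
    INSERT INTO maasserver_controllerinfo VALUES (1, '2.3.0~alpha3');
    """
)

# LEFT OUTER JOIN keeps every controller, with a NULL version when nothing
# has been recorded. SQLite sorts NULLs first in ascending order by default.
rows = conn.execute(
    """
    SELECT n.hostname, ci.version
    FROM maasserver_node n
    LEFT OUTER JOIN maasserver_controllerinfo ci ON n.id = ci.node_id
    WHERE n.node_type IN (2, 3, 4)
    ORDER BY ci.version
    """
).fetchall()

for hostname, version in rows:
    print(hostname, version)
```

The non-controller node is filtered out by the WHERE clause, and the controller without a controllerinfo row is returned with a version of None.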

Andres Rodriguez (andreserl) wrote :

Ok, so while preparing a demo, I've managed to reproduce an issue where, after a fresh install, the version was not immediately reported. I have not seen this issue in a multi-node MAAS that was only upgraded, though:

1. Fresh install
2. Version reports 'unknown' (images are still being downloaded by region, rack has no images).
3. I restart maas-rackd, it reconnects and immediately reports the version.

So, I've not waited for a subsequent rack refresh, but it sounds like that should make the version appear.

Mark Shuttleworth (sabdfl) wrote :

Well, I have been running MAAS 2.3.0a3 for a few days, and done a full
set of reboots, and I am seeing:

  * 3 out of 4 controllers reported as 'Unknown'
  * the warning about a version mismatch

The database shows:

 hostname  | version
-----------+---------------------------------------------
 scylla    |
 charybdis |
 lapsi     |
 maas      | 2.3.0~alpha3-6250-g58f83f3-0ubuntu1~16.04.1

I believe all the controllers are using the same version:

mark@scylla:~$ sudo apt-cache policy maas-rack-controller
maas-rack-controller:
   Installed: 2.3.0~alpha3-6250-g58f83f3-0ubuntu1~16.04.1
   Candidate: 2.3.0~alpha3-6250-g58f83f3-0ubuntu1~16.04.1

So something is not right.

Mark
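
The database output above can be tallied to make the split explicit. A small sketch using the four rows as reported; the function name is hypothetical, and labelling a missing version as "Unknown" mirrors the UI wording rather than any confirmed MAAS code path.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def summarize_versions(rows: Iterable[Tuple[str, Optional[str]]]) -> Counter:
    """Count controllers per recorded version, labelling missing ones 'Unknown'."""
    return Counter(version if version else "Unknown" for _hostname, version in rows)


# Rows as shown in the database output above (empty version column -> None).
rows = [
    ("scylla", None),
    ("charybdis", None),
    ("lapsi", None),
    ("maas", "2.3.0~alpha3-6250-g58f83f3-0ubuntu1~16.04.1"),
]

summary = summarize_versions(rows)
for version, count in summary.most_common():
    print(count, version)
```

If a missing version is counted as different from a recorded one, two distinct values are present, which would be consistent with the mismatch warning appearing even though the installed packages match. The thread does not confirm that this is how MAAS evaluates it.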

Mike Pontillo (mpontillo) wrote :

Thanks; looks like a separate issue is preventing the other controllers from refreshing successfully. We'll get to the bottom of it.

Mike Pontillo (mpontillo) wrote :

I believe I found the root cause of this issue; filed bug #1718270 to track this.
