UI doesn't warn about missing power management tools

Bug #1381000 reported by Graham Binns on 2014-10-14
This bug affects 6 people
Affects: MAAS
Importance: Critical
Assigned to: Jeffrey C Jones

Bug Description

With a fresh install of MAAS powering my NUCs, I can auto-enlist all the machines, but then get red dots next to their names… why? Because amtterm isn't installed on the cluster, so amttool doesn't exist and the cluster can't query the nodes (or power them up, for that matter).

Of course, I'm familiar with the problem, but I'd forgotten about it. Nothing in the UI tells me that this is the issue; I had to look in maas.log to find out what was going on. The errors are clear there, but they should be displayed in the UI too, especially since we already warn about missing power parameters. (So you can enter power parameters, the warning will go away, and the nodes will *still not work*.)

Bug #1382075 also demonstrates this: ipmitool is missing on the cluster, but nothing tells the user that except the logs.

Graham Binns (gmb) on 2014-10-14
tags: added: confusing-ui ui
Raphaël Badin (rvb) wrote :

Although MAAS could warn in advance, if you try to power up your nodes the node event log will contain an error telling you about the missing package.

Changed in maas:
importance: High → Low
Graham Binns (gmb) on 2014-10-20
Changed in maas:
milestone: none → next
Graham Binns (gmb) on 2014-10-20
summary: - UI doesn't warn about missing amtterm package for AMT nodes
+ UI doesn't warn about missing power management tools
description: updated
Changed in maas:
assignee: nobody → Graham Binns (gmb)
status: Triaged → In Progress
Graham Binns (gmb) wrote :

The way I think we should fix this is as follows:

 0. Have power drivers declare which tools they require in order to work.
 1. Have a periodic service running on the cluster which checks for the power
    management tools required by installed power drivers
 2. When the service runs and finds that a tool is missing, it reports the
    missing tool to the region.
 3. The region shows a warning:
    i. To admins.
    ii. Only if there are nodes that require the missing tool (so if you only
       have NUCs you won't get warnings about a missing ipmitool).
 4. When the service runs and finds all the tools it needs, it tells the
    region; the region can then cancel any outstanding warnings.

The only problem with this approach is that there'll be a lag between
installing the necessary packages and the warning going away.
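
Steps 0 to 2 above could be sketched roughly as follows. This is only an illustration, not MAAS code: the REQUIRED_TOOLS mapping and the find_missing_tools() helper are hypothetical names, and the periodic scheduling and the report to the region are left out. The amttool/amtterm and ipmitool pairs come from this bug report.

```python
import shutil

# Hypothetical declaration of what each power driver needs:
# power type -> list of (executable, package providing it).
REQUIRED_TOOLS = {
    "amt": [("amttool", "amtterm")],
    "ipmi": [("ipmitool", "ipmitool")],
}


def find_missing_tools(power_types_in_use, which=shutil.which):
    """Return {power_type: [package, ...]} for executables not on PATH.

    Only power types actually used by nodes are checked, so a cluster
    with nothing but NUCs is never warned about a missing ipmitool.
    """
    missing = {}
    for power_type in sorted(power_types_in_use):
        for executable, package in REQUIRED_TOOLS.get(power_type, []):
            if which(executable) is None:
                missing.setdefault(power_type, []).append(package)
    return missing
```

A periodic task on the cluster would call find_missing_tools() with the power types of its nodes and send the result to the region; an empty dict would mean any outstanding warning can be cancelled.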

Julian Edwards (julian-edwards) wrote :

I think this approach is overkill.

Let's make it so that the UI has a second stage (can be JS) where you need to test the parameters entered before allowing the edit node form to be saved. All of the infrastructure is already in place to do this because we have the "test power" button on the main node page.

The API can also do a power check when editing a node power parameter (or adding a node that has power defined).

Graham Binns (gmb) wrote :

On 21 October 2014 01:18, Julian Edwards <email address hidden> wrote:
> All of the infrastructure is already in place to do this
> because we have the "test power" button on the main node page.

Ah! That didn't occur to me. Yes, in light of that, my approach really
is overkill :).

Julian Edwards (julian-edwards) wrote :

On Tuesday 21 Oct 2014 06:46:39 you wrote:
> On 21 October 2014 01:18, Julian Edwards <email address hidden> wrote:
> > All of the infrastructure is already in place to do this
> > because we have the "test power" button on the main node page.
>
> Ah! That didn't occur to me. Yes, in light of that, my approach really
> is overkill :).

We may want an override confirmation though so people can say "yes, yes I know
I am being a moron at the moment but I will fix the cluster later"

Graham Binns (gmb) wrote :

> We may want an override confirmation though so people can say "yes, yes I know
> I am being a moron at the moment but I will fix the cluster later"

An easier (read: less JS-y) fix, I think, would be to add a warning message when you save the node. I think that would be sufficient (and also easier to re-use in some way for the API).

Graham Binns (gmb) wrote :

Actually, no, scratch that… it could take up to 30 seconds for the power controller's reply to come through, and it ties up an appserver thread in the process.

Graham Binns (gmb) wrote :

Pushing this back to triaged; it's low priority and there's still work to be done around Node.start() and Node.stop().

Changed in maas:
assignee: Graham Binns (gmb) → nobody
status: In Progress → Triaged
Carla Berkers (carlaberkers) wrote :

Fixed in 1.8b3: missing power info is indicated by a red power icon on node listing pages, and in the header on node detail pages.

Blake Rouse (blake-rouse) wrote :

This is different, Carla. It's about the tool itself that is used to power a node on, not just that the power type is missing. So it has not been fixed.

Changed in maas:
assignee: nobody → Blake Rouse (blake-rouse)
Changed in maas:
importance: Low → High
Mark Shuttleworth (sabdfl) wrote :

Glad to see Blake taking this on, it bites me every time I set up a MAAS with NUCs.

In summary, MAAS should check out what else is available, and maintain a flag for the JS to check if there might be a problem. So on startup MAAS might check for things like amtterm, and if they are not available, it would include a warning flag in a JSON that the UI has access to. The UI can then auto-warn whenever appropriate. For example:

 * when adding a machine with AMT
 * when looking at a machine with AMT

The equivalent tests / JSON flags would be done for each different set of tools.
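
A minimal sketch of the flag idea, assuming the availability check has already produced a mapping of power type to missing packages. The "missing_power_tools" key and both function names are hypothetical, not part of MAAS; the real UI would do the second half in JS.

```python
import json


def build_warning_flags(missing_by_power_type):
    """Serialise the missing-tool results into a JSON document for the UI."""
    return json.dumps(
        {"missing_power_tools": missing_by_power_type}, sort_keys=True
    )


def should_warn(flags_json, power_type):
    """Return True if the flags report a missing tool for this power type."""
    flags = json.loads(flags_json)
    return bool(flags.get("missing_power_tools", {}).get(power_type))
```

The UI would call the equivalent of should_warn() when adding or viewing a machine with a given power type, for example "amt".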

Dustin Kirkland (kirkland) wrote :

Moving from Low --> High.

This is biting users more and more, now, as we're seeing newer AMT nodes (e.g. new NUCs) that don't support amttool, and require a bunch of wsmancli tools.

Changed in maas:
importance: High → Critical
milestone: next → 1.8.3
milestone: 1.8.3 → 1.9.0
Changed in maas:
assignee: Blake Rouse (blake-rouse) → nobody
Changed in maas:
assignee: nobody → Jeffrey C Jones (trapnine)
status: Triaged → Confirmed
status: Confirmed → In Progress
Carla Berkers (carlaberkers) wrote :

When a user selects a power type for which no package is installed
- we should display a warning
- on the row
- the text should tell you which action is required from the user
- copy: "To control power on this node, install the [PACKAGENAME] package on [CLUSTER CONTROLLER NAME]"
- style: in row notification style with a warning icon

When a user tries to deploy a node for which the power package isn't installed
- we should display an error
- on the action panel (white area at the top of the page) so it can appear both on node listing and node details pages
- the error should tell the user 1. what is happening 2. what is causing the error and 3. how they can fix it
- copy: "[#] nodes cannot be deployed due to missing power configuration. To proceed, install the [PACKAGENAME] package on [CLUSTER CONTROLLER NAME]"
- action panel notification style with error icon
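
The two pieces of copy above could be filled in along these lines. The function names are hypothetical and the cluster name used when calling them is a placeholder; the wording is taken verbatim from the copy in this comment, including the plural "nodes".

```python
def row_warning(package, cluster):
    """In-row warning shown when the selected power type has no tool installed."""
    return (
        f"To control power on this node, install the {package} "
        f"package on {cluster}"
    )


def deploy_error(node_count, package, cluster):
    """Action-panel error shown when deployment is blocked by a missing tool."""
    return (
        f"{node_count} nodes cannot be deployed due to missing power "
        f"configuration. To proceed, install the {package} package on {cluster}"
    )
```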

Changed in maas:
status: In Progress → Fix Committed
no longer affects: maas/1.8
Jacob Gadikian (faddat) wrote :

Recommendation: when MAAS is installed, install ALL of the power management packages that COULD be needed. This way, we achieve "It Just Works."

Changed in maas:
status: Fix Committed → Fix Released