MAAS failing to enlist nodes in the Lenovo lab

Bug #1289485 reported by Diogo Matsubara
This bug affects 1 person
Affects: MAAS
Status: Fix Released
Importance: Critical
Assigned to: Julian Edwards

Bug Description

1.5+bzr1977+2101+246~ppa0~ubuntu14.04.1

In the job run at http://d-jenkins.ubuntu-ci:8080/view/MAAS/job/trusty-adt-maas-manual/78/console, the nodes failed the enlistment process with the following error: {"architecture": ["amd64/generic is not a valid architecture. It should be one of: armhf, i386."]}

Node console screenshot: http://people.canonical.com/~matsubara/maas-declared-node-failing.png

After the screenshot above was taken, the node turned itself off and no nodes were listed in the MAAS UI.

Tags: hwe

Raphaël Badin (rvb) wrote :

This is related to the changes from this week's sprint. The authoritative source for available architectures is now the BootImage table (instead of the late Architecture enum). The generation of the choice object used in the API and the UI needs to be fixed to use the 'subarchitecture' field of the BootImage objects, in addition to the 'architecture' field.

More precisely, src/maasserver/models/bootimage.py: BootImageManager.get_usable_architectures needs to be fixed to return a list of 'arch/subarch' instead of a list of 'arch'.
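
A minimal sketch of the behaviour described above, not the actual MAAS code: the choices are built from both fields, so that a value such as 'amd64/generic' can match. `BootImageRow`, `usable_architectures` and `validate_architecture` are hypothetical names used only for illustration; the real method is BootImageManager.get_usable_architectures, whose implementation is not shown in this report.

```python
from collections import namedtuple

# Hypothetical stand-in for a BootImage row; only the two fields
# mentioned in the comment above are modelled.
BootImageRow = namedtuple("BootImageRow", ["architecture", "subarchitecture"])


def usable_architectures(boot_images):
    """Return the set of 'arch/subarch' strings present in boot_images."""
    return {
        "%s/%s" % (image.architecture, image.subarchitecture)
        for image in boot_images
    }


def validate_architecture(value, boot_images):
    """Return an error message if value is not a usable choice, else None."""
    choices = usable_architectures(boot_images)
    if value in choices:
        return None
    return "%s is not a valid architecture. It should be one of: %s." % (
        value, ", ".join(sorted(choices)))
```

With choices of the form 'arch' only, a node declaring 'amd64/generic' can never match, which is consistent with the error in the bug description.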

Changed in maas:
milestone: none → 14.04
status: New → Triaged
Raphaël Badin (rvb) wrote :

I see a second problem in the screenshot you posted: maas-enlist says "successfully enlisted"! It shouldn't print that message since the enlistment failed.
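
As a rough illustration of the expected behaviour, independent of how maas-enlist is actually implemented: the success message should depend on the response to the enlistment request. `enlistment_message` is a hypothetical helper, and treating any 2xx status as success is an assumption.

```python
def enlistment_message(status_code, body):
    """Build the console message for an enlistment HTTP response.

    Assumption: any 2xx status means the node was enlisted; anything
    else is reported as a failure together with the response body.
    """
    if 200 <= status_code < 300:
        return "successfully enlisted"
    return "enlistment failed (HTTP %d): %s" % (status_code, body)
```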

tags: added: hwe
Changed in maas:
status: Triaged → Fix Committed
Raphaël Badin (rvb)
Changed in maas:
assignee: nobody → Julian Edwards (julian-edwards)
Diogo Matsubara (matsubara) wrote :

Hi Julian, I ran the test again in the lab and got a similar error: http://people.canonical.com/~matsubara/maas-declared-node-failing-02.png

The main difference is that in the OP it said: [...It should be one of: armhf, i386."] and now with your fix it says: [... It should be one of: armhf/generic, armhf/highbank, i386/generic."]

The package used was: 1.5+bzr1977+2110+246~ppa0~ubuntu14.04.1 and the console output is: http://d-jenkins.ubuntu-ci:8080/view/MAAS/job/trusty-adt-maas-daily/137/console

Changed in maas:
status: Fix Committed → In Progress
Raphaël Badin (rvb)
Changed in maas:
status: In Progress → Triaged
Diogo Matsubara (matsubara) wrote :

@Raphaël, filed bug 1290848 for the successful enlistment message.

Raphaël Badin (rvb) wrote :

By the looks of it, the image for amd64/generic hasn't been imported.

Diogo Matsubara (matsubara) wrote :

Looks like the image is there:

ubuntu@autopkgtest:/var/lib/maas/ephemeral$ ls -alh trusty/ephemeral/amd64/20140211/
total 1.4G
drwxr-xr-x 2 root root 4.0K Mar 11 09:31 .
drwxr-xr-x 3 root root 4.0K Mar 11 09:30 ..
-rw-r--r-- 1 root root 1.4G Feb 11 00:34 disk.img
-rw-r--r-- 1 root root 311M Mar 11 09:31 dist-root.tar.gz
-rw-r--r-- 1 root root 93 Mar 11 09:31 info
-rw-r--r-- 1 root root 23M Feb 11 00:33 initrd.gz
-rw-r--r-- 1 root root 5.5M Feb 11 00:33 linux
-rw-r--r-- 1 root root 180 Mar 11 09:31 tgt.conf

Raphaël Badin (rvb) wrote :

Here is the list of the bootimages on the lab machine:

>>> for b in BootImage.objects.all():
... print b.__repr__()
...
<BootImage amd64/generic-saucy-install>
<BootImage amd64/generic-quantal-install>
<BootImage amd64/generic-lucid-install>
<BootImage amd64/generic-trusty-install>
<BootImage amd64/generic-precise-install>
<BootImage i386/generic-saucy-install>
<BootImage i386/generic-quantal-install>
<BootImage i386/generic-lucid-install>
<BootImage i386/generic-trusty-install>
<BootImage i386/generic-trusty-xinstall>
<BootImage i386/generic-trusty-commissioning>
<BootImage i386/generic-precise-install>
<BootImage armhf/highbank-quantal-install>
<BootImage armhf/highbank-trusty-xinstall>
<BootImage armhf/highbank-trusty-commissioning>
<BootImage armhf/highbank-precise-install>
<BootImage armhf/generic-saucy-install>
<BootImage armhf/generic-trusty-install>
<BootImage armhf/generic-trusty-xinstall>
<BootImage armhf/generic-trusty-commissioning>
<BootImage amd64/generic-trusty-xinstall>
<BootImage amd64/generic-trusty-commissioning>

>>> ng = NodeGroup.objects.all()
>>> BootImage.objects.get_usable_architectures(ng)
set([u'amd64/generic', u'i386/generic', u'armhf/generic', u'armhf/highbank'])

The amd64/generic images have been imported all right…

Raphaël Badin (rvb) wrote :

Looks like the int. tests are not waiting long enough: the presence of the amd64/generic images has not yet been reported when the nodes try to enlist.

Changed in maas:
status: Triaged → Fix Committed
Julian Edwards (julian-edwards) wrote :

Ok this is still a problem. Can we force the reporting of images right after m-i-e has run?

Raphaël Badin (rvb) wrote :

> Ok this is still a problem. Can we force the reporting of images right after m-i-e has run?

That would be nice indeed. In the meantime, we plan to change the code of the int. tests so that it waits until the bootimages have been reported.
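
The waiting step could look roughly like the sketch below, which is not the actual integration test code. `wait_for_architectures` is a hypothetical helper; `get_reported` stands for whatever callable the tests use to fetch the currently reported 'arch/subarch' values, and the timeout and interval defaults are arbitrary.

```python
import time


def wait_for_architectures(get_reported, required, timeout=600.0, interval=5.0,
                           clock=time.monotonic, sleep=time.sleep):
    """Poll get_reported() until it includes every entry in required.

    Returns True once all required architectures are reported, or False
    if the timeout elapses first. clock and sleep are injectable so the
    loop can be exercised without real delays.
    """
    required = set(required)
    deadline = clock() + timeout
    while True:
        if required <= set(get_reported()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The tests would then start the nodes only after a call such as `wait_for_architectures(fetch, ["amd64/generic"])` returns True, where `fetch` is supplied by the test harness.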

Changed in maas:
status: Fix Committed → Fix Released