Unable to use the fast-installer in Maas 1.7 Beta6 Trusty with the HP moonshot x64 Avoton cartridges

Bug #1377969 reported by Sean Feole
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
MAAS | Invalid | High | Newell Jensen |

Bug Description

Maas, 1.7 beta4
maas:
  Installed: 1.7.0~beta4+bzr3168-0ubuntu1~trusty1
  Candidate: 1.7.0~beta4+bzr3168-0ubuntu1~trusty1
  Version table:
 *** 1.7.0~beta4+bzr3168-0ubuntu1~trusty1 0
        500 http://ppa.launchpad.net/maas-maintainers/experimental/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
     1.5.4+bzr2294-0ubuntu1.1 0
        500 http://archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
     1.5+bzr2252-0ubuntu1 0
        500 http://archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages

I am trying to add Avoton x64 boards from an HP Moonshot to MAAS 1.7 Beta4 on Trusty. It appears that the fast-path installer no longer works. This appeared to work properly in MAAS 1.6/stable.

http://pastebin.ubuntu.com/8507636/

[ 58.892184] sd 2:0:0:1: Attached scsi generic sg2 type 0
[ 58.956496] sd 2:0:0:1: [sdb] 2883584 512-byte logical blocks: (1.47 GB/1.37 GiB)
[ 59.046271] sd 2:0:0:1: [sdb] 4096-byte physical blocks
[ 59.109799] sd 2:0:0:1: [sdb] Write Protect is on
[ 59.166588] sd 2:0:0:1: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 59.279639] sdb: unknown partition table
[ 59.330512] sd 2:0:0:1: [sdb] Attached SCSI disk
[ 59.392449] EXT4-fs (sdb): mounted filesystem with ordered data mode. Opts: (null)
[ 59.715355] random: init urandom read with 121 bits of entropy available
[ 59.826700] random: nonblocking pool is initialized

The commissioning hangs at this point. I have imported the default amd64/generic boot images, which should be sufficient to boot these cartridges.

Switching the node to the Debian installer does appear to work around the issue; however, the user is then prompted for input during the install. It would be ideal to get the fast installer working again, as this appears to be a regression from the previous version of MAAS.

I did not see any logs in /var/log/maas that stood out from the ordinary.

Raghuram Kota (rkota)
tags: added: hs-arm64
Sean Feole (sfeole)
summary: - Unable to use the fast-installer in Maas 1.7 Beta4 with the HP moonshot
- x64 Avaton cartridges
+ Unable to use the fast-installer in Maas 1.7 Beta4 Trusty with the HP
+ moonshot x64 Avaton cartridges
description: updated
Julian Edwards (julian-edwards) wrote : Re: [Bug 1377969] Re: Unable to use the fast-installer in Maas 1.7 Beta4 Trusty with the HP moonshot x64 Avaton cartridges

On Monday 06 Oct 2014 15:09:01 you wrote:
> The commisioning hangs at this point. I have imported the default

Just checking, but you mean "installation" and not commissioning here, right?

tags: added: server-hwe
Changed in maas:
assignee: nobody → Andres Rodriguez (andreserl)
Julian Edwards (julian-edwards) wrote : Re: Unable to use the fast-installer in Maas 1.7 Beta4 Trusty with the HP moonshot x64 Avaton cartridges

Andres, I don't have a way to triage this, so would you mind assigning to someone that does please.

Andres Rodriguez (andreserl) wrote :

This sounds like an issue with the images to me. Newell, can you please triage this ?

Changed in maas:
assignee: Andres Rodriguez (andreserl) → Newell Jensen (newell-jensen)
Christian Reis (kiko)
Changed in maas:
milestone: none → 1.7.0
summary: Unable to use the fast-installer in Maas 1.7 Beta4 Trusty with the HP
- moonshot x64 Avaton cartridges
+ moonshot x64 Avoton cartridges
Changed in maas:
importance: Undecided → High
Christian Reis (kiko) wrote : Re: Unable to use the fast-installer in Maas 1.7 Beta4 Trusty with the HP moonshot x64 Avoton cartridges

Sean, can you reproduce this on any other system or cartridge?

Sean Feole (sfeole) wrote :

This is also reproducible on the Anders ProLiant m500 Server Cartridge.

Christian Reis (kiko) wrote :

There is something odd about this image. Is it really taking 60s to boot up to this point in kernelspace alone?
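One way to check how long the kernel has been running at the point of the hang is to read the timestamps off the console log. The sketch below is only an illustration: the function name parse_dmesg is made up for this example, and the sample lines are copied from the log in the bug description.

```python
import re

# Matches the "[ seconds.micros]" prefix the kernel puts on each log line.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def parse_dmesg(lines):
    """Return (seconds, message) pairs for lines that carry a timestamp."""
    entries = []
    for line in lines:
        match = STAMP.match(line.strip())
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    return entries


# Last three lines of the console log from the bug description.
log = [
    "[ 59.392449] EXT4-fs (sdb): mounted filesystem with ordered data mode. Opts: (null)",
    "[ 59.715355] random: init urandom read with 121 bits of entropy available",
    "[ 59.826700] random: nonblocking pool is initialized",
]

entries = parse_dmesg(log)
last_seconds, last_message = entries[-1]
print(f"last kernel message at {last_seconds:.1f}s: {last_message}")
```

Comparing the last timestamp against the same log captured on a working boot would show whether the roughly 60 seconds is normal for this hardware.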

Neither m500 nor m300 are AMD, by the way; they are Intel-based. m300 is Avoton (Atom C2000) and the m500 is a custom Haswell part.

Changed in maas:
status: New → Incomplete
Christian Reis (kiko) wrote :

Are these the 14.04 LTS release images?

Sean Feole (sfeole) wrote :

I have tested with both the released and the daily images.

I am now using the daily images.

maas my-maas boot-source read 1
{
    "url": "http://maas.ubuntu.com/images/ephemeral-v2/daily/",
    "keyring_data": "",
    "keyring_filename": "/usr/share/keyrings/ubuntu-cloudimage-keyring.gpg",
    "id": 1,
    "resource_uri": "/MAAS/api/1.0/boot-sources/1/"
}

Both images appear to hang at the following lines:

[ 59.715355] random: init urandom read with 121 bits of entropy available
[ 59.826700] random: nonblocking pool is initialized

I believe I have found the cause. After reverting to another installation of 1.6/stable, I commissioned the same Avoton node and it reached the READY state after a few minutes.

This was a fresh installation of 1.6/stable. I removed the node and tested again, this time adding the global kernel parameters in the MAAS settings so I could watch the startup logs ("console=ttyS0,9600n8r"). I then ran into the original issue again.

After discovering that this could be a problem for the x64 nodes, I removed the global kernel parameters in my MAAS 1.7 environment and attempted to commission both cartridges. Both the m300 and m500 cartridges reached the READY state after a few minutes.

It appears that when global kernel parameters are set in MAAS, the x64 bare-metal nodes fail during commissioning. The x64 VMs do not appear to be affected.

Sean Feole (sfeole) wrote :

I also want to add that I have not confirmed whether a specific set of kernel parameters causes the failure. In my case, simply having the following was enough to break the commissioning process:

console=ttyS0,9600n8r
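For reference, the kernel's serial console documentation describes the options after the device name as speed, parity (n, o or e), number of bits, and an optional "r" for RTS flow control. The sketch below breaks the parameter into those fields. It is a minimal illustration only: parse_console is a made-up helper, and the regular expression covers just this serial form.

```python
import re

# console=<device>[,<options>]; for serial ports the options are
# <baud><parity><bits><flow>, for example "9600n8r".
CONSOLE = re.compile(
    r"^console=(?P<device>[^,\s]+)"
    r"(?:,(?P<baud>\d+)(?P<parity>[noe])?(?P<bits>\d)?(?P<flow>r)?)?$"
)


def parse_console(param):
    """Split a console= parameter into device and serial framing fields."""
    match = CONSOLE.match(param)
    if match is None:
        raise ValueError(f"not a console= parameter: {param!r}")
    fields = match.groupdict()
    # Convert the numeric fields when they are present.
    for key in ("baud", "bits"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields


settings = parse_console("console=ttyS0,9600n8r")
print(settings)
```

Here the fields come out as device ttyS0, 9600 baud, no parity, 8 data bits, with RTS flow control requested.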

Changed in maas:
status: Incomplete → Confirmed
Sean Feole (sfeole) wrote :

During the debug process I upgraded my version of MAAS to 1.7 Beta5. Newell assisted me with triage as well.

Christian Reis (kiko) wrote :

Interestingly, Sean told me on IRC that this does not hang when supplied on the kernel command line on a non-MAAS installation. So the question is, what is actually causing the hang?

Getting the actual commandline from the log Sean posted is fun:

nomodeset
iscsi_target_name=iqn.2004-05.com.ubuntu:maas:ephemeral-ubuntu-amd64-generic-trusty-release
iscsi_target_ip=10.228.65.104 iscsi_target_port=3260
iscsi_initiator=c5n1-avaton ip=::::c5n1-avaton:BOOTIF ro
root=/dev/disk/by-path/ip-10.228.65.104:3260-iscsi-iqn.2004-05.com.ubuntu:maas:ephemeral-ubuntu-amd64-generic-trusty-release-lun-1
overlayroot=tmpfs
cloud-config-url=http://10.228.65.104/MAAS/metadata/latest/by-id/node-7c8777ea-4d61-11e4-a982-f0921cb4d66c/?op=get_preseed
log_host=10.228.65.104 log_port=514 -- console=ttyS0,9600n8r
initrd=ubuntu/amd64/generic/trusty/release/boot-initrd
BOOT_IMAGE=ubuntu/amd64/generic/trusty/release/boot-kernel
BOOTIF=01-f0-92-1c-b4-d4-28.

Could it be that the "--" is what is hanging the output? What do double-dashes mean in kernel commandline syntax?

Christian Reis (kiko) wrote :

Hah! https://www.kernel.org/doc/Documentation/kernel-parameters.txt

The kernel parses parameters from the kernel command line up to "--";
if it doesn't recognize a parameter and it doesn't contain a '.', the
parameter gets passed to init: parameters with '=' go into init's
environment, others are passed as command line arguments to init.
Everything after "--" is passed as an argument to init.
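
As a rough illustration of the rule quoted above, the following Python sketch splits a command line at the first standalone "--". It is a simplification: it ignores quoting and does not model how the kernel decides whether it recognizes a parameter. The name split_cmdline is made up for this example, and the command line is abbreviated from the one in the log.

```python
def split_cmdline(cmdline):
    """Split a kernel command line at the first standalone "--".

    Returns (before, after): the tokens the kernel parses itself and the
    tokens that, per the quoted documentation, are passed to init as
    arguments. Quoting is ignored for brevity.
    """
    tokens = cmdline.split()
    if "--" not in tokens:
        return tokens, []
    index = tokens.index("--")
    return tokens[:index], tokens[index + 1:]


# Abbreviated from the command line posted in this bug.
cmdline = (
    "nomodeset overlayroot=tmpfs log_host=10.228.65.104 log_port=514 "
    "-- console=ttyS0,9600n8r "
    "initrd=ubuntu/amd64/generic/trusty/release/boot-initrd"
)

before, after = split_cmdline(cmdline)
print("parsed by the kernel:", before)
print("passed to init:", after)
```

Note that in the posted command line console=ttyS0,9600n8r falls on the init side of the separator.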

Julian Edwards (julian-edwards) wrote :

The "--" was originally used as a way to tell the installer (d-i and then Curtin) to persist kernel options in the installed node. Without it the options don't "stick" after installation.

Jason Hobbs (jason-hobbs) wrote :

Are you sure it's hanging and not just losing serial console output? Does the node stay in the commissioning state for a very long time? Is there an IP address listed for it in the UI? Can you ping that IP when it's in this state? What if you disconnect/reconnect from the serial console?
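
The reachability part of these checks could be scripted along the following lines. This is only a sketch: probe_tcp is a hypothetical helper, port 22 is assumed to be where sshd would listen, and the demo runs against a throwaway local listener rather than a real node.

```python
import socket


def probe_tcp(host, port=22, timeout=3.0):
    """Classify a TCP port as "open", "refused", or "unreachable"."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on the port.
        return "refused"
    except OSError:
        # Covers timeouts and routing errors.
        return "unreachable"


# Demo against a temporary local listener, so no real node is needed.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
demo_port = server.getsockname()[1]
print(probe_tcp("127.0.0.1", demo_port))
server.close()
```

A "refused" result would suggest the node's network stack is up but sshd is not running, whereas "unreachable" would point more towards a hard hang or a network problem.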

Changed in maas:
status: Confirmed → Incomplete
Sean Feole (sfeole) wrote :

I apologize for the delay in this update, as I have been tending to other tasks.

I spent some time on this tonight. To answer Jason's questions: I'm not entirely sure the node is completely hung at this point, only because I can ping it. It most certainly stays in the commissioning state for over 10 minutes. I can ping the IP address, but the SSH daemon does not appear to be running, so I constantly get "connection refused". Disconnecting and reconnecting the serial console does not appear to have any effect. The MAAS UI does not change in any way, and the node never updates its CPU/memory statistics. Even after 20 minutes it still remains in the commissioning state. smoser has given me some suggestions to try to get more information from the console. I will try those and update the bug.

Furthermore,

I removed the kernel options in the MAAS preferences again and re-commissioned the node. During the process I took screenshots of the node in the MAAS UI, which I can attach to this bug. After 3 minutes the node updated its CPU/memory stats, and after 9 minutes it switched to the READY state and powered off.

Interestingly enough, after the node powered off, I was not able to start my x64 Avoton node after selecting "Acquire/Start". The node failed with the following error:

----------------------------
Failed to power on node — Node could not be powered on: moonshot failed with return code 127: /bin/sh: 42: /usr/bin/ipmitool: not found Got unknown power state from ipmipower: '' /bin/sh: 42: /usr/bin/ipmitool: not found
----------------------------

Under the node -> EDIT NODE the power type had changed from "Moonshot HP ILO Manager" to "ILO4 Moonshot Chassis"
and the power address had changed from "c5n1" to "-B 0 -T 0x8a -b 7 -t 0x72 -m 0x20".

This is wrong and perhaps a regression which might even require a separate bug. Please see screenshot ("edit_node.png")

Sean Feole (sfeole) wrote :

Attached is a tarball with screenshots as described in the previous comment.

Changed in maas:
status: Incomplete → Confirmed
Sean Feole (sfeole) wrote :

Also, FYI, I have updated to MAAS 1.7 Beta6:

ii maas 1.7.0~beta6+bzr3246-0ubuntu1~trusty1 all MAAS server all-in-one metapackage

summary: - Unable to use the fast-installer in Maas 1.7 Beta4 Trusty with the HP
+ Unable to use the fast-installer in Maas 1.7 Beta6 Trusty with the HP
moonshot x64 Avoton cartridges
Newell Jensen (newell-jensen) wrote :

Sean,

Are you still able to show that this is related to setting the Kernel Parameters?

Sean Feole (sfeole) wrote :

Yes. I have 9 Avoton nodes running in my MAAS environment at the moment. I left my global kernel parameters in place and created a new tag via the MAAS CLI, then added the system IDs of all my Avoton nodes to that tag.

As you can see below, the new tag, called "avaton", simply sends the kernel parameter "nomodeset". I chose that parameter arbitrarily; there is nothing special about it. I just wanted to override the nodes so that they do not inherit the global kernel parameter "console=ttyS0,9600n8r".

This seems to have worked around the issue for now.

ubuntu@ubuntu:~$ maas my-maas tags list
[
    {
        "comment": "",
        "definition": "",
        "resource_uri": "/MAAS/api/1.0/tags/virtual/",
        "name": "virtual",
        "kernel_opts": null
    },
    {
        "comment": "avaton tag , no console redirect",
        "definition": "",
        "resource_uri": "/MAAS/api/1.0/tags/avaton/",
        "name": "avaton",
        "kernel_opts": "nomodeset"
    }
]
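
To make the override explicit, here is a small Python model of the behaviour described above: a tag that carries kernel_opts replaces the global kernel parameters for the nodes it is applied to. This is a sketch of what was observed in this bug, not MAAS's own code. The function effective_kernel_opts and the rule that the first tag with options wins are assumptions made for illustration; the tag data is trimmed from the listing above.

```python
import json


def effective_kernel_opts(global_opts, tags):
    """Return the kernel options a node would boot with.

    Assumed rule for this sketch: the first tag (in the order given) with
    a non-empty kernel_opts wins; otherwise the global options apply.
    """
    for tag in tags:
        if tag.get("kernel_opts"):
            return tag["kernel_opts"]
    return global_opts


# Trimmed copy of the "tags list" output shown above.
tags_json = """
[
    {"name": "virtual", "kernel_opts": null},
    {"name": "avaton", "kernel_opts": "nomodeset"}
]
"""
tags = json.loads(tags_json)
global_opts = "console=ttyS0,9600n8r"

# A node carrying both tags, then a node carrying only "virtual".
print(effective_kernel_opts(global_opts, tags))
print(effective_kernel_opts(global_opts, tags[:1]))
```

Under these assumptions, a node tagged "avaton" boots with nomodeset only, while a node with no kernel_opts on any of its tags still inherits the global console setting.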

Christian Reis (kiko) wrote :

Sean, everything here suggests it's a problem with the image being used (in combination with the arguments). To help confirm, can you show what the kernel commandline looks like for the working boot (i.e. with no console)?

Christian Reis (kiko) wrote :

Sean, as for:

> ----------------------------
> Failed to power on node — Node could not be powered on: moonshot failed with return code 127: /bin/sh: 42: /usr/bin
> /ipmitool: not found Got unknown power state from ipmipower: '' /bin/sh: 42: /usr/bin/ipmitool: not found
> ----------------------------
>
> Under the node -> EDIT NODE the power type had changed from "Moonshot HP ILO Manager" to "ILO4 Moonshot Chassis"
> and the power address had changed from "c5n1" to "-B 0 -T 0x8a -b 7 -t 0x72 -m 0x20" \

I am thinking this changed because the ephemeral image detected these settings as part of enlistment. Is this cartridge in an fponk or a chassis?

Andrew Cloke (andrew-cloke) wrote :

All the Avoton cartridges are loaded into the Moonshot chassis. We're currently only running the Slaytons and ancient "A2" McDivitts in FPonks.

Christian Reis (kiko) wrote :

Then I don't understand Sean's comment -- he said that the power type /was/ ILO Manager and was changed to Moonshot Chassis. Was it the other way around?

Sean Feole (sfeole) wrote :

In comment #15 I wrote:

> Under the node -> EDIT NODE the power type had changed from "Moonshot HP ILO Manager" to "ILO4 Moonshot Chassis"

The power type was originally correct; otherwise I could never have gotten the node to begin commissioning. I have filed another bug to address this issue.

Please see bug : https://bugs.launchpad.net/maas/+bug/1377969

--------------------------------

In response to Christian's question, here is the boot command line used when powering on a cartridge that works:

[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.13.0-35-generic root=UUID=f16343fc-8582-4ffa-ad03-d0024e9a4100 ro nomodeset

tags: added: hs-moonshot
Andrew Cloke (andrew-cloke) wrote :

Hi Sean, I think "X64 HP Moonshot M300 Cartridges change their power type and power address after commissioning on Maas 1.7 Beta6" https://bugs.launchpad.net/maas/+bug/1382075, is the one you've spun out to cover the change of power type.

tags: added: hs-moonshot-maas-juju
removed: hs-moonshot
Christian Reis (kiko) wrote :

One last question. Do the same boot arguments which hang curtin work with debian installer? If so, what does the kernel commandline look like in the d-i case?

Newell Jensen (newell-jensen) wrote : Re: [Bug 1377969] Re: Unable to use the fast-installer in Maas 1.7 Beta6 Trusty with the HP moonshot x64 Avoton cartridges

boot-initrd and boot-kernel (boot images for curtin) are by their very
nature different than di-initrd and di-kernel (boot images for di), so the
answer to your first question is 'no'.

On Sat, Oct 18, 2014 at 6:28 AM, Christian Reis <email address hidden> wrote:

> One last question. Do the same boot arguments which hang curtin work
> with debian installer? If so, what does the kernel commandline look like
> in the d-i case?

Sean Feole (sfeole) wrote :

I had some time to spend on the x64 nodes this morning. I also upgraded to the latest "released" firmware for these cartridges, which was supplied to me on Friday. All 14 of my nodes are now running the same released version, and I am now able to redirect serial output and watch commissioning successfully.

It is definitely safe to close this bug, as it is turning out to be the hardware. However, I am still going to have to resort to using MAAS tags, since the serial framing parameters are going to be different on x64 vs. armhf.

Changed in maas:
status: Confirmed → Invalid