juju bootstrap failing on trusty

Bug #1271285 reported by Diogo Matsubara
Affects | Status | Importance | Assigned to | Milestone
MAAS | Triaged | Critical | Unassigned |

Bug Description

The maas-integration.TestMAASIntegration.test_juju_bootstrap test fails to bootstrap a precise node. During the test run, I watched the node console and confirmed that the OS gets installed. At the end, when the installer tries to reboot the machine to complete the installation, the machine powers off rather than rebooting.

See http://d-jenkins.ubuntu-ci:8080/view/MAAS/job/trusty-adt-maas-daily/55/console for the test logs.

MAAS version: 1.4+bzr1820+dfsg+1837+227~ppa0~ubuntu14.04.1 (trusty PPA)
Juju version: 1.17.0-0ubuntu2
Preseed passed to the installer: http://pastebin.ubuntu.com/6793012/
Rsyslog from the node: http://pastebin.ubuntu.com/6793302/
juju config: http://pastebin.ubuntu.com/6793123/

Attached are all the logs collected from the test run.

Diogo Matsubara (matsubara) wrote :
description: updated
Julian Edwards (julian-edwards) wrote :

Seems like you already talked to Colin about an installer bug perhaps?

Changed in maas:
status: New → Triaged
importance: Undecided → Critical
Colin Watson (cjwatson) wrote :

This is installing precise, and there've been no changes to the precise installer code itself since the last test run that Jenkins reports as passing (a week ago); I think it's very unlikely that an installer bug is responsible. However, the kernel has a fair bit of involvement with whether reboot(LINUX_REBOOT_CMD_RESTART) (what busybox init eventually calls when told to reboot) turns into an actual reboot or a power-off. I think you'd probably have a more fruitful time pursuing that line of investigation.

It would be worth establishing which URL you're fetching your installer images from, in order to work out which kernel you're using.

Raphaël Badin (rvb) wrote :

The installed files (kernel/initrd) are downloaded from:
http://archive.ubuntu.com/ubuntu/dists/precise/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64/
Nothing in there seems to have changed recently.

Raphaël Badin (rvb) wrote :

s/installed/installer/

Colin Watson (cjwatson) wrote :

Right, /dists/precise/ has been fixed since 12.04 was released. If you're using that, and the test has passed at some point since April 2012, then that rules out an installer bug. (Well, strictly, some of the installer's code is fetched from -updates at run-time; but all the code that governs rebooting is embedded in the initramfs.) It probably also rules out a kernel bug; perhaps this is something wrong with the VM substrate you're using?

You might try the various images in /dists/precise-updates/ for comparison purposes. That would also allow you to try the various enablement kernels.
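For anyone doing that comparison, here is a minimal sketch that builds the kernel/initrd URLs for both suites so the files can be fetched and compared. The URL pattern is taken from the netboot path quoted earlier in this thread; that swapping the suite name yields a valid path under precise-updates, and the file names `linux` and `initrd.gz`, are assumptions to check against the archive. `installer_urls` is a hypothetical helper, not part of MAAS.

```python
# Sketch only: the suite substitution and the file names are assumptions.
BASE = ("http://archive.ubuntu.com/ubuntu/dists/{suite}/main/installer-amd64/"
        "current/images/netboot/ubuntu-installer/amd64/")


def installer_urls(suite, files=("linux", "initrd.gz")):
    """Build the kernel/initrd URLs for one suite, e.g. 'precise'."""
    return [BASE.format(suite=suite) + name for name in files]


if __name__ == "__main__":
    # Print the URLs side by side; fetching and checksumming is left out.
    for suite in ("precise", "precise-updates"):
        for url in installer_urls(suite):
            print(url)
```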

Raphaël Badin (rvb) wrote :

Actually, after a serious debugging session, we now think the problem has nothing to do with MAAS or the installer. What we think happens is that juju gives up after 10 minutes (that's something new in version 1.17, recently released to Trusty) and orders MAAS to shut down the node. The fact that this happened right when the node was rebooting after the install led to an amusing confusion.

See bug 1257649.

We're still in the process of confirming this theory by building a custom version of juju with a bigger timeout and testing with that.
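To make the suspected race concrete, here is a toy model of the theory above. It is not juju or MAAS code: `bootstrap_outcome` and its parameters are hypothetical names, and the only figure taken from this report is the 10-minute limit. It shows why a node that needs slightly longer than the limit to install and reboot would end up powered off, and why a bigger timeout should make the test pass.

```python
# Hypothetical model of the suspected race; none of these names come from
# juju or MAAS.
BOOTSTRAP_TIMEOUT_S = 10 * 60  # the 10-minute limit described in this report


def bootstrap_outcome(ready_after_s, timeout_s=BOOTSTRAP_TIMEOUT_S):
    """Return what happens to a node that becomes reachable
    ready_after_s seconds after bootstrap starts."""
    if ready_after_s <= timeout_s:
        return "bootstrapped"
    # The client gave up first and asked the provider to power the node off.
    return "node shut down"


if __name__ == "__main__":
    for minutes in (8, 10, 12):
        print(minutes, "min ->", bootstrap_outcome(minutes * 60))
```

Passing a larger `timeout_s` mirrors the custom-build experiment: a node that needs 12 minutes is shut down under the default limit but bootstraps under a longer one.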
