dd-format images fail to deploy

Bug #1393953 reported by Daniel Manrique on 2014-11-18
This bug affects 1 person
Affects  Importance  Assigned to
MAAS     High        Blake Rouse
1.7      High        Blake Rouse

Bug Description

I'm using MAAS from the experimental PPA, version is 1.7.0~rc3+bzr3299-0ubuntu1~trusty1 (just upgraded this morning).

I'm trying to deploy a ddtgz format image, but the image is never sent to the node, which then appears as "failed deployment" in the MAAS UI.

The image is based on an OEM version of Ubuntu (though as you'll see below, I think this would happen with any other dd-format image). To create my image, I installed this in a VM, then shut the VM down, connected the qcow image and did "dd if=/dev/nbd0 of=myimage.dd". Finally, I tarred the dd file and uploaded to MAAS like this:

time maas roadmr boot-resources create name=my-image-dd title='roadmr dd image' architecture=amd64/generic content@=/home/roadmr/download/myimage.dd.tar.gz filetype=ddtgz

maas boot-resource read shows this:

roadmr@thinkpad:~$ maas roadmr boot-resource read 22
{
    "name": "my-image-dd",
    "title": "roadmr dd image",
    "sets": {
        "20141118": {
            "files": {
                "root-dd": {
                    "filename": "root-dd",
                    "filetype": "root-dd",
                    "sha256": "e2fc5a1259df5b658d714aee29cfe7580a31a3ed9be69cd8a69f3cdd565f8baa",
                    "complete": true,
                    "size": 1890612863
                }
            },
            "label": "uploaded",
            "version": "20141118",
            "complete": true,
            "size": 1890612863
        }
    },
    "architecture": "amd64/generic",
    "subarches": "generic",
    "type": "Uploaded",
    "id": 22,
    "resource_uri": "/MAAS/api/1.0/boot-resources/22/"
}
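
To confirm which file types an uploaded resource actually carries, the JSON above can be inspected programmatically. This is a minimal sketch, assuming the output of `boot-resource read` has been captured as a string; the helper name `latest_filetypes` is hypothetical and not part of MAAS.

```python
import json


def latest_filetypes(resource_json):
    """Return the sorted filetypes in the newest set of a boot resource.

    Assumes the JSON layout shown in the output above. Set names are
    date-like version strings, so the lexicographic maximum is the newest.
    """
    resource = json.loads(resource_json)
    sets = resource["sets"]
    newest = max(sets)
    return sorted(f["filetype"] for f in sets[newest]["files"].values())


# Trimmed copy of the output above, kept only for illustration.
example = """
{"name": "my-image-dd",
 "sets": {"20141118": {"files": {"root-dd": {"filename": "root-dd",
                                             "filetype": "root-dd"}}}}}
"""
print(latest_filetypes(example))
```

For this resource the only filetype listed is root-dd, so a request for root-tgz has nothing to match.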

Then in the UI, I "acquire and start" the node, which boots the installer, reports some error, then reboots and gets stuck at the "boot:" prompt. The UI shows "Failed deployment", and the install log contains this:

--2014-11-18 21:47:33-- http://10.10.10.1/MAAS/static/images/custom/amd64/generic/my-image-dd/uploaded/root-tgz
Connecting to 10.10.10.1:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2014-11-18 21:47:33 ERROR 404: Not Found.

gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
Unexpected error while running command.
Command: ['sh', '-cf', 'wget "$1" --progress=dot:mega -O - |tar -C "$2" --xattrs --xattrs-include=* -Sxpzf - --numeric-owner', '--', 'http://10.10.10.1/MAAS/static/images/custom/amd64/generic/my-image-dd/uploaded/root-tgz', '/tmp/tmp1LPsHt/target']
Exit code: 2
Reason: -
Stdout: ''
Stderr: ''
Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'extract']
Exit code: 3
Reason: -
Stdout: '--2014-11-18 21:47:33-- http://10.10.10.1/MAAS/static/images/custom/amd64/generic/my-image-dd/uploaded/root-tgz\nConnecting to 10.10.10.1:80... connected.\nHTTP request sent, awaiting response... 404 Not Found\n2014-11-18 21:47:33 ERROR 404: Not Found.\n\n\ngzip: stdin: unexpected end of file\ntar: Child returned status 1\ntar: Error is not recoverable: exiting now\nUnexpected error while running command.\nCommand: [\'sh\', \'-cf\', \'wget "$1" --progress=dot:mega -O - |tar -C "$2" --xattrs --xattrs-include=* -Sxpzf - --numeric-owner\', \'--\', \'http://10.10.10.1/MAAS/static/images/custom/amd64/generic/my-image-dd/uploaded/root-tgz\', \'/tmp/tmp1LPsHt/target\']\nExit code: 2\nReason: -\nStdout: \'\'\nStderr: \'\'\n'
Stderr: ''

So curtin (under maas's instructions, I presume) is trying to fetch root-tgz instead of root-dd (which I confirmed exists in /var/lib/maas/boot-resources/current/custom/amd64/generic/my-image-dd/uploaded).


Blake Rouse (blake-rouse) wrote :

So the issue is that the OS driver determines the filetype, and the custom OS driver is not checking the file to determine the correct path.

This only occurs for dd-format custom images.

Changed in maas:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Blake Rouse (blake-rouse)
Christian Reis (kiko) on 2014-11-19
Changed in maas:
milestone: none → 1.7.1
Daniel Manrique (roadmr) wrote :

Cool, so adding this method to custom.py to override the one from OperatingSystem (which blindly sends root-tgz as the filename) made it so that maas actually sent the image to the system:

    def get_xinstall_parameters(self, arch, subarch, release, label):
        """Return the xinstall image name and type for custom dd images."""
        return "root-dd", "dd-tgz"

This just as blindly sends root-dd and the dd-tgz format. I'm not sure this is correct, and I didn't run the test suite to check that nothing else breaks, but it seems to be a start.

After this, the system still wasn't left in a bootable state :/ but that seems to be a separate problem. I'll ask in the IRC channel and either amend this bug or file a new one (or fix things, if I'm doing them wrong).

Daniel Manrique (roadmr) wrote :

OK, I finally figured this out. I added the get_xinstall_parameters method to custom.py; with it, MAAS correctly sends the dd image and curtin installs it on the node.

Then curtin tries to run hooks (curtin curthooks). It looks for a /curtin directory in the installed system; since my custom image lacks one, curtin falls back to what I assume are some Ubuntu-default hooks, which look for (and fail to find, as my image lacks it) an /etc directory.

I solved this by adding a /curtin directory to my image containing hooks: essentially an empty curtin-hooks file and an empty finalize file (I want curtin to mostly leave the image alone; it boots fine once the system restarts).
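
The /curtin directory described above could be populated with a short script run against the mounted image. This is only a sketch under stated assumptions: the comment describes the files as essentially empty, while this version writes a no-op shell script and marks it executable, on the assumption that the hooks are executed directly. The function name and the `image_root` parameter are hypothetical.

```python
import os

# Assumption: a no-op script stands in for the "essentially empty" files.
NOOP_SCRIPT = "#!/bin/sh\nexit 0\n"


def add_noop_curtin_hooks(image_root):
    """Create curtin/curtin-hooks and curtin/finalize under image_root.

    image_root is the directory where the image's root filesystem is
    mounted. Returns the list of paths that were written.
    """
    curtin_dir = os.path.join(image_root, "curtin")
    os.makedirs(curtin_dir, exist_ok=True)
    created = []
    for name in ("curtin-hooks", "finalize"):
        path = os.path.join(curtin_dir, name)
        with open(path, "w") as handle:
            handle.write(NOOP_SCRIPT)
        # Assumption: the hooks need the executable bit to be run.
        os.chmod(path, 0o755)
        created.append(path)
    return created
```

Calling `add_noop_curtin_hooks("/mnt/image")` (a hypothetical mount point) before creating the dd image would leave both files in place.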

So other than an image with a proper /curtin directory, all I needed was get_xinstall_parameters. A definitive solution would properly recognize the type of image, since this hack just defaults to dd-tgz.
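
One way to recognize the image type would be to check which file is actually present for the resource. The sketch below is not the fix that was committed for this bug. It assumes the driver can obtain the resource's directory on disk (the thread does not show how), and that "tgz" is the type paired with root-tgz. The function name and `resource_dir` parameter are hypothetical.

```python
import os

# (filename, xinstall type) pairs, checked in order.
# "root-dd"/"dd-tgz" come from the workaround above, and "root-tgz" is the
# filename seen in the install log; pairing it with "tgz" is an assumption.
KNOWN_IMAGES = (
    ("root-tgz", "tgz"),
    ("root-dd", "dd-tgz"),
)


def detect_xinstall_parameters(resource_dir):
    """Return (filename, type) for the first known image file found.

    Falls back to the root-tgz default when neither file exists, which
    mirrors the behaviour the base class is described as having.
    """
    for filename, image_type in KNOWN_IMAGES:
        if os.path.exists(os.path.join(resource_dir, filename)):
            return filename, image_type
    return "root-tgz", "tgz"
```

A get_xinstall_parameters override could then return the result of this check instead of a hard-coded pair, so that tgz uploads keep working alongside dd ones.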

Changed in maas:
status: Triaged → In Progress
milestone: 1.7.1 → next
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released

Hello Daniel, or anyone else affected,

Accepted maas into utopic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/maas/1.7.5+bzr3369-0ubuntu1~14.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-needed
Andres Rodriguez (andreserl) wrote :

This issue has been verified to work both on upgrade and fresh install, and has been QA'd. Marking verification-done.

tags: added: verification-done
removed: verification-needed
Changed in maas:
milestone: next → none