Unable to deploy Precise on MaaS 2.3.0 (6434)

Bug #1739761 reported by Po-Hsu Lin on 2017-12-22
This bug affects 1 person
Affects        Importance   Assigned to
MAAS           Critical     Andres Rodriguez
  2.3          Critical     Unassigned
curtin         Medium       Unassigned
maas-images    Medium       Scott Moser

Bug Description

When I try to deploy the Precise image on one of our nodes, which used to work with the Precise image (for kernel SRU proposed testing), the deployment fails and drops to an initramfs shell.

From the console, I can see it get past the boot-initrd stage, but it gets stuck after loading the usb-hid driver (please refer to the screenshot).

The image source we're using is from https://images.maas.io/ephemeral-v3/daily/
And the status of the Precise image is "synced".

Related bugs:
 * bug 1746345: need squashfs image of precise for maas v3 stream
 * bug 1746348: curtin support installation of fsimage

Po-Hsu Lin (cypressyew) wrote :
Andres Rodriguez (andreserl) wrote :

Hi There.

Precise is End of Life and it is no longer supported.

Changed in maas:
status: New → Won't Fix
Po-Hsu Lin (cypressyew) wrote :

Hello Andres,

Yes, I know it's EOL, but we need to run Extended Security Maintenance on it for the 3.2 and 3.13 kernels.

Blake Rouse (blake-rouse) wrote :

You cannot use MAAS 2.3 to deploy Precise. You need to use MAAS 2.2 or lower.

Changed in maas:
milestone: none → 2.4.x
Changed in maas:
status: Won't Fix → New
importance: Undecided → Critical
Changed in maas:
assignee: nobody → Andres Rodriguez (andreserl)
status: New → In Progress
Changed in maas-images:
assignee: nobody → Scott Moser (smoser)
Scott Moser (smoser) wrote :

I opened bug 1746348 to cover curtin support for mounting and copying a filesystem image as an installation source.

Scott Moser (smoser) on 2018-02-01
Changed in maas-images:
status: New → Confirmed
importance: Undecided → Medium
Scott Moser (smoser) wrote :

MAAS 2.3 cannot deploy precise because of changes in how the
ephemeral environment (used for installation) is booted.
MAAS now provides kernel and initramfs over pxe and the root filesystem
over http ('root=http://....' via the cloud-initramfs-rooturl package).
Previously maas used iscsi root for providing the root filesystem.

When installing an ubuntu release, MAAS boots the system into the same
release's ephemeral environment for the installation.

Support for 'rooturl' and some other changes were never backported to
precise. So MAAS is not able to boot an ephemeral environment for precise.
Typica

The solution that is agreed upon is to make MAAS utilize the ephemeral
environment of 16.04 or 14.04 when installing precise.

The changes needed to accomplish this are:
a.) the maas v3 stream [1] needs to have some deployable filesystem.
A root tarball would have required no changes to curtin.
We chose to provide a squashfs image instead, for consistency, so that
all Ubuntu releases in the v3 stream would have a squashfs filesystem.

    [1] http://images.maas.io/ephemeral-v3/daily/precise/

b.) MAAS will have to change to boot 16.04 or 14.04 to install precise.
c.) MAAS will have to provide the installation source to curtin as
http://.../path/to/precise.squashfs rather than 'cp://media/root-ro'
as it uses in other releases.

d.) Curtin will have to add support for installing using a filesystem
image as an installation source, as it did previously with a root tarball.
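
Steps b and c above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not MAAS code: the function names, the `NEEDS_NEWER_EPHEMERAL` set, the `xenial` fallback and the `image_base_url` parameter are all hypothetical. Only the 'cp://media/root-ro' source string and the idea of pointing curtin at a squashfs file over http come from the list above.

```python
# Hypothetical sketch of steps (b) and (c); names are illustrative only.

# Releases assumed unable to boot their own ephemeral environment.
NEEDS_NEWER_EPHEMERAL = {"precise"}


def ephemeral_release(target_release, fallback="xenial"):
    """Pick the release whose ephemeral environment runs the install."""
    if target_release in NEEDS_NEWER_EPHEMERAL:
        return fallback
    # Normal case: boot and install the same release.
    return target_release


def install_source(target_release, image_base_url):
    """Return the installation source string handed to the installer."""
    if target_release in NEEDS_NEWER_EPHEMERAL:
        # Filesystem image fetched over http (step c).
        base = image_base_url.rstrip("/")
        return "%s/%s.squashfs" % (base, target_release)
    # Copy the root of the running ephemeral environment.
    return "cp://media/root-ro"
```

Keeping the two decisions in separate functions mirrors the split between steps b and c: which environment to boot, and what to install from.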

There is one known caveat to this approach that I hit during development
of the curtin code [2]. I hit an issue where mkfs.xfs from the
installation environment chose some default values that were not supported
by the older kernel that would be booted into after installation.
This is a known class of problem. We are currently shielded from such
mismatch issues because we boot and install the same release,
so the default mkfs options match.

The problem I found specifically was with xenial's mkfs and trusty's kernel.
I just used those for testing as there is no precise image yet.
Similar problems could occur in any filesystem.

  [2] https://code.launchpad.net/~smoser/curtin/+git/curtin/+merge/336872

Changed in curtin:
status: New → Confirmed
importance: Undecided → Medium
status: Confirmed → In Progress
Changed in maas-images:
status: Confirmed → In Progress
Scott Moser (smoser) wrote :

bug 1746345 will cover the creation of 'a' above. I've put up an MP for maas-images such that running it will create the squashfs image, but that will be used only for testing.

description: updated
Scott Moser (smoser) on 2018-02-06
Changed in maas-images:
status: In Progress → Fix Released
status: Fix Released → Fix Committed
Scott Moser (smoser) wrote :

The maas-images changes are now in lp:maas-images trunk, although, as mentioned above, they are not needed. I plan to mark maas-images as fix-released when there is a squashfs image in http://images.maas.io/ephemeral-v3/daily/precise/ .

The curtin changes are in the linked merge proposal and should arrive in trunk (and the daily curtin archive https://launchpad.net/~curtin-dev/+archive/ubuntu/daily) today.

Scott Moser (smoser) wrote :

This is fixed in curtin in trunk at commit
 063b9fe57e4ace5ed938598be6d900840d9e9bcc
The daily archive has a sufficient version now.
 https://code.launchpad.net/~curtin-dev/+archive/ubuntu/daily

Changed in curtin:
status: In Progress → Fix Committed
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
milestone: 2.4.x → 2.4.0alpha2
Changed in maas:
status: Fix Committed → Fix Released
Po-Hsu Lin (cypressyew) wrote :

Tested today; the deployment is still not working.

MaaS 2.3.2 rev6485
curtin 18.1-1-g45564eef-0ubuntu1~16.04.1

Ryan Harper (raharper) wrote :

Can you attach the curtin install logs?

On Thu, Apr 26, 2018 at 5:31 AM, Po-Hsu Lin <email address hidden> wrote:
> Tested today, the deployment still not working.
>
> MaaS 2.3.2 rev6485
> curtin 18.1-1-g45564eef-0ubuntu1~16.04.1

Po-Hsu Lin (cypressyew) wrote :

Hello Ryan, here is the log from the MaaS server UI:
https://pastebin.ubuntu.com/p/qRB39w4mfV/

The attachment is the installation log extracted from /var/log/curtin/curtin-error-logs.tar

Looks like this is identical to the error log from MaaS UI
HTH

Ryan Harper (raharper) wrote :

Why doesn't the ephemeral environment have the zfs module?

Traceback (most recent call last):
  File "/curtin/curtin/commands/main.py", line 201, in main
    ret = args.func(args)
  File "/curtin/curtin/commands/block_meta.py", line 52, in block_meta
    meta_custom(args)
  File "/curtin/curtin/commands/block_meta.py", line 1368, in meta_custom
    clear_holders.start_clear_holders_deps()
  File "/curtin/curtin/block/clear_holders.py", line 579, in start_clear_holders_deps
    util.load_kernel_module('zfs')
  File "/curtin/curtin/util.py", line 327, in load_kernel_module
    subp(['modprobe', '--use-blacklist', module])
  File "/curtin/curtin/util.py", line 263, in subp
    return _subp(*args, **kwargs)
  File "/curtin/curtin/util.py", line 131, in _subp
    cmd=args)
curtin.util.ProcessExecutionError: Unexpected error while running command.
Command: ['modprobe', '--use-blacklist', 'zfs']

This is a xenial environment which should have the zfs module included in it.

maas-images bug?

Can you confirm which MAAS images you're using?

On Thu, Apr 26, 2018 at 11:57 AM, Po-Hsu Lin <email address hidden> wrote:
> Hello Ryan, here is the log from the MaaS server UI:
> https://pastebin.ubuntu.com/p/qRB39w4mfV/
>
> The attachment is the installation log extracted from the
> curtin-error-logs.tar retrieved from the /var/log/curtin/curtin-error-logs.tar
>
> Looks like this is identical to the error log from MaaS UI
> HTH
>
> ** Attachment added: "install.log"
> https://bugs.launchpad.net/maas/+bug/1739761/+attachment/5127771/+files/install.log

Scott Moser (smoser) wrote :

We really probably should make it so that the modprobe is not fatal.
Obviously if you're trying to install zfs it would be fatal, but otherwise
we should go on.
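
A minimal sketch of what a non-fatal modprobe could look like, in Python. This is not curtin's code: the `required` and `run` parameters are assumptions made for illustration, and only the `modprobe --use-blacklist` command line is taken from the traceback in this bug.

```python
import subprocess


def load_kernel_module(module, required=False, run=subprocess.run):
    """Try to load a kernel module; fail hard only when it is required.

    `required` and `run` are hypothetical parameters for this sketch.
    `run` is injectable so the function can be exercised without root.
    """
    cmd = ["modprobe", "--use-blacklist", module]
    try:
        run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        return True
    except (subprocess.CalledProcessError, FileNotFoundError) as exc:
        if required:
            # e.g. the storage config actually asks for zfs
            raise RuntimeError("cannot load module %s" % module) from exc
        # Otherwise note the failure and carry on with the install.
        return False
```

A caller would pass `required=True` only when the requested storage configuration uses the module, and treat a `False` return as "skip handling for this module".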

I just went to verify whether we have zfs in the initramfs. Here's what I found:
$ sstream-mirror --max=1 --progress \
    http://images.maas.io/ephemeral-v3/daily/streams/v1/index.json \
    out.d ftype=boot-initrd arch=amd64 \
    "release~(precise|trusty|xenial|artful|bionic)"
$ for f in $(find out.d/*/amd64 -type f | sort); do
   lsinitramfs "$f" | grep -q "zfs.ko$" && found="YES" || found="NO "
   echo "$found ${f#out.d/}"; done
NO precise/amd64/20170424/hwe-p/generic/boot-initrd
NO precise/amd64/20170424/hwe-q/generic/boot-initrd
NO precise/amd64/20170424/hwe-r/generic/boot-initrd
NO precise/amd64/20170424/hwe-s/generic/boot-initrd
NO precise/amd64/20170424/hwe-t/generic/boot-initrd
NO trusty/amd64/20180419/hwe-t/generic/boot-initrd
NO trusty/amd64/20180419/hwe-u/generic/boot-initrd
NO trusty/amd64/20180419/hwe-v/generic/boot-initrd
NO trusty/amd64/20180419/hwe-w/generic/boot-initrd
NO trusty/amd64/20180419/hwe-x/generic/boot-initrd
NO trusty/amd64/20180419/hwe-x/lowlatency/boot-initrd
YES artful/amd64/20180425/ga-17.10/generic/boot-initrd
YES artful/amd64/20180425/ga-17.10/lowlatency/boot-initrd
YES bionic/amd64/20180426.2/ga-18.04/generic/boot-initrd
YES bionic/amd64/20180426.2/ga-18.04/lowlatency/boot-initrd
YES xenial/amd64/20180424/ga-16.04/generic/boot-initrd
YES xenial/amd64/20180424/ga-16.04/lowlatency/boot-initrd
YES xenial/amd64/20180424/hwe-16.04-edge/generic/boot-initrd
YES xenial/amd64/20180424/hwe-16.04-edge/lowlatency/boot-initrd
YES xenial/amd64/20180424/hwe-16.04/generic/boot-initrd
YES xenial/amd64/20180424/hwe-16.04/lowlatency/boot-initrd

The summary is that all current initramfs images in the v3 stream have zfs except precise and trusty.

modprobe could fail for another reason, but it looks like at least 'zfs.ko' is present.
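
For anyone who wants to reduce the YES/NO listing above to a per-release summary, here is a small Python sketch. It assumes input lines shaped exactly like the output shown (a YES or NO flag followed by a path whose first component is the release); the function name is made up for this example.

```python
from collections import defaultdict


def zfs_by_release(lines):
    """Map release -> True if every listed initrd for it contains zfs.ko."""
    seen = defaultdict(set)
    for line in lines:
        parts = line.split()
        # Skip anything that is not "<YES|NO> <path>".
        if len(parts) != 2 or parts[0] not in ("YES", "NO"):
            continue
        found, path = parts
        release = path.split("/", 1)[0]
        seen[release].add(found == "YES")
    return {release: flags == {True} for release, flags in seen.items()}
```

A release is reported as True only when all of its initrds matched, so a single NO for any kernel flavour marks the whole release as lacking zfs.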

Po-Hsu Lin (cypressyew) wrote :

We're using the MaaS image from https://images.maas.io/ephemeral-v3/daily/

I want to check the image version, but I can't find the exact location of the images on the server.
There are some import warnings in /var/log/maas/maas.log; I'm not sure if they are related:
https://pastebin.ubuntu.com/p/QGp49xyZgh/

Ryan Harper (raharper) wrote :

On Thu, Apr 26, 2018 at 9:25 PM, Scott Moser <email address hidden> wrote:
> We really probably should make it so that the modprobe is not fatal.
> Obviously if you're trying to install zfs it would be fatal, but otherwise
> we should go on.

When curtin starts, it wants to find existing filesystems and devices, and then
be able to wipe them so as to not interfere later.

If we don't have the zfs module, we won't be able to clear it. That may
be OK but could impact data not meant to be wiped.

Further, if the zfs module loads after rebooting into the target OS,
it could impact the storage configuration.

Replace bcache or md with zfs and you know we've seen situations where
these devices take over due to hidden metadata found later.

> The summary is all current initramfs for v3 images have zfs except
> precise and trusty.
>
> modprobe could fail for another reason. but it looks like at least
> 'zfs.ko' is present.

I suspect there's no bug here in curtin or maas images; possibly
out-of-date images on the maas node?

Po-Hsu Lin (cypressyew) wrote :

How do I check the image version on my MaaS server?
I tried to navigate through the UI but didn't see it. Or will I need to check it directly on the server?
Thanks

Po-Hsu Lin (cypressyew) wrote :

Hi Scott,
thanks for the handy script, here is my image version:
ubuntu/precise amd64/hwe-p 20170424.1 daily True
ubuntu/precise amd64/hwe-q 20170424 daily True
ubuntu/precise amd64/hwe-r 20170424 daily True
ubuntu/precise amd64/hwe-s 20170424 daily True
ubuntu/precise amd64/hwe-t 20170424 daily True
ubuntu/precise i386/hwe-p 20170424 daily True
ubuntu/precise i386/hwe-q 20170424 daily True
ubuntu/precise i386/hwe-r 20170424 daily True
ubuntu/precise i386/hwe-s 20170424 daily True
ubuntu/precise i386/hwe-t 20170424 daily True

https://pastebin.ubuntu.com/p/BRSZh5yKZq/

Andres Rodriguez (andreserl) wrote :

Upon some investigation, it seems that the precise streams are not being
correctly generated, which may be causing MAAS to try to install a root tgz
instead of a precise squashfs image.

On Mon, Apr 30, 2018 at 11:53 AM, Po-Hsu Lin <email address hidden>
wrote:

> Hi Scott,
> thanks for the handy script, here is my image version:
> ubuntu/precise amd64/hwe-p 20170424.1 daily True
> ubuntu/precise amd64/hwe-q 20170424 daily True
> ubuntu/precise amd64/hwe-r 20170424 daily True
> ubuntu/precise amd64/hwe-s 20170424 daily True
> ubuntu/precise amd64/hwe-t 20170424 daily True
> ubuntu/precise i386/hwe-p 20170424 daily True
> ubuntu/precise i386/hwe-q 20170424 daily True
> ubuntu/precise i386/hwe-r 20170424 daily True
> ubuntu/precise i386/hwe-s 20170424 daily True
> ubuntu/precise i386/hwe-t 20170424 daily True
>
> https://pastebin.ubuntu.com/p/BRSZh5yKZq/

--
Andres Rodriguez (RoAkSoAx)
Ubuntu Server Developer
MSc. Telecom & Networking
Systems Engineer

Po-Hsu Lin (cypressyew) wrote :

Hello,
is there any action that I need to take on my side?
Thanks

Po-Hsu Lin (cypressyew) wrote :

I can confirm that the Precise deployment is now working with our MaaS 2.3.3-6498, curtin (18.1-17-gae48e86f-0ubuntu1~16.04.1)

Please feel free to close this bug (and the related ones).
Thank you!

Ryan Harper (raharper) on 2018-06-01
Changed in curtin:
status: Fix Committed → Fix Released