Most servers fail to deploy UC via MAAS

Bug #1900705 reported by Rod Smith
This bug affects 1 person
Affects | Status        | Importance | Assigned to | Milestone
snapd   | Fix Released  | High       | Ian Johnson |

Bug Description

I've tried following instructions here for setting up MAAS to deploy Ubuntu Core:

https://www.lucaswilliams.net/index.php/2019/10/18/deploying-ubuntu-core-18-with-maas-2-5/

I've followed these instructions for both Ubuntu Core 18 and Ubuntu Core 20.

With Ubuntu Core 18, I was able to successfully deploy two machines in the 18 Tremont St. certification lab:

* jehan (Quanta D52B-1U with a plug-in NVMe device)
* ashbrook (an Orange Box v3 prototype using ASRock MT-C224
  motherboards)

Several other servers did not deploy:

* cloaker (Lenovo SR665)
* aitken (Lenovo SR650)
* feebas (Cisco UCS C220)
* jolteon (Lenovo x3550 M5)
* mouchez (Lenovo SR670)

The MAAS "Logs" tab showed what appeared to be a successful deployment, but when the nodes rebooted, their consoles showed the following error:

findfs: unable to resolve 'LABEL=writable'

On the successful deployments to jehan and ashbrook, the root (/) filesystem has a label of "writable". The failed systems therefore appear unable to see that filesystem at boot, which suggests a missing disk device driver or some other disk-related problem with the initrd written by the MAAS deployment ephemeral. I believe that all of the failed systems have hardware RAID controllers. So does jehan; however, jehan also has a plug-in NVMe device, and the successful deployment was to that device. The RAID disks in jehan were accessible (albeit unconfigured) once the system booted.
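
For anyone trying to reproduce this, the label check described above can be sketched in Python. This is only a rough illustration, not something from the original report: it assumes a full Linux environment with util-linux's lsblk and Python available (for example a successfully deployed node or a rescue system, not the failing initrd itself), and the helper names find_labelled and devices_with_label are hypothetical.

```python
import json
import subprocess


def find_labelled(devices, label):
    """Return names of block devices, at any depth, whose filesystem label matches."""
    matches = []
    for dev in devices:
        if dev.get("label") == label:
            matches.append(dev["name"])
        # Partitions appear under "children" in lsblk's JSON output.
        matches.extend(find_labelled(dev.get("children") or [], label))
    return matches


def devices_with_label(label="writable"):
    """Ask lsblk for all block devices and return those carrying the given label."""
    out = subprocess.run(
        ["lsblk", "--json", "-o", "NAME,LABEL,FSTYPE"],
        check=True, capture_output=True, text=True,
    ).stdout
    return find_labelled(json.loads(out)["blockdevices"], label)


if __name__ == "__main__":
    found = devices_with_label()
    print(found or "no device labelled 'writable' is visible")
```

If the label is visible from a full environment but findfs still fails at boot, that would be consistent with the initrd lacking a driver for the controller in front of that disk.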

With Ubuntu Core 20, I was unable to deploy to any system until I installed the "dangerous" image. That image deployed to ashbrook, but not to any other node; even jehan failed. Jehan seemed to deploy successfully, according to its logs, but failed to come back up. I didn't see the "findfs" warning noted earlier, but it's possible I missed it.

Jeff Lane  (bladernr) wrote :

Confirmed. I also attempted this in our other lab on different servers and saw a 100% failure rate there.

tags: added: hwcert-server
Changed in snapd:
status: New → Confirmed
Ian Johnson (anonymouse67) wrote :

We are actively working on enabling Ubuntu Core 20 deployment via MAAS by allowing the cloud-init configuration that MAAS uses on grade: signed images. We will not support deploying generic grade: secured images via MAAS, but anyone who wants to do that can build their own image with a gadget snap that declares all the relevant MAAS settings.

See https://github.com/snapcore/snapd/pull/10573 and https://github.com/snapcore/snapd/pull/10674

Changed in snapd:
importance: Undecided → Medium
assignee: nobody → Ian Johnson (anonymouse67)
importance: Medium → High
Ian Johnson (anonymouse67) wrote :

This work was completed sometime in 2021 Q2 or Q3.

Changed in snapd:
status: Confirmed → Fix Released
milestone: none → 2.52