MAAS 3.4: Deployment fails on LXD VMs

Bug #2011274 reported by Bill Wear
This bug affects 2 people
Affects: MAAS
Status: Fix Released
Importance: High
Assigned to: Christian Grabowski

Bug Description

Running MAAS 3.4.0~alpha1, installed from snap, in production configuration, with Postgres 14.

Cannot deploy LXD VMs composed from LXD VM host:

1. Commissioning completes, but takes a very long time. rackd.log is idle after the squashfs is downloaded, and the lxd console for the machine shows no further activity after the machine initially boots from the NBP. The machine eventually (suddenly?) reaches the "Ready" state and can be moved to "Allocated".

2. Deployment can be started, but does not complete. Again, rackd.log is idle after the squashfs is downloaded, and the lxd console for the machine shows no further activity after the machine initially boots from the NBP. The UI message hangs at "Loading ephemeral". MAAS eventually stops the machine, long before the deployment times out.

There are no relevant log messages during this long timeout, and there is absolutely no lxd console output for the machine during Commissioning or Deployment.

Contrast this with 3.3, in which normal commissioning and deployment take place, including several rackd.log messages along the way and the expected flood of curtin and cloud-init messages in the lxd console throughout the entire process.


Bill Wear (billwear)
Changed in maas:
status: New → Triaged
importance: Undecided → High
milestone: none → 3.4.0
Christian Grabowski (cgrabowski) wrote:

It seems the issue is that when the boot config is fetched, the MAC address being passed has its `:` replaced with `-`. This has worked in the past; however, the query for finding the machine just uses the reformatted MAC, while the MAC is stored in the database with `:`. This causes MAAS to think the machine does not exist, so it sets the kernel params to those for commissioning. The correct configuration is then never loaded, and deployment stalls.
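The mismatch described above can be illustrated with a minimal Python sketch. This is not the actual MAAS fix: the names `normalize_mac` and `find_machine` are hypothetical, and a plain dict stands in for the database table. It only shows why a lookup with a `-`-separated MAC misses a record stored with `:`, and how normalizing the separator before the lookup avoids that.

```python
import re

# Canonical form assumed here: lower-case hex octets separated by ":".
_MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def normalize_mac(mac: str) -> str:
    """Return the MAC in lower-case, colon-separated form (hypothetical helper)."""
    candidate = mac.strip().lower().replace("-", ":")
    if not _MAC_RE.match(candidate):
        raise ValueError(f"not a MAC address: {mac!r}")
    return candidate


def find_machine(machines_by_mac: dict, mac: str):
    """Look up a machine by MAC, accepting either separator (hypothetical helper)."""
    return machines_by_mac.get(normalize_mac(mac))


# A dict stands in for the database, which stores MACs with ":".
machines = {"00:16:3e:aa:bb:cc": "lxd-vm-1"}

# Looking up the reformatted MAC directly finds nothing ...
direct_hit = machines.get("00-16-3e-aa-bb-cc")

# ... while normalizing first finds the machine.
normalized_hit = find_machine(machines, "00-16-3e-AA-BB-CC")
```

Normalizing at the lookup boundary keeps the stored format unchanged, so callers that already pass `:`-separated MACs are unaffected.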

Changed in maas:
assignee: nobody → Christian Grabowski (cgrabowski)
status: Triaged → In Progress
Changed in maas:
status: In Progress → Fix Committed
Alberto Donato (ack)
Changed in maas:
milestone: 3.4.0 → 3.4.0-beta1
Alberto Donato (ack)
Changed in maas:
status: Fix Committed → Fix Released