Provisioning LXD vmhost fails

Bug #1988759 reported by Dan Ardelean
This bug affects 5 people
Affects        Status         Importance   Assigned to      Milestone
MAAS           Fix Released   High         Alberto Donato
3.1            Fix Released   High         Alberto Donato
3.2            Fix Released   High         Alberto Donato
lxd (Ubuntu)   Confirmed      Undecided    Unassigned

Bug Description

Hi,

I am trying to deploy a node as an LXD vmhost, with
"maas myprofile machine deploy $LINUX_MACHINE_ID register_vmhost=True"

The node eventually enters deployment failed state after the deployment timeout. The node actually deploys and works, but in MAAS it will be stuck in the "Rebooting" state, and eventually enter failed state.

I see this in the region controller logs: https://paste.ubuntu.com/p/Ffc2ny5nJZ/

And this is the rack controller:
https://paste.ubuntu.com/p/twNgrh7zKS/

If I check the storage pools on the deployed (failed) LXD host, I see this:
https://paste.ubuntu.com/p/J2zNsKcj9R/
https://paste.ubuntu.com/p/SDSSGvqDyC/

MAAS version is: 3.1.0-10901-g.f1f8f1505 19835 3.1/stable
On the deployed node, LXD version is: 5.5-37534be 23537 latest/stable

description: updated
Björn Tillenius (bjornt) wrote :

Could you please paste the output of "lxc storage show default"?

Changed in maas:
status: New → Incomplete
Dan Ardelean (danardelean) wrote :

Description updated with output for "lxc storage show default".

description: updated
Chris Valean (cvalean)
Changed in maas:
status: Incomplete → Confirmed
Changed in maas:
importance: Undecided → High
status: Confirmed → Triaged
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in lxd (Ubuntu):
status: New → Confirmed
Jerzy Husakowski (jhusakowski) wrote :

Is the issue reproducible on MAAS 3.2 or later?

Changed in maas:
status: Triaged → Incomplete
Dan Ardelean (danardelean) wrote :

Yes, I tested with 3.2 at the beginning of September and had the same result.

Ben Meinhart (ben-ihelputech) wrote :

I have also tried this on 3.2.6 and had the same result. To work around it, I usually just deploy the machine and initialize LXD myself. I then add that LXD host to MAAS manually.
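
The workaround described above could look roughly like the following sketch. The profile name `admin`, the system ID `abc123`, the address `10.0.0.5`, and the `maas` LXD project are placeholders, not values from this report. The function only prints the commands so they can be reviewed before being run by hand. Depending on how LXD was initialized, the `vm-hosts create` call may need additional credentials (for example a trust password), which this sketch does not cover.

```shell
#!/bin/sh
# Sketch of the manual workaround: deploy without register_vmhost,
# initialize LXD by hand, then register the host with MAAS.
# All arguments are placeholders chosen for illustration.

workaround_cmds() {
    profile="$1"
    system_id="$2"
    lxd_addr="$3"
    # 1. Plain deployment, without asking MAAS to set up LXD.
    echo "maas $profile machine deploy $system_id"
    # 2. On the deployed host: initialize LXD interactively
    #    (it must listen on the network for MAAS to reach it).
    echo "sudo lxd init"
    # 3. On the MAAS side: add the host as an LXD VM host.
    echo "maas $profile vm-hosts create type=lxd power_address=$lxd_addr project=maas"
}

# Print the three steps for the placeholder values.
workaround_cmds admin abc123 10.0.0.5
```

Printing rather than executing keeps the sketch safe to run anywhere; each printed line is meant to be run on the machine named in its comment.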

Alberto Donato (ack)
Changed in maas:
status: Incomplete → Triaged
milestone: none → 3.3.0
assignee: nobody → Alberto Donato (ack)
status: Triaged → In Progress
Alberto Donato (ack) wrote :

The issue is related to this change in LXD: https://github.com/lxc/lxd/pull/10722

MAAS currently sets up LXD credentials for itself during deployment that are limited to the "maas" project.
This means that it cannot see the expected storage pool configurations anymore, which causes the error.
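
One way to see the effect described above is to compare how much storage pool configuration is visible with different credentials. The helper below is a minimal sketch, not MAAS code: it reads YAML in the style of `lxc storage show default` on stdin and counts the keys under the top-level `config:` mapping. The function name is hypothetical, and it is an assumption that a restricted client sees an empty or missing `config:` section; the exact shape of the restricted response is not stated in this report.

```shell
#!/bin/sh
# Count the keys under the top-level "config:" mapping of a storage
# pool description given as YAML on stdin (hypothetical helper).
count_pool_config_keys() {
    awk '
        /^config:/      { in_config = 1; next }  # start of the config mapping
        /^[^[:space:]]/ { in_config = 0 }        # any other top-level key ends it
        in_config && /^[[:space:]]+[^[:space:]#]+:/ { n++ }  # indented "key:" line
        END { print n + 0 }
    '
}

# Possible usage on the LXD host (requires lxc, so left as a comment):
#   lxc storage show default | count_pool_config_keys
```

A count of zero for a pool that an administrator sees as fully configured would be consistent with the restricted-credentials explanation, though it does not prove it.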

Changed in maas:
status: In Progress → Fix Committed
Ben Meinhart (ben-ihelputech) wrote :

I just tried installing the 3.2/edge version of the snap which should have contained the fix. However, it still seems to be exhibiting the same behavior on my setup.

$ snap info maas
name: maas
summary: Metal as a Service
publisher: Canonical✓
store-url: https://snapcraft.io/maas
contact: https://discourse.maas.io/
license: unset
description: |
  Total automation of your physical servers for amazing data center operational efficiency.
commands:
  - maas
  - maas.power
services:
  maas.supervisor: simple, enabled, active
snap-id: shY22YTZ3RhJJDOj0MfmShTNZTEb1Jiq
tracking: 3.2/edge
refresh-date: today at 13:06 CDT
channels:
  3.2/stable: 3.2.6-12016-g.19812b4da 2022-09-22 (23947) 147MB -
  3.2/candidate: ↑
  3.2/beta: 3.2.6-12016-g.19812b4da 2022-09-21 (23947) 147MB -
  3.2/edge: 3.2.6-12022-g.6840893f5 2022-10-14 (24449) 147MB -
  latest/stable: –
  latest/candidate: –
  latest/beta: –
  latest/edge: 3.3.0~alpha1-12784-g.ac3576cbf 2022-10-14 (24445) 129MB -
...
installed: 3.2.6-12022-g.6840893f5 (24449) 147MB -

Ben Meinhart (ben-ihelputech) wrote :

Here is a copy of the deployment log from MAAS.

Nobuto Murata (nobuto) wrote :

Subscribing ~field-high to get a stable release for the 3.2 series.

Changed in maas:
milestone: 3.3.0 → 3.3.0-beta1
Changed in maas:
status: Fix Committed → Fix Released
John Terrell (coolpeople) wrote :

I'm still seeing this in 3.3.0~rc1-13127-g.e6737625f.

Alvin Cura (alvinc) wrote :

I am also not having success deploying LXD via MAAS. I currently have to run `lxd init` on the unsuccessfully deployed host to get it to work. Thus far, that appears to succeed.

Alberto Donato (ack) wrote :

Please open a new bug about the issue, including regiond/rackd logs from the time when the deployment fails, as well as the MAAS version.
