LXD VM host refresh failure is ignored

Bug #1923687 reported by Lee Trager
This bug affects 1 person
Affects: MAAS
Status: Expired
Importance: Medium
Assigned to: Unassigned
Milestone: none

Bug Description

When a VM host is added to MAAS, one of the first operations performed is refreshing the VM host's information. This happens even when MAAS itself is deploying the VM host. In LP:1923685 I deployed an LXD VM host which fails to refresh due to GH:LXC/LXD:8477[1]. This sets 50-maas-01-commissioning to failed, and the failure is captured in regiond.log. However, the machine is still marked as deployed and an LXD VM host is still added. If I try to compose a VM on this machine, I get overcommitment errors because 50-maas-01-commissioning wasn't able to process.

1. If the initial refresh fails when adding a VM host, the VM host shouldn't be added.
2. If the initial refresh fails when adding a VM host during a deployment, the deployment should be marked as a failed deployment.
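
The two expectations above can be sketched roughly as follows. This is only an illustration of the requested behaviour, not MAAS code: `Registry`, `RefreshError`, `add_vm_host`, `failing_refresh` and the returned status strings are all hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


class RefreshError(Exception):
    """Hypothetical error raised when a VM host refresh fails."""


@dataclass
class Registry:
    """Hypothetical stand-in for the store of registered VM hosts."""
    vm_hosts: list = field(default_factory=list)


def add_vm_host(registry, name, refresh, deploying=False):
    """Register `name` only if its initial refresh succeeds.

    Returns a status string (hypothetical values, for illustration only).
    """
    try:
        refresh(name)
    except RefreshError:
        # Point 1: the host is never registered when the first refresh fails.
        # Point 2: during a deployment, report it as a failed deployment.
        if deploying:
            return "Failed deployment"
        raise
    registry.vm_hosts.append(name)
    return "Deployed" if deploying else "Registered"


def failing_refresh(name):
    """Stub that simulates the refresh failure described in this report."""
    raise RefreshError(f"refresh of {name} failed")
```

With this sketch, `add_vm_host(Registry(), "lxd-host", failing_refresh, deploying=True)` returns the failure status and leaves the registry empty, instead of leaving a deployed machine with an unusable VM host.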

[1] https://github.com/lxc/lxd/issues/8477

Changed in maas:
milestone: 3.0.0-beta3 → 3.0.0-beta4
Changed in maas:
milestone: 3.0.0-beta4 → 3.0.0-beta5
Alberto Donato (ack)
Changed in maas:
importance: Undecided → Medium
milestone: 3.0.0-beta5 → 3.0.0
assignee: nobody → Alberto Donato (ack)
Alberto Donato (ack) wrote :

Do you get a traceback in that case in regiond.log?

I had a similar case when testing deployment of a machine as VM host where it would fail to register the host, but in that case I did get the deployment marked as failed, as the PodForm should fail on save() if discover_and_refresh_pod fails.

I'm not exactly sure which code path you're hitting that masks the failure.

Changed in maas:
status: New → Incomplete
Alberto Donato (ack) wrote :

I've also tested now by unconditionally throwing an exception from the driver's refresh(). This causes the deployment to fail and the VM host not to be added.
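
For readers who want to try a similar check, the fault-injection idea described in this comment could look something like the sketch below. It is not the test that was actually run and does not use MAAS's real driver: `FakeDriver`, `register_host` and `check_refresh_failure_blocks_registration` are hypothetical names, and only `unittest.mock.patch.object` is a real library call.

```python
from unittest import mock


class FakeDriver:
    """Hypothetical driver; not MAAS's real LXD driver."""

    def refresh(self, host):
        return {"host": host, "cores": 8}


def register_host(driver, host, registered):
    """Hypothetical registration path: refresh first, then record the host."""
    info = driver.refresh(host)  # an exception here aborts registration
    registered.append(host)
    return info


def check_refresh_failure_blocks_registration():
    """Force refresh() to raise and report what happened."""
    registered = []
    driver = FakeDriver()
    # Replace refresh() so that it unconditionally raises.
    with mock.patch.object(
        driver, "refresh", side_effect=RuntimeError("forced failure")
    ):
        try:
            register_host(driver, "lxd-host", registered)
        except RuntimeError:
            failed = True
        else:
            failed = False
    return failed, registered
```

If the registration path propagates the error as expected, the function returns a failure flag of `True` and an empty list of registered hosts.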

Changed in maas:
milestone: 3.0.0 → none
milestone: none → 3.0.1
Alberto Donato (ack) wrote :

Lee, are you still able to reproduce the issue?

Changed in maas:
status: Incomplete → New
status: New → Incomplete
Alberto Donato (ack)
Changed in maas:
assignee: Alberto Donato (ack) → nobody
milestone: 3.0.1 → none
Launchpad Janitor (janitor) wrote :

[Expired for MAAS because there has been no activity for 60 days.]

Changed in maas:
status: Incomplete → Expired