[2.5, UI] Errors when starting KVM pod VMs aren't appropriately surfaced

Bug #1788910 reported by Mike Pontillo on 2018-08-24
Assigned to: Newell Jensen

Bug Description

I am testing with a privileged LXD container running MAAS and libvirt as a KVM pod.

In this configuration, I am not able to use a macvlan (macvtap) attachment because of running in a container (see also bug #1788952). When I use an `interfaces` constraint that forces a macvlan attachment (by manually requesting an interface on the pod host that is known not to be a bridge), starting the VM fails with an error such as the following (found in the syslog, or when starting the composed VM manually via virsh):

    342: error : virNetDevMacVLanTapOpen:477 : cannot open macvtap tap device /dev/tap11: No such file or directory

When the VM is allocated with skip_commissioning, MAAS never tries to start it; it remains in "Ready" state and the user would never know there is an issue until they go to deploy (it fails with "Failed Deployment").

If the VM is manually composed without skip_commissioning, MAAS attempts commissioning, and commissioning immediately fails with "Failed Commissioning", with no indication about what the error was or why it occurred.

Tags: ui ux

Changed in maas:
milestone: none → 2.5.x
Andres Rodriguez (andreserl) wrote :

I'm making this a UI issue as well.

summary: - [2.5] Errors when starting KVM pod VMs aren't appropriately surfaced
+ [2.5, UI] Errors when starting KVM pod VMs aren't appropriately surfaced
Changed in maas:
milestone: 2.5.x → 2.5.0beta1
importance: Medium → High
tags: added: ui
description: updated
Mike Pontillo (mpontillo) wrote :

Thinking about it, I'm not sure any actual UI changes will be required to fix this. Here's what I'm thinking:

 - If we compose a VM, regardless of whether or not we plan to commission it, we should try to start it up (even if we shut it down right away).
 - If a newly-composed VM raises an error on startup, it should be reported via the ComposeMachineForm.

For the UI, the harder problem is determining in advance if a macvlan attachment would work or not, so that we can prevent displaying options to the user that won't work. Right now it's not possible; MAAS doesn't have enough information to know in advance whether or not the network attachment will actually work.

Mike Pontillo (mpontillo) wrote :

Also, if errors occur during the compose process, we should tear down the newly-composed VM rather than leaving it in the hypervisor (which will further confuse MAAS if a refresh is performed).

We may want to have a field in the ComposeMachineForm such as delete_on_failure=True (by default) so that it's easier to debug badly-composed machines if necessary.
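The proposal in the last two comments (start every newly-composed VM, report a startup error, and tear the VM down unless asked not to) can be sketched roughly as follows. This is only an illustration, not MAAS code: `compose_and_verify`, `ComposeResult`, `StartError`, and the `define`/`start`/`teardown` callbacks are hypothetical names, and only `delete_on_failure` comes from the comment above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class StartError(Exception):
    """Raised by the (hypothetical) start callback when the VM fails to start."""


@dataclass
class ComposeResult:
    name: str
    ok: bool
    errors: List[str] = field(default_factory=list)
    # True if the VM definition still exists in the hypervisor afterwards.
    left_in_hypervisor: bool = True


def compose_and_verify(
    name: str,
    define: Callable[[str], None],
    start: Callable[[str], None],
    teardown: Callable[[str], None],
    delete_on_failure: bool = True,
) -> ComposeResult:
    """Define a VM, try to start it, and report any startup error."""
    define(name)
    try:
        # Always attempt a start, even if commissioning will be skipped.
        start(name)
    except StartError as exc:
        if delete_on_failure:
            # Avoid leaving a broken VM behind for a later pod refresh to find.
            teardown(name)
        return ComposeResult(
            name,
            ok=False,
            errors=[str(exc)],
            left_in_hypervisor=not delete_on_failure,
        )
    return ComposeResult(name, ok=True)
```

Passing `delete_on_failure=False` keeps the failed VM in the hypervisor so it can be inspected, which is the debugging case mentioned above.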

description: updated
Changed in maas:
status: Triaged → In Progress
assignee: nobody → Newell Jensen (newell-jensen)
Mike Pontillo (mpontillo) wrote :

After talking through this with Newell, it seems that we can do something like this:

    virsh start --paused <vm>

This will start the VM but immediately pause it, so that we can catch any errors that wouldn't be seen until the VM attempts to start. The machine won't get a chance to PXE boot, so we shouldn't see it change state in MAAS.

We can then tear down (destroy + undefine) the VM right away if an error occurs, or call "virsh destroy" (which will unpause and shut off the VM) if nothing went wrong, allowing MAAS to then power it back on at its convenience.
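As a rough sketch of that sequence, assuming `virsh` is on the PATH and the domain is already defined: the helper names `run_virsh` and `check_vm_starts` are made up for illustration and are not part of MAAS. The `run` parameter exists only so the logic can be exercised without a real hypervisor.

```python
import subprocess


def run_virsh(*args):
    """Run a virsh subcommand; return (exit_code, stderr_text)."""
    proc = subprocess.run(["virsh", *args], capture_output=True, text=True)
    return proc.returncode, proc.stderr.strip()


def check_vm_starts(domain, run=run_virsh):
    """Start `domain` paused to surface startup errors.

    Returns None if the VM started cleanly (it is shut off again afterwards),
    or the error text if it failed (the VM is then torn down).
    """
    code, err = run("start", "--paused", domain)
    if code != 0:
        # Tear down: destroy + undefine. The destroy may itself fail if the
        # domain never reached a running state; that result is ignored here.
        run("destroy", domain)
        run("undefine", domain)
        return err or "virsh start failed"
    # Started cleanly: shut it off again so it can be powered on later.
    run("destroy", domain)
    return None
```

The returned error text is what would need to be passed back to the user, for example through the compose form.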

Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released