tripleo-quickstart VMs can't survive a host reboot

Bug #1692976 reported by Steven Hardy
This bug affects 5 people
Affects   Status         Importance   Assigned to    Milestone
tripleo   Fix Released   High         Steven Hardy

Bug Description

When you run quickstart, it creates VMs that by default run under the stack user. After a host reboot these VMs aren't restarted, and the permissions seem to be broken, so it's not clear how to recover without re-running quickstart. IME that only really works if you tear down the nodes, so you invariably end up building a new environment, which is a hassle if you have a specific test configuration for development.

Steps to reproduce:

1. ./tripleo-quickstart/quickstart.sh --teardown all -R master-tripleo-ci -c tripleo-quickstart-extras/config/general_config/containers_minimal.yml tripleodev2.localdomain

2. reboot the host

3. Try to log in to the undercloud; it's gone

[shardy@tripleodev2 ~]$ ssh -F /home/shardy/.quickstart/ssh.config.ansible undercloud
Warning: Permanently added 'tripleodev2.localdomain,192.168.1.91' (ECDSA) to the list of known hosts.
channel 0: open failed: connect failed: No route to host
ssh_exchange_identification: Connection closed by remote host

[shardy@tripleodev2 ~]$ sudo su - stack
Last login: Tue May 23 17:06:31 BST 2017 on pts/2
[stack@tripleodev2 ~]$ virsh list --all
error: failed to connect to the hypervisor
error: Cannot create user runtime directory '/run/user/1001/libvirt': Permission denied
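
The virsh error above refers to a per-user runtime directory. A minimal diagnostic sketch for checking whether such a directory exists and is writable for the current user follows. The function name `check_runtime_dir` is hypothetical, and the fallback path `/run/user/<uid>` is an assumption taken from the path in the error message, not something quickstart itself provides.

```shell
#!/bin/sh
# Hypothetical helper: report whether a runtime directory is usable.
# With no argument it checks $XDG_RUNTIME_DIR, falling back to
# /run/user/<uid> (assumed from the path shown in the virsh error).
check_runtime_dir() {
    dir="${1:-${XDG_RUNTIME_DIR:-/run/user/$(id -u)}}"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 2
    fi
    echo "ok: $dir"
}

# Check the current user's runtime directory; never abort the script.
check_runtime_dir || true
```

Running this as the stack user right after `sudo su - stack` should show whether the directory named in the error is missing or merely unwritable.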

Tags: quickstart ux
Steven Hardy (shardy)
Changed in tripleo:
status: New → Triaged
milestone: none → pike-3
tags: added: quickstart
Changed in tripleo:
importance: Undecided → High
John Trowbridge (trown) wrote :

This will actually be solved by being able to run libvirt via system rather than session:

https://bugs.launchpad.net/tripleo/+bug/1692987

In fact it can only be solved in that way. There is no way for session VMs with their config in ephemeral storage to survive a reboot of the virthost.
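
To illustrate the system versus session distinction, here is a small sketch that lists the domains visible on each libvirt connection. `qemu:///session` and `qemu:///system` are the standard libvirt URIs for the per-user and system-wide QEMU drivers; the helper names `libvirt_uri` and `list_domains` are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: map a mode name to the matching libvirt URI.
libvirt_uri() {
    case "$1" in
        system)  echo "qemu:///system" ;;
        session) echo "qemu:///session" ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# List all domains (running or not) on the given connection.
list_domains() {
    virsh --connect "$(libvirt_uri "$1")" list --all
}

# Only query libvirt when virsh is actually installed.
if command -v virsh >/dev/null 2>&1; then
    for mode in session system; do
        echo "== $mode =="
        list_domains "$mode" || true
    done
fi
```

Domains that only appear under the session connection belong to the user's own libvirt instance, which is the setup described above as unable to survive a reboot.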

Changed in tripleo:
milestone: pike-3 → pike-rc1
Changed in tripleo:
milestone: pike-rc1 → pike-rc2
Changed in tripleo:
milestone: pike-rc2 → queens-1
Changed in tripleo:
milestone: queens-1 → queens-2
Jason E. Rist (jason-rist) wrote :

Is there still any effort going into this? Would be nice to reduce redeployment time.

Changed in tripleo:
milestone: queens-2 → queens-3
tags: added: ux
Changed in tripleo:
milestone: queens-3 → queens-rc1
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-quickstart (master)

Fix proposed to branch: master
Review: https://review.openstack.org/540350

Changed in tripleo:
assignee: nobody → Steven Hardy (shardy)
status: Triaged → In Progress
Steven Hardy (shardy) wrote :

https://review.openstack.org/540350 should help with this. Basically, I think we need to deploy without the unprivileged virt magic and just use privileged virt with the qcow2 files stored in a persistent location, e.g. /home/stack. The VMs should then survive a host reboot (although they will probably need to be started, e.g. via virt-manager).
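
As a command-line alternative to virt-manager, the sketch below prints the virsh commands that would start each stopped domain on the system connection and mark it for autostart. It is a dry run: nothing is started unless the printed lines are executed. The helper name `start_commands` is hypothetical, and the domain names depend on the deployment.

```shell
#!/bin/sh
# Hypothetical helper: read domain names on stdin (one per line) and
# print the virsh commands to start each one and enable autostart.
start_commands() {
    while IFS= read -r name; do
        [ -n "$name" ] || continue   # skip blank lines in virsh output
        printf 'virsh --connect qemu:///system start %s\n' "$name"
        printf 'virsh --connect qemu:///system autostart %s\n' "$name"
    done
}

# Feed it the inactive domains when virsh is available.
if command -v virsh >/dev/null 2>&1; then
    virsh --connect qemu:///system list --inactive --name 2>/dev/null \
        | start_commands || true
fi
```

Piping the output to `sh` would run the commands; with autostart set, libvirt should bring the domains up itself on later boots.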

Steven Hardy (shardy) wrote :

Note my comment is agreeing with John, but now we have a config example that seems to work, so we can mark this as resolved. Ideally we'd switch some CI jobs (or even the quickstart default) to use this as IMHO it's a more useful setup for many developers doing testing on local hardware.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-quickstart (master)

Reviewed: https://review.openstack.org/540350
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart/commit/?id=d6480e2022c7604914203fdc257bce0cf88a6495
Submitter: Zuul
Branch: master

commit d6480e2022c7604914203fdc257bce0cf88a6495
Author: Steven Hardy <email address hidden>
Date: Fri Feb 2 11:44:46 2018 +0000

    Add default libvirt options to dev_privileged environment

    This should enable deploying with this enabled via -E, previously
    I'd only tested via -c which is inconsistent with the intended
    use of the environments/* files I think.

    Marking this as closing bug #1692976 as this provides a way to make
    the VMs survive host reboots.

    Change-Id: I1143f93d1d36654ae802f53781823364a761f588
    Closes-Bug: #1692976
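
A rough sketch of what a deployment using the environment via -E might look like, based on the command from the steps to reproduce. The environment file path is an assumption (the commit only names a "dev_privileged" environment under environments/*), so check the tripleo-quickstart tree for the real file name. The helper `quickstart_cmd` is hypothetical and only prints the command.

```shell
#!/bin/sh
# ASSUMPTION: this path is a guess; the commit message names only a
# "dev_privileged" environment somewhere under environments/.
ENV_FILE="${ENV_FILE:-tripleo-quickstart/config/environments/dev_privileged_libvirt.yml}"
# Virthost taken from the reproduction steps in this report.
VIRTHOST="${VIRTHOST:-tripleodev2.localdomain}"

# Hypothetical helper: print (not run) the quickstart invocation from
# the bug description with an extra -E <environment file> argument.
quickstart_cmd() {
    printf './tripleo-quickstart/quickstart.sh --teardown all -R master-tripleo-ci -c tripleo-quickstart-extras/config/general_config/containers_minimal.yml -E %s %s\n' "$1" "$2"
}

quickstart_cmd "$ENV_FILE" "$VIRTHOST"
```

Printing rather than executing keeps the example safe to run anywhere; copy the output once the environment file path has been confirmed.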

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-quickstart 2.1.1

This issue was fixed in the openstack/tripleo-quickstart 2.1.1 release.
