Guests lost on host reboot

Bug #742115 reported by justinsb
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

When the host is rebooted, all the guests disappear (at least when using libvirt + KVM).

Thierry Carrez (ttx)
Changed in nova:
importance: Undecided → Medium
status: New → In Progress
Thierry Carrez (ttx)
Changed in nova:
status: In Progress → Fix Committed
Soren Hansen (soren) wrote :

Uh.. We talked about this earlier. To the best of my knowledge we agreed that this was exactly what was expected by consumers of the EC2 API and that we'd extend the metadata model to have a "persistent" flag of some sort for instances. What changed?

justinsb (justin-fathomdb) wrote : Re: [Bug 742115] Re: Guests lost on host reboot

I proposed the more complete patch, but it was too late in the cycle
to get it accepted.

So I did a minimal bugfix instead: it updates the state of the
machines in the DB. If the host crashes, it won't restart them, but
it won't delete them either, so they're easy to bring back up. There
is a flag that lets you specify that you'd like guests to
auto-restart.
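
The behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not nova's actual code: `Instance`, `reconcile_on_host_boot`, and the `auto_restart` parameter are hypothetical names standing in for the DB record, the startup check, and the flag mentioned in the comment.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Instance:
    """Hypothetical stand-in for an instance row in the DB."""
    name: str
    state: str  # "running" or "shutoff"


def reconcile_on_host_boot(
    db_instances: Iterable[Instance],
    running_on_hypervisor: Set[str],
    auto_restart: bool = False,
) -> List[str]:
    """Sync DB state with the hypervisor after a host reboot.

    No instance record is ever deleted. Guests that are no longer
    running are marked "shutoff" unless auto_restart is set, in which
    case their names are returned so the caller can boot them again.
    """
    to_restart = []
    for inst in db_instances:
        if inst.name in running_on_hypervisor:
            inst.state = "running"
        elif auto_restart:
            # Leave the recorded state alone; the caller restarts the guest.
            to_restart.append(inst.name)
        else:
            # Keep the record, but reflect that the guest is down.
            inst.state = "shutoff"
    return to_restart
```

With the hypothetical flag off, a guest that was lost in the reboot stays in the DB as "shutoff" and can be started again by hand; with it on, the function only reports which guests the caller should restart.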

Slow progress, but progress nonetheless.

Soren Hansen (soren) wrote :

I realise it's probably somewhat of an academic problem at this point, but if the instance is kept and charged for, an unknowing consumer of the EC2 API may be surprised to find that they're being charged for an instance that is shut off. Alternatively, if the provider chooses not to charge for instances in this state, resources are spent on keeping them around, but cannot be charged for.

I don't think I care enough to argue about this. I was just rather surprised to see a, to me, pretty significant (albeit intentional and, by some, desired) change in behaviour so late in the cycle.

justinsb (justin-fathomdb) wrote :

I believe that it was because it was so late in the cycle that we had
to do the minimal fix.

The situation with libvirt was that machines would be deleted when the
host restarted (e.g. power loss, kernel update, etc.). EC2 does that,
Rackspace doesn't, but regardless we knew that it would hit some
people really hard.

I agree that it would have been nice to have the flexibility to
address the issue more completely.

Thierry Carrez (ttx)
Changed in nova:
milestone: none → 2011.2
status: Fix Committed → Fix Released