Unable to launch new instances in a cluster after a node controller crashes

Bug #974604 reported by Arwin Tugade
Affects: Eucalyptus
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Version: 2.0.3
OS: CentOS 5.7
Mode: MANAGED-NOVLAN
Setup: Multicluster

On this version of Eucalyptus, I have been able to reproduce this behavior consistently in a test cluster. If a node controller becomes unavailable, any new instances launched in that cluster are assigned 0.0.0.0 for both their public and private IP addresses. I simulated the failure by pulling the network cable on a node controller and waiting for the frontend to mark the affected instances as terminated and clear them from the list of running instances.

This is a serious problem because the only way to recover is to cleanstop/cleanstart the cluster controller (CC). That means that if other node controllers in the cluster have running instances, their IP aliases and iptables rules are cleared as well. I experimented with CCclient_full and was able to restore the network state of the instances and nodes that were previously running, but you should not have to do this every time you lose a node.
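For reference, here is a minimal sketch of the symptom check and the disruptive workaround described above. The init-script subcommands are assumed from the stock Eucalyptus 2.0.x packaging on the frontend, not taken from this report, so treat them as illustrative:

    # Symptom check on the frontend: new instances launched in the
    # affected cluster report 0.0.0.0 for both public and private IPs.
    euca-describe-instances

    # Workaround (disruptive): flush and rebuild the CC's network state.
    # WARNING: this also clears the IP aliases and iptables rules for
    # instances still running on the cluster's healthy node controllers.
    /etc/init.d/eucalyptus-cc cleanstop
    /etc/init.d/eucalyptus-cc cleanstart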

Arwin

Andy Grimm (agrimm) wrote:

This issue is now being tracked upstream at http://eucalyptus.atlassian.net/browse/EUCA-2791

Please watch that issue for further updates.
