StarlingX R2 duplex: VM not getting rebuilt on controller-1 when controller-0 (active) is rebooted

Bug #1851332 reported by Akshay on 2019-11-05
This bug affects 1 person
Affects: starlingx / Assigned to: yong hu

Bug Description

Brief Description
Setup: I have deployed Bare Metal StarlingX R2 in duplex mode. While testing HA, I spawned 2 VMs together from Horizon on IPv6 flat and IPv6 VLAN networks. One was spawned on controller-0 and the other on controller-1. Both VMs get an IP assigned on each network and are able to ping each other.

Test Case: I then rebooted controller-0 (the active node), and the VM on controller-0 tried to rebuild itself.

Issue: The VM stays in the rebuilding state for the whole time controller-0 takes to reboot, and once controller-0 comes back up and is available, the VM gets rebuilt on controller-0 only.

I tried this case many times with the same result.
Please guide me on how to solve this issue.



Steps to Reproduce
1. Deploy Bare Metal StarlingX R2 in duplex mode.
2. Spawn 2 VMs together on IPv6 flat and IPv6 VLAN networks from Horizon.
3. Reboot controller-0 (the active node).
4. Check the rebuild process of the VM that was running on controller-0 (see the example commands below).
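
For step 4, a few example commands to track where the VM is scheduled and its rebuild status (the VM name is hypothetical; run with admin OpenStack credentials sourced, and note that the exact output columns can vary by OpenStack release):

  # List all VMs with their status and the compute host they run on
  openstack server list --long --all-projects

  # Show the status, task state and host of the affected VM (name is an example)
  openstack server show vm-on-controller-0 -c status -c OS-EXT-STS:task_state -c OS-EXT-SRV-ATTR:host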

Expected Behavior
VM should be rebuilt on controller-1 immediately.

Actual Behavior
After controller-0 reboots, the VM gets rebuilt on controller-0 only (after ~20 minutes).


System Configuration
Two node system

Last Pass

Akshay (yadavakshay58) wrote :

It is showing multiple behaviors:
1. Multiple times, it behaved the way explained above.
2. Sometimes the VM gets rebuilt on controller-1, but it does not get an IP assigned to the VLAN network inside the VM.
3. Sometimes the VM gets rebuilt on controller-1, but it does not get an IP assigned on any of the networks (see the example commands below).
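
To narrow down behaviors 2 and 3, the Neutron ports attached to the VM can be compared with what is visible inside the guest (the VM name here is hypothetical):

  # Ports, and their fixed IPs, that Neutron has attached to the VM
  openstack port list --server vm-rebuilt-on-controller-1

  # Inside the guest, check which interfaces actually received an address
  ip -6 addr show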

Please guide.

Ghada Khalil (gkhalil) wrote :

Assigning to the distro-openstack PL for triage/next steps

tags: added: stx.distro.openstack
Changed in starlingx:
assignee: nobody → yong hu (yhu6)
yong hu (yhu6) on 2019-12-02
tags: added: stx.2.0
yong hu (yhu6) wrote :

@Akshay, please share the VM flavor info, and the next time you hit the issue, please capture the logs with the "collect" cmd on both controllers.
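
For reference, the requested information could be gathered along these lines (the flavor name is hypothetical; the collect tool is run as sysadmin on each controller and its options vary by StarlingX release):

  # Show the flavor used by the VMs (flavor name is an example)
  openstack flavor show my-vm-flavor

  # On each controller, as sysadmin, gather the logs
  collect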

I suspect it's related to the "anti-affinity" feature, which prevents VMs from being scheduled on the same node.
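
That suspicion can be checked by inspecting the server groups and their policies, for example (assuming admin credentials; output columns may differ slightly between releases):

  # List server groups with their policies and member VMs
  openstack server group list --long

  # If both VMs belong to a group with the anti-affinity policy, the scheduler
  # will refuse to place them on the same host during a rebuild
  openstack server group show <group-uuid>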

Changed in starlingx:
importance: Undecided → Medium