StarlingX R2 duplex: VM not getting rebuilt on controller-1 when controller-0 (active) is rebooted

Bug #1851332 reported by Akshay
Affects: StarlingX
Status: Invalid
Importance: Medium
Assigned to: zhipeng liu
Milestone: (none)

Bug Description

Brief Description
-----------------
Setup: I have deployed Bare Metal StarlingX R2 in duplex mode. While testing HA, I ran a case in which I spawned 2 VMs together from Horizon on IPv6 flat and IPv6 vlan networks. One gets spawned on controller-0 and the other on controller-1. Both VMs get an IP assigned on each network, and they are able to ping each other.

Test Case: I then rebooted controller-0 (the active node), and the VM on controller-0 tried to rebuild itself.

Issue: The VM stays in the rebuilding state for the entire time controller-0 takes to reboot, and once controller-0 comes back up and is available, the VM gets rebuilt on controller-0 itself.

I have tried this case many times with the same result.
Please guide me on how to solve this issue.

Severity
--------

Critical

Steps to Reproduce
------------------
1. Deploy Bare Metal StarlingX R2 in duplex mode.
2. Spawn 2 VMs together on the IPv6 flat and IPv6 vlan networks from Horizon (example CLI commands below).
3. Reboot controller-0 (the active node).
4. Check the rebuild process of the VM from controller-0.
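(For reference, step 2 can also be driven from the openstack CLI instead of Horizon. This is only a sketch, not taken from the report: the image, flavor, network, and VM names below (cirros, m1.small, ipv6-flat, ipv6-vlan, vm-1, vm-2) are placeholders for whatever the deployment actually uses.)

  # Spawn the two VMs, each attached to both tenant networks
  openstack server create --image cirros --flavor m1.small \
    --network ipv6-flat --network ipv6-vlan vm-1
  openstack server create --image cirros --flavor m1.small \
    --network ipv6-flat --network ipv6-vlan vm-2

  # Confirm the scheduler placed them on different controllers
  # (the host attribute is only visible with admin credentials)
  openstack server show vm-1 -f value -c OS-EXT-SRV-ATTR:host
  openstack server show vm-2 -f value -c OS-EXT-SRV-ATTR:host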

Expected Behavior
------------------
VM should be rebuilt on controller-1 immediately.

Actual Behavior
----------------
After controller-0 reboots, the VM gets rebuilt on controller-0 only (after ~20 minutes).

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Two node system

Last Pass
---------
NO

Akshay (yadavakshay58) wrote:

It is showing multiple behaviors:
1. Multiple times, it behaved in the way explained above.
2. Sometimes the VM gets rebuilt on controller-1, but no IP is assigned to the vlan network inside the VM.
3. Sometimes the VM gets rebuilt on controller-1 but does not get an IP assigned on any of the networks (one quick check from the guest is shown below).

Please guide.
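
(For cases 2 and 3, one quick way to confirm which interfaces actually received an IPv6 address is to check from the VM console with standard iproute2 commands; the interface name below is a placeholder.)

  # Inside the guest: list IPv6 addresses on all interfaces
  ip -6 addr show
  # or only on the interface attached to the vlan network
  ip -6 addr show dev eth1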

Ghada Khalil (gkhalil) wrote:

Assigning to the distro-openstack PL for triage/next steps

tags: added: stx.distro.openstack
Changed in starlingx:
assignee: nobody → yong hu (yhu6)
yong hu (yhu6)
tags: added: stx.2.0
yong hu (yhu6) wrote:

@Akshay, please share the VM flavor info, and the next time you see the issue, please capture the logs with the "collect" cmd on both controllers.

I suspect it's related to the "anti-affinity" feature, which prevents VMs from being scheduled on the same node.
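
(For reference: Nova expresses anti-affinity through server groups, so one way to check this suspicion is to see whether the two VMs share a group with that policy. A minimal sketch using the standard openstack CLI; <group-id> and <flavor-name> are placeholders.)

  # List server groups and check whether one carries an anti-affinity policy
  openstack server group list
  openstack server group show <group-id>

  # Flavor details as requested above
  openstack flavor show <flavor-name>

  # When the issue reproduces, run the StarlingX collect tool on each controller
  collect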

Changed in starlingx:
importance: Undecided → Medium
Ghada Khalil (gkhalil)
Changed in starlingx:
status: New → Triaged
zhipeng liu (zhipengs)
Changed in starlingx:
assignee: yong hu (yhu6) → zhipeng liu (zhipengs)
zhipeng liu (zhipengs) wrote:

Hi Akshay,

Any update from your latest test with our latest build?
I have also seen the scenario you mentioned:
sometimes, when you reboot node 1, the VM on node 1 does not evacuate to node 2 but rebuilds on node 1.
STX has a precondition for deciding whether an evacuation should be triggered:
if node 1 does not stay offline (for example, because it restarts quickly), the evacuation is replaced by a rebuild on node 1.
As for the IP issue, we have not seen that kind of issue so far.
If you still have the issue, please also provide complete logs.

Thanks!
Zhipeng
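
(For anyone trying to observe that offline precondition: the availability state of the rebooting controller can be watched from the surviving controller with the standard StarlingX sysinv and fault-management CLIs. A sketch, not from the report.)

  # Watch controller-0's availability state during the reboot; evacuation is
  # only expected while the host stays in the offline/failed state
  watch -n 5 system host-list

  # Active alarms show the host failure events behind the decision
  fm alarm-list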

zhipeng liu (zhipengs)
Changed in starlingx:
status: Triaged → Incomplete
Akshay (akshay346) wrote:

Hi Zhipeng,

Thanks for the information. Also, we have not yet tried it with the latest build.

zhipeng liu (zhipengs) wrote:

I propose to close this, since there has been no update for more than 1 month.

Thanks!
Zhipeng

zhipeng liu (zhipengs)
Changed in starlingx:
status: Incomplete → Invalid