stx-openstack: Live-migration traffic going through wrong networks
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Fix Released | High | Thales Elero Cervi | | |
Bug Description
Brief Description
-----------------
Since day 0, stx-openstack has not correctly configured the network IP used for live migrations.
It currently relies on default gateway resolution, which is problematic because the result differs between AIO nodes (resolves to the oam-net IP) and dedicated worker nodes (resolves to the mgmt-net IP).
The platform firewall blocks OAM traffic on ports that are not explicitly allowed.
This traffic should instead go through cluster-host-net.
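As a diagnostic aid, the sketch below classifies an address by the platform network it belongs to, which makes it easy to see whether a migration address landed on oam-net, mgmt-net, or cluster-host-net. The subnets and the `network_of` helper are hypothetical placeholders, not values from this report; substitute the subnets of the system under test.

```python
import ipaddress

# Hypothetical example subnets; replace with the real platform networks.
NETWORKS = {
    "oam": ipaddress.ip_network("10.10.10.0/24"),
    "mgmt": ipaddress.ip_network("192.168.204.0/24"),
    "cluster-host": ipaddress.ip_network("192.168.206.0/24"),
}


def network_of(address, networks=NETWORKS):
    """Return the name of the network containing `address`, or None."""
    ip = ipaddress.ip_address(address)
    for name, subnet in networks.items():
        if ip in subnet:
            return name
    return None


if __name__ == "__main__":
    # An address on the hypothetical oam subnet is flagged as such.
    print(network_of("10.10.10.3"))
    # The desired outcome: the migration address is on cluster-host.
    print(network_of("192.168.206.2"))
```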
Severity
--------
Major: Live-migration not working for AIO-DX systems
Steps to Reproduce
------------------
* Install stx and apply stx-openstack
* Launch a VM
* Try to live-migrate the VM
Expected Behavior
------------------
VM live-migrates successfully (through the cluster-host-net)
Actual Behavior
----------------
Live-migration fails
Reproducibility
---------------
100% Reproducible
System Configuration
-------
AIO-DX
Branch/Pull Time/Commit
-------
master and f/antelope branches
Last Pass
---------
stx.8.0
Timestamp/Logs
--------------
2023-08-
Test Activity
-------------
Regression Testing
Changed in starlingx:
assignee: nobody → Thales Elero Cervi (tcervi)
tags: added: stx.9.0 stx.distro.openstack
Changed in starlingx:
status: New → In Progress
Changed in starlingx:
importance: Undecided → High
Reviewed: https://review.opendev.org/c/starlingx/openstack-armada-app/+/895730
Committed: https://opendev.org/starlingx/openstack-armada-app/commit/310f677d295abff792168ee860beba8c52b1c2ab
Submitter: "Zuul (22348)"
Branch: master
commit 310f677d295abff792168ee860beba8c52b1c2ab
Author: Thales Elero Cervi <email address hidden>
Date: Mon Sep 18 16:00:28 2023 -0300
Move live-migration traffic to cluster-host-net
This change updates the application plugins in order to ensure that all
libvirt/live-migration related traffic is happening through the
cluster-host-network. Currently most of the libvirt/live-migration
addresses are being solved through INADDR_ANY (0.0.0.0), and this route
resolution will vary between AIO, routes to oam-network, and Worker,
routes to mgmt-network. Both resolutions are not correct since the
correct network for such traffic should be the cluster-host-network.
Actually, current platform firewall will block any traffic through not
allowed oam-network ports.
The goal will be achieved by setting to the node's cluster-host IP:
* libvirt listen_addr
* nova.conf "live_migration_inbound_addr"
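The two settings above can be sketched as plain configuration fragments. This is only an illustration, assuming the option lives in the `[libvirt]` section of nova.conf and that libvirtd.conf takes a quoted `listen_addr` value; the `render_live_migration_settings` helper and the example address are hypothetical, and the actual change applies these values through the application plugins rather than by writing files like this.

```python
import configparser
import io
import ipaddress


def render_live_migration_settings(cluster_host_ip):
    """Return config fragments (as text) keyed by the file they belong to."""
    # Reject the wildcard address, since binding to it is the problem
    # this change is meant to avoid.
    if ipaddress.ip_address(cluster_host_ip).is_unspecified:
        raise ValueError("cluster-host IP must not be the wildcard address")

    # nova.conf fragment: [libvirt] live_migration_inbound_addr
    nova = configparser.ConfigParser()
    nova["libvirt"] = {"live_migration_inbound_addr": cluster_host_ip}
    buf = io.StringIO()
    nova.write(buf)

    # libvirtd.conf fragment: flat key = "value" line, no section header.
    libvirtd = f'listen_addr = "{cluster_host_ip}"\n'
    return {"nova.conf": buf.getvalue(), "libvirtd.conf": libvirtd}


if __name__ == "__main__":
    # 192.168.206.2 is a placeholder cluster-host address.
    for name, text in render_live_migration_settings("192.168.206.2").items():
        print(f"# {name}\n{text}")
```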
It is important to notice that in the current version of the
openstack-helm nova helm chart, there is a problem with
nova-compute-init.sh for this use case of ours, so an openstack-helm
patch was required to fix it.
Code that was previously implemented only for the Nova plugin and is now
required by the Libvirt plugin, was moved to the parent OpenStack class.
[1] https://github.com/openstack/openstack-helm/commit/31be86079d711c698b2560b4bed654e23373a596
TEST PLAN:
PASS - Build stx-openstack application
PASS - Apply the application to an AIO-DX system
PASS - "$ sudo netstat -ltnp | grep <libvirtd pid>" to ensure that
libvirtd is listening on the correct cluster-host-net IP
PASS - Verify that the nova-compute.sh script was populated correctly
PASS - Test a VM live-migration on the controller+worker node
PASS - Verify that live_migration data in LibvirtLiveMigrateData has the
correct cluster-host-net IP address in its "target_connect_addr"
PASS - Apply the application to a Standard system
PASS - "$ sudo netstat -ltnp | grep <libvirtd pid>" to ensure that
libvirtd is listening on the correct cluster-host-net IP
PASS - Verify that the nova-compute.sh script was populated correctly
PASS - Test a VM live-migration on the worker node
PASS - Verify that live_migration data in LibvirtLiveMigrateData has the
correct cluster-host-net IP address in its "target_connect_addr"
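The netstat check in the test plan can be automated along these lines. This is a rough sketch that assumes the usual column layout of `netstat -ltnp` output (local address in the fourth column, `PID/Program name` in the seventh); the helper names, the PID, and the addresses in the sample are hypothetical.

```python
import ipaddress


def listen_addresses_for_pid(netstat_output, pid):
    """Collect local listen addresses (without port) owned by `pid`."""
    addresses = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a TCP socket row.
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        if not fields[6].startswith(f"{pid}/"):
            continue
        # Split on the last colon so IPv6 forms such as ":::16509" work.
        host, _, _port = fields[3].rpartition(":")
        addresses.append(host)
    return addresses


def listens_only_on(addresses, expected_ip):
    """True if there is at least one listener and all match expected_ip."""
    expected = ipaddress.ip_address(expected_ip)
    return bool(addresses) and all(
        ipaddress.ip_address(addr) == expected for addr in addresses
    )


# Hypothetical sample output; PID 12345 stands in for libvirtd.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 192.168.206.2:16509     0.0.0.0:*               LISTEN      12345/libvirtd
tcp6       0      0 :::22                   :::*                    LISTEN      800/sshd
"""

if __name__ == "__main__":
    addrs = listen_addresses_for_pid(SAMPLE, 12345)
    print(addrs, listens_only_on(addrs, "192.168.206.2"))
```

A listener bound to 0.0.0.0 fails the check, which matches the symptom described in this commit message.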
Closes-Bug: 2037330
Signed-off-by: Thales Elero Cervi <email address hidden>
Change-Id: I37db601e4b1b0e397a1b8dbdad1a293ff25c2e55