IPv6 mgmt network not working, octavia can't talk to Amphora instance
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Octavia Charm | Incomplete | Undecided | Unassigned |
Bug Description
Hi,
I'm trying to make Octavia Charm work but I encounter a few issues.
Apparently the deployment goes well (everything is green; I followed the Octavia charm guide to set it up automatically, i.e. auto-creating the Amphora image with diskimage-retrofit and configuring the Octavia resources automatically through the charm).
Again, no errors and apparently, everything goes well.
Once connected to Horizon, I spawn a few VMs and then, I try to create a new LB for them.
First, until a LB is created, I constantly get a very annoying notification popup saying there is no LB available for listing.
Then, I create a LB, filling in everything to get a very simple round-robin LB on 2 VMs in their private subnet.
Once completed, the LB is created but it stays stuck in "Offline/Pending Create" status until it ends in an "Error" state, in which I can destroy it (it takes quite a few minutes before reaching that state).
While in "Pending Create" state, I checked the Octavia units logs and saw something like that :
2021-01-14 16:21:00.490 9938 WARNING octavia.
I then checked the instance list and discovered the Amphora instance had been created correctly (I checked the console and it is fully booted, apparently with no error) and it has the fc00:fa21:
I then SSHed to the Octavia units and tried to ping6 this IP; I got no answer from any of the units, so there seems to be an issue with the IPv6 management overlay network.
I tried to curl https://[fc00:fa21:
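For reference, the connectivity checks described above can be run from each Octavia unit roughly as follows. This is only a sketch: `AMPHORA_IP` is a placeholder for the amphora's full lb-mgmt IPv6 address (the truncated fc00:... address above), and port 9443 is the amphora agent's usual listen port.

```shell
# Placeholder: export the amphora's lb-mgmt-net IPv6 address before running.
AMPHORA_IP="${AMPHORA_IP:-}"
if [ -n "$AMPHORA_IP" ]; then
    # ICMPv6 reachability over the management overlay network
    ping6 -c 3 "$AMPHORA_IP"
    # The amphora agent normally listens on TCP 9443; -k skips cert checks
    curl -k "https://[$AMPHORA_IP]:9443/"
else
    echo "Set AMPHORA_IP to the amphora's lb-mgmt IPv6 address first"
fi
```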
Here is a detailed description of my setup :
* Charmed OpenStack (20.10 stable) deployed through MAAS (2.9.0) with Juju (2.8.7), Focal series for every unit (OpenStack Ussuri, i.e. "distro" openstack-origin)
* 7 machines (3 control planes, 4 compute nodes)
* HA mode for every control plane application, in conjunction with the HACluster charm and assigned VIPs
* 3 spaces :
- "pxe" : non routed untagged network for provisionning only
- "ost-int" : routed /24 VLAN subnet for openstack internal network
- "ost-pub" : routed /24 VLAN subnet for both openstack admin/public network
"ost-int" and "ost-pub" are routed, they can talk to each other, there is no firewall in between.
You can find my exported bundle attached.
I tried to deploy this bundle dozens of times, with multiple spaces or only one space, but that didn't change anything; I was never able to get Octavia to work properly.
The Amphora is created and the LB is created, but the Octavia API can't talk to the Amphora instance.
Thank you for the bug report. To diagnose communication issues between the Octavia units and the cloud, we must start by looking at the state of tunnels and port bindings.
Take a look at the Neutron ports created by the Octavia charm, for example `octavia-health-manager-octavia-0-listen-port`:
- Does the `binding_host_id` match the FQDN of the Octavia container?
- Does the `binding_vif_type` field say 'ovs' or does it say 'binding_failed'?
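The two checks above can be run with the `openstack` CLI. A minimal sketch, assuming the unit is `octavia/0` (substitute your own unit index in the port name):

```shell
# Health-manager listen port for unit octavia/0 (adjust the index as needed)
PORT="octavia-health-manager-octavia-0-listen-port"
if command -v openstack >/dev/null 2>&1; then
    # Show only the binding fields of interest
    openstack port show "$PORT" -c binding_host_id -c binding_vif_type -c status
    # Compare binding_host_id against the Octavia container's FQDN
    hostname -f
else
    echo "openstack CLI not available here; run this where your cloud RC is sourced"
fi
```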
Since you are using OVN you will also have rich logging in /var/log/ovn/ovn-controller.log, which would show evidence of any shortname vs. FQDN issues.
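One way to scan that log for the usual symptoms, run on the Octavia unit itself; the grep patterns here are illustrative rather than exhaustive:

```shell
# ovn-controller log path as given above
LOG=/var/log/ovn/ovn-controller.log
if [ -r "$LOG" ]; then
    # Port claim/binding activity is where hostname mismatches surface
    grep -Ei 'claim|binding' "$LOG" | tail -n 50
    # ovn-controller registers its chassis under the local hostname;
    # a shortname here while Neutron expects the FQDN is the classic mismatch
    hostname -f
else
    echo "$LOG not readable here; run this on the octavia unit"
fi
```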
There is an ongoing issue with deploying OVS/OVN in LXD containers on MAAS, as detailed in bug 1896630. The root of the issue is somewhere below the charms, but the linked bug contains steps to work around it.