"nova.exception.PortBindingFailed: Binding failed" for OpenStack Zed in Juju deployment
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
charm-ovn-central | Invalid | Undecided | Unassigned |
neutron (Ubuntu) | Invalid | Undecided | Unassigned |
ovn (Ubuntu) | Invalid | Undecided | Unassigned |
Bug Description
I made an OpenStack deployment with Juju, as documented in the Charm deployment guide (https:/
The error message hints at checking the "neutron logs", but they contain no useful information.
So, I checked the configuration first:
neutron.yaml for deployment of "neutron-api" and "ovn-chassis":
ovn-chassis:
debug: true
bridge-
br-
br-
br-
...
ovn-bridge-
neutron-api:
verbose: true
enable-
neutron-
enable-
vlan-ranges: physnet2
flat-
This looks okay. The network interfaces are mapped into the bridge "br-simulamet", and it actually exists on all nodes, e.g.:
root@P52S11:
"physnet2:
The network/subnet configuration in OpenStack should also be okay, e.g.:
network create smil-network4 --external --provider-
subnet create smil-network4-ipv4 --network smil-network4 --ip-version 4 --description "VLAN0204-
So, the network should correctly map to "physnet2", with a VLAN tag (here: 204).
(For debugging, I also tried to use the network interface as "flat network" without VLANs. This does not change anything.)
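To make the physnet-to-bridge check repeatable across nodes, the mapping string can be parsed with a small helper. This is only a sketch: the function name `bridge_for_physnet` is made up for illustration, and the example value "physnet2:br-simulamet" is an assumption pieced together from the names mentioned above, not copied output.

```shell
# Look up the bridge mapped to a physical network in a bridge-mappings
# string of the form "physnetA:brA,physnetB:brB".
# On a chassis, the string would typically come from something like:
#   ovs-vsctl get Open_vSwitch . external_ids:ovn-bridge-mappings
bridge_for_physnet() {
    mappings=$1
    physnet=$2
    # Drop surrounding quotes, split on commas, match the part before the colon.
    printf '%s\n' "$mappings" | tr -d '"' | tr ',' '\n' |
        while IFS=: read -r net bridge; do
            if [ "$net" = "$physnet" ]; then
                printf '%s\n' "$bridge"
            fi
        done
}

# Example with an assumed mapping value:
bridge_for_physnet '"physnet2:br-simulamet"' physnet2
```

An empty result means the physnet has no bridge mapping in the given string.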
The deployment (from "juju status") also looks okay for Neutron and OVN:
...
neutron-api 21.0.0 active 1 neutron-api zed/stable 546 no Unit is ready
neutron-
neutron-
nova-cloud-
...
ovn-central 22.09.0 active 3 ovn-central 22.09/stable 75 no Unit is ready (leader: ovnsb_db)
ovn-chassis 22.09.1 active 8 ovn-chassis 22.09/stable 109 no Unit is ready
...
/var/log/
ovn-appctl vlog/set dbg
ovn-appctl vlog/disable-
Increasing the Open vSwitch log level also did not reveal more insight, i.e.:
ovn-appctl vlog/set dbg
ovn-appctl vlog/disable-
So, maybe the issue is related to some component around OVN? One strange thing I noticed: There are two processes "ovsdb-server" running, each with a "--log-file" parameter, referring to /var/log/
root@P52S11:
129278 ? Ssl 4:58 ovn-northd -vconsole:emer -vsyslog:err -vfile:info --ovnnb-
130048 ? Ssl 34:42 ovsdb-server -vconsole:off -vfile:info --log-file=
130251 ? Ssl 47:13 ovsdb-server -vconsole:off -vfile:info --log-file=
The logs are in containers, checking them:
/var/snap/
...
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
/var/snap/
...
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
The containers belong to the deployment of "ovn-central", so I assume something is wrong here.
The issue appears on all 8 nodes I have set up. So, it is reproducible. I can provide log files, etc. on request.
Could this issue be a bug in an OpenStack package (maybe ovn-central?), a problem with the Juju charms used to deploy OpenStack Zed, or some issue with my setup?
Some further debugging: I entered the instance container for ovn-central/0, i.e.:
juju ssh ovn-central/0
In /etc/ovn/ovn-northd-db-params.conf, I found the TCP and SSL parameters for the OVN NB and SB databases for running checks (based on https://numans.blog/2018/01/05/debugging-ovn-external-connectivity-part-1/), in my case:
sudo ovn-nbctl -c /etc/ovn/cert_host -C /etc/ovn/ovn-central.crt -p /etc/ovn/key_host --db=ssl:172.31.255.114:6641,ssl:172.31.255.115:6641,ssl:172.31.255.116:6641 show
sudo ovn-sbctl -c /etc/ovn/cert_host -C /etc/ovn/ovn-central.crt -p /etc/ovn/key_host --db=ssl:172.31.255.114:16642,ssl:172.31.255.115:16642,ssl:172.31.255.116:16642 show
Connecting to the DBs works.
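Since the long --db argument is easy to mistype, it can be assembled from the port and the list of database server addresses. The following is a minimal sketch; the helper name `build_db_arg` is hypothetical, and it assumes all remotes use SSL on the same port, as in the commands above.

```shell
# Build an OVN "--db=" argument from a port and a list of server IPs.
build_db_arg() {
    port=$1
    shift
    remotes=""
    for ip in "$@"; do
        # Append "ssl:IP:PORT", comma-separated after the first entry.
        remotes="${remotes:+$remotes,}ssl:${ip}:${port}"
    done
    printf '%s\n' "--db=$remotes"
}

# NB database remotes, using the addresses and port from this deployment:
build_db_arg 6641 172.31.255.114 172.31.255.115 172.31.255.116
```

The result can then be passed to ovn-nbctl or ovn-sbctl together with the -c, -C and -p options.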
SB lists all my 8 nodes, i.e.:
...
Chassis P52S11.maas
hostname: P52S11.maas
Encap geneve
ip: "172.31.255.100"
options: {csum="true"}
...
This seems to look okay.
NB seems to even list my test VM's port, i.e.:
switch b2e5de93-e19a-458d-af63-e80d44629614 (neutron-97c5c0a1-5c29-4fc4-852b-ce818c972a6d) (aka smil-network4)
port ec87abb0-b261-49fd-8e95-9cefdf32798e (aka Port-warrnambool.fire.smil)
addresses: ["unknown"]
...
But "addresses" only contains "unknown".
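To find every port in this state without reading the whole NB listing, the "show" output can be filtered. This is a sketch only: the function name `ports_with_unknown_only` and the sample switch and port names are invented for illustration, and the filter merely locates such ports; it does not tell whether "unknown" is a fault for a given port.

```shell
# Print the names of logical switch ports whose "addresses" line in
# `ovn-nbctl show` output is exactly ["unknown"]. Reads from stdin.
ports_with_unknown_only() {
    awk '
        $1 == "port" { port = $2 }
        $1 == "addresses:" && NF == 2 && $2 == "[\"unknown\"]" { print port }
    '
}

# Example with made-up sample output; only port-a is reported.
printf '%s\n' \
    'switch sw0' \
    '    port port-a' \
    '        addresses: ["unknown"]' \
    '    port port-b' \
    '        addresses: ["fa:16:3e:00:00:01 10.0.0.5"]' |
    ports_with_unknown_only
```

On the ovn-central unit, the input would be the output of the ovn-nbctl "show" command used above, piped into the function.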
Destroying the failed instance removes the port, and a new attempt creates a new one. So, I assume that at least some communication with the OVN system is working.
It seems that something goes wrong somewhere after creating this port, but without any information in any of the log files. Is there any hint on where to look for further debugging?