When deploying a Stein multi-region environment, two issues were observed:
1)
2020-09-08 15:03:01.590 15355 INFO nova.compute.resource_tracker [req-1e5b881d-8c0e-4213-a500-8bea79877f92 - - - - -] Compute node record created for compute1.maas:compute1.maas with uuid: 246ba562-1f2e-4296-a69a-53bceda49739
2020-09-08 15:03:02.358 15355 ERROR nova.scheduler.client.report [req-1e5b881d-8c0e-4213-a500-8bea79877f92 - - - - -] [req-af77bcff-3de0-47ef-98ac-ff4447d9aee3] Failed to create resource provider record in placement API for UUID 246ba562-1f2e-4296-a69a-53bceda49739. Got 409: {"errors": [{"status": 409, "title": "Conflict", "detail": "There was a conflict when trying to complete your request.\n\n Conflicting resource provider name: compute1.maas already exists. ", "request_id": "req-af77bcff-3de0-47ef-98ac-ff4447d9aee3"}]}.
Investigating the issue, it was found that the compute node was registering itself in the database of another region. More specifically, against the first endpoint returned by "openstack endpoint list --service placement".
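The catalog behaviour can be sketched with a toy example of "first matching endpoint wins" (this is an illustration only, not the actual keystoneauth code; the URLs and region names are made up):

```python
# Hypothetical catalog with one placement endpoint per region.
CATALOG = [
    {"service": "placement", "region": "RegionOne", "url": "http://10.5.0.10:8778"},
    {"service": "placement", "region": "RegionTwo", "url": "http://10.5.2.10:8778"},
]

def pick_endpoint(service, region_name=None):
    """Return the first catalog URL matching the service (and region, if given)."""
    for ep in CATALOG:
        if ep["service"] != service:
            continue
        if region_name is not None and ep["region"] != region_name:
            continue
        return ep["url"]
    return None

# A RegionTwo compute node with no region filter still gets RegionOne's endpoint:
print(pick_endpoint("placement"))                          # http://10.5.0.10:8778
print(pick_endpoint("placement", region_name="RegionTwo")) # http://10.5.2.10:8778
```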
It turns out that, according to the Rocky release notes:
"The following deprecated options have been removed from the placement group of nova.conf:
os_region_name (use region_name instead)"
Replacing the os_region_name option with region_name allowed the compute node to talk to the correct endpoint and register itself against the correct placement database. Which leads to problem #2.
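For reference, the fix amounts to a fragment like the following in nova.conf on the compute nodes (the region value here is a placeholder, not this deployment's actual region):

```ini
[placement]
# region_name replaced the deprecated os_region_name option in Rocky;
# on Stein only region_name is honoured.
region_name = RegionTwo
```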
2)
While testing migrations, it was noticed that migrations started with "openstack server migrate --live-migration" would result in the logs:
2020-11-05 14:34:22.723 1993 ERROR nova.network.neutronv2.api [req-0170510a-264b-441d-84ab-211ac89c5f5f dedbb29c90e94218a838fd7c6bdc8a44 7118e587a8be4a2c81eff9429e8bd249 - ae745ee07d0445e4a11ef46e1cfff59c ae745ee07d0445e4a11ef46e1cfff59c] [instance: 967c9efa-eb81-4892-af15-e148b3ab838b] Binding failed for port 3d82d547-454d-4ee7-ad5c-9d834e3afb9a and host juju-60a5dc-bionic-stein-federated-01-7.cloud.sts. Error: (404 {"NeutronError": {"type": "PortNotFound", "message": "Port 3d82d547-454d-4ee7-ad5c-9d834e3afb9a could not be found.", "detail": ""}})
while migrations with "openstack server migrate --live <host>" would result in the CLI error:
Migration pre-check error: Binding failed for port d92c626a-25d9-4ef3-981a-15c430cdf9c8, please check neutron logs for more information. (HTTP 400) (Request-ID: req-9445bf06-f034-4b39-ac5b-e29485c9f5d2)
Investigating this, it was found that the conductor was talking to the wrong neutron-api endpoint. Adding region_name to the [neutron] section in nova.conf on the nova-cloud-controller units addressed the problem, but led to another one later in the migration:
2020-11-05 20:19:16.238 15881 ERROR oslo_messaging.rpc.server [req-c878d5a1-4167-4aa9-8a88-c77dfe77940a dedbb29c90e94218a838fd7c6bdc8a44 7118e587a8be4a2c81eff9429e8bd249 - ae745ee07d0445e4a11ef46e1cfff59c ae745ee07d0445e4a11ef46e1cfff59c] Exception during message handling: keystoneauth1.exceptions.connection.ConnectFailure: Unable to establish connection to http://10.5.2.231:9696/v2.0/ports/1eaef00a-e73f-4c04-a60d-0fd5438ea807: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
Investigating that, it was found that region_name also needed to be added to the [neutron] section in nova.conf on the nova-compute units; with that in place, the problem is addressed and migrations succeed.
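The fragment in question looks like the following, and ends up in nova.conf on both the nova-cloud-controller and nova-compute units (the region value is again a placeholder):

```ini
[neutron]
# Without region_name, keystoneauth falls back to the first neutron
# endpoint in the catalog, which may belong to another region.
region_name = RegionTwo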
However, I did not find any specific mention of the region_name option of the [neutron] section in the Nova release notes. The only relevant mention I found of it in the code is in [0], but that has been removed in Train. Moreover, that code is not invoked in the first error (the port binding one). Instead, the code goes through [1], which hasn't changed since Stein, but picks up the region_name parameter added to [neutron].
Therefore, the region_name parameter must be added to the [placement] section on the nova-compute units, and to the [neutron] section on both the nova-compute and nova-cloud-controller units, to address this issue.
[0] https://github.com/openstack/nova/blob/cde42879a497cd2b91f0cf926e0417fda07b3c31/nova/network/neutronv2/api.py#L193
[1] https://github.com/openstack/nova/blob/cde42879a497cd2b91f0cf926e0417fda07b3c31/nova/network/neutronv2/api.py#L214