Fail to filter the list of instances by availability zone

Bug #1782539 reported by Brin Zhang on 2018-07-19
This bug affects 2 people
| Affects | Status | Importance | Assigned to | Milestone |
| OpenStack Compute (nova) | | Medium | huanhongda | |

Bug Description

When the Host Aggregate (HA) that a host belongs to is modified, the host's Availability Zone (AZ) changes as well, but the availability_zone column in the instances database table does not change for the instances on that host.
Steps to reproduce:

1. Query the AZ of the instance through the nova API:
[root@node01 ~]# nova show 82d28856-a4ec-4ddb-96e1-0298c864c024 |grep -E 'availability_zone|hypervisor_hostname'
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:hypervisor_hostname | node01 |

2. Query the instance's current AZ in the database with SQL:
MariaDB [nova_cell1]> SELECT display_name, vm_state, availability_zone FROM instances WHERE uuid='82d28856-a4ec-4ddb-96e1-0298c864c024';
+--------------+----------+-------------------+
| display_name | vm_state | availability_zone |
+--------------+----------+-------------------+
| vm1 | active | nova |
+--------------+----------+-------------------+

3. Create an HA named ha1, set its AZ to az1, and add host node01 to ha1:
[root@node01 ~]# nova aggregate-show ha1
+----+----------------+-------------------+----------+-------------------------+--------------------------------------+
| Id | Name | Availability Zone | Hosts | Metadata | UUID |
+----+----------------+-------------------+----------+-------------------------+--------------------------------------+
| 5 | ha1 | az1 | 'node01' | 'availability_zone=az1' | e0e98cc0-48cc-4b7f-a49e-fc992892f1c9 |
+----+----------------+-------------------+----------+-------------------------+--------------------------------------+

4. Query the AZ of the instance through the nova API again:
[root@node01 ~]# nova show 82d28856-a4ec-4ddb-96e1-0298c864c024 |grep -E 'availability_zone|hypervisor_hostname'
| OS-EXT-AZ:availability_zone | az1 |
| OS-EXT-SRV-ATTR:hypervisor_hostname | node01 |

5. Query the instance's current AZ in the database with SQL again:
MariaDB [nova_cell1]> SELECT display_name, vm_state, availability_zone FROM instances WHERE uuid='82d28856-a4ec-4ddb-96e1-0298c864c024';
+--------------+----------+-------------------+
| display_name | vm_state | availability_zone |
+--------------+----------+-------------------+
| vm1 | active | nova |
+--------------+----------+-------------------+

Summary: Comparing step 1 with step 4, and step 2 with step 5, shows that the AZ the nova API returns for vm1 is inconsistent with the instances database table. The API reports availability_zone=az1, but the instances table still has availability_zone=nova for vm1.

The problem can also be reproduced through Horizon.
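
The mismatch in the steps above can be sketched with a toy model. This is not nova code: the dictionary, table layout, and function names below are hypothetical, and it assumes, as the discussion in this report suggests, that the displayed AZ is derived from the host's aggregate while the list filter matches on the stored instances.availability_zone column.

```python
import sqlite3

# Toy stand-in for aggregate metadata after step 3: host -> AZ (hypothetical).
AGGREGATE_AZ_BY_HOST = {"node01": "az1"}
DEFAULT_AZ = "nova"


def api_reported_az(host):
    """AZ as shown in step 4: derived from the host's aggregate."""
    return AGGREGATE_AZ_BY_HOST.get(host, DEFAULT_AZ)


def build_db():
    """In-memory table holding the stale value seen in step 5."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances "
        "(display_name TEXT, host TEXT, availability_zone TEXT)"
    )
    conn.execute("INSERT INTO instances VALUES ('vm1', 'node01', 'nova')")
    return conn


def filter_by_az(conn, az):
    """Filter on the stored column, ignoring aggregate membership."""
    rows = conn.execute(
        "SELECT display_name FROM instances "
        "WHERE availability_zone = ? ORDER BY display_name",
        (az,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_db()
shown_az = api_reported_az("node01")   # what 'nova show' displays
by_az1 = filter_by_az(conn, "az1")     # filter by the displayed AZ
by_nova = filter_by_az(conn, "nova")   # filter by the stale stored AZ
print(shown_az, by_az1, by_nova)
```

In this model vm1 is displayed as being in az1, yet filtering by az1 misses it, while filtering by nova still finds it.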

sunjiazz (sunjiazz) on 2018-07-19
Changed in nova:
assignee: nobody → sunjiazz (sunjiazz)
Matt Riedemann (mriedem) wrote :

Which release is this against? master (rocky)?

Matt Riedemann (mriedem) wrote :

https://review.openstack.org/#/c/582342/ could have implications for this bug.

Matt Riedemann (mriedem) wrote :

Also, what do you mean by "Fail to filter the list of instances by the available zone" as the bug title? Do you mean, because the API says in step 4 that the instance is in AZ az1 that when filtering instances by az1 you expect to see vm1 there, but it's not because the instances table says the instance is in AZ 'nova'?

Brin Zhang (zhangbailin) wrote :

Hi sir: We found this problem in the Ocata release, and when tested in the Rocky release it still exists.

About the title "Fail to filter the list of instances by the available zone":
I mean, if the host's HA changes (the AZ will also change), then the AZ of all the instances on the host should change with the host, and the availability_zone in the instances table should be updated as well.

--"because the API says in step 4 that the instance is in AZ az1 that when filtering instances by az1 you expect to see vm1 there, but it's not because the instances table says the instance is in AZ 'nova'?"
Yes, it should be consistent here.

Do you think this title needs to be changed? Any suggestions? Thank you.

tags: added: availability-zones
Matt Riedemann (mriedem) wrote :

I've added the blueprint and this bug to the Stein PTG etherpad for discussion:

https://etherpad.openstack.org/p/nova-ptg-stein

Matt Riedemann (mriedem) on 2018-07-24
Changed in nova:
status: New → Triaged
Matt Riedemann (mriedem) wrote :

Discussed at the Stein PTG:

https://etherpad.openstack.org/p/nova-ptg-stein

We agreed to block AZ renames in the API if the AZ has instances in it, as a bug fix with no microversion. People shouldn't have to opt into not breaking themselves. Plus this is an admin API by default.

https://developer.openstack.org/api-ref/compute/#update-aggregate
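
A minimal sketch of the kind of guard agreed on above, under stated assumptions: the exception, the function, and the dict-based aggregate are hypothetical illustrations, not nova's API and not the change proposed in review. It only shows the rule "reject an AZ change while any host in the aggregate has instances".

```python
class AvailabilityZoneInUse(Exception):
    """Hypothetical error raised when an AZ change is rejected."""


def update_aggregate_az(aggregate, new_az, instance_count_by_host):
    """Change the aggregate's AZ only if none of its hosts run instances."""
    if new_az == aggregate["availability_zone"]:
        return aggregate  # not a rename, nothing to check

    busy_hosts = [
        host for host in aggregate["hosts"]
        if instance_count_by_host.get(host, 0) > 0
    ]
    if busy_hosts:
        raise AvailabilityZoneInUse(
            "cannot change AZ of %s: hosts with instances: %s"
            % (aggregate["name"], ", ".join(busy_hosts))
        )

    aggregate["availability_zone"] = new_az
    return aggregate


# node01 still hosts vm1, so the rename is refused and ha1 keeps az1.
ha1 = {"name": "ha1", "availability_zone": "az1", "hosts": ["node01"]}
try:
    update_aggregate_az(ha1, "az2", {"node01": 1})
    rename_blocked = False
except AvailabilityZoneInUse:
    rename_blocked = True

# An aggregate whose hosts are empty can still be renamed.
ha2 = {"name": "ha2", "availability_zone": "az1", "hosts": ["node02"]}
renamed = update_aggregate_az(ha2, "az2", {"node02": 0})
```

Rejecting the request up front keeps the stored and displayed AZ from diverging, rather than trying to rewrite instance rows after the fact.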

Changed in nova:
assignee: sunjiazz (sunjiazz) → nobody
importance: Undecided → Medium

Fix proposed to branch: master
Review: https://review.openstack.org/611833

Changed in nova:
assignee: nobody → huanhongda (hongda)
status: Triaged → In Progress

Denormalized, inconsistent data at its finest... :(

Please see the code comment next to the instances.availability_zone field:

https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/models.py#L295-L298

    # This always refers to the availability_zone kwarg passed in /servers and
    # provided as an API option, not at all related to the host AZ the instance
    # belongs to.
    availability_zone = Column(String(255))

Apparently, it's meant to be this way :(

Jay Pipes (jaypipes) on 2018-12-20
summary: - Fail to filter the list of instances by the available zone
+ Fail to filter the list of instances by availability zone

Change abandoned by Matt Riedemann (<email address hidden>) on branch: master
Review: https://review.openstack.org/611833
Reason: This looks abandoned and https://review.openstack.org/#/c/509206/ is working on fixing the same thing.
