Affected version: stable/juno
Description:
On a single-node deployment of OpenStack (using DevStack), if the default availability zone of Nova is replaced by another one, the API api.nova.server_list returns a list of VMs in which the availability zone info is inconsistent. This causes a toggling effect in the Horizon dashboard when it displays the list of instances (under tab "Project/Instances"): the availability zone shown for an instance is sometimes the default one and sometimes the newly-created one.
This bug can easily be reproduced using the Horizon dashboard as follows:
- Go to tab "Admin/Host Aggregates" and create a new host aggregate with the current host assigned to it. Please note that if no host is assigned to this newly-created host aggregate, the availability zone won't be defined.
- Still in this view, the "Availability zones" table shows that the newly-created availability zone has replaced the default one.
- Go to tab "System information". The nova-compute service is running in the newly-created availability zone, while all the cinder services are running in the default availability zone.
- Go to tab "Project/Image" and select an image to create a new bootable volume.
- Use this newly-created volume to launch a new VM.
- After the VM is launched, the dashboard automatically redirects to the "Instances" view. Here, the toggling effect on the availability zone info can be observed.
Analysis:
The root cause lies in the data returned by the API api.nova.server_list, as described above.
This can be seen by adding some more debug info as follows:
2014-11-06 10:30:05,103 - my_logger - DEBUG - openstack_dashboard.dashboards.project.instances.views - Instance amount: 1, Instances: "[<Server: {'status': u'BUILD', 'OS-EXT-STS:task_state': u'scheduling', 'addresses': {}, 'name': u'from_vol_cirr_nova', 'links': [{u'href': u'http://192.168.56.103:8774/v2/a0c581f7a88441ed84e9878fa9fc8e50/servers/d5c1575d-8ac8-4921-802c-e9b121acd82e', u'rel': u'self'}, {u'href': u'http://192.168.56.103:8774/a0c581f7a88441ed84e9878fa9fc8e50/servers/d5c1575d-8ac8-4921-802c-e9b121acd82e', u'rel': u'bookmark'}], 'created': u'2014-11-06T10:30:03Z', 'key_name': None, 'image': u'', 'OS-DCF:diskConfig': u'AUTO', 'image_name': '-', 'OS-EXT-STS:power_state': 0, 'OS-EXT-SRV-ATTR:host': None, 'OS-EXT-SRV-ATTR:instance_name': u'instance-00000004', 'tenant_id': u'a0c581f7a88441ed84e9878fa9fc8e50', 'user_id': u'2f8c907029eb43e5ab98a55ac28c885e', 'flavor': {u'id': u'1', u'links': [{u'href': u'http://192.168.56.103:8774/a0c581f7a88441ed84e9878fa9fc8e50/flavors/1', u'rel': u'bookmark'}]}, 'OS-EXT-AZ:availability_zone': u'nova', 'id': u'd5c1575d-8ac8-4921-802c-e9b121acd82e', 'metadata': {}}>]"
2014-11-06 10:31:02,037 - my_logger - DEBUG - openstack_dashboard.dashboards.project.instances.views - Instance amount: 1, Instances: "[<Server: {'status': u'ACTIVE', 'OS-EXT-STS:task_state': None, 'addresses': {u'public': [{u'OS-EXT-IPS-MAC:mac_addr': u'fa:16:3e:13:01:55', u'version': 4, u'addr': u'172.24.4.6', u'OS-EXT-IPS:type': u'fixed'}]}, 'name': u'from_vol_cirr_nova', 'links': [{u'href': u'http://192.168.56.103:8774/v2/a0c581f7a88441ed84e9878fa9fc8e50/servers/d5c1575d-8ac8-4921-802c-e9b121acd82e', u'rel': u'self'}, {u'href': u'http://192.168.56.103:8774/a0c581f7a88441ed84e9878fa9fc8e50/servers/d5c1575d-8ac8-4921-802c-e9b121acd82e', u'rel': u'bookmark'}], 'created': u'2014-11-06T10:30:03Z', 'key_name': None, 'image': u'', 'OS-DCF:diskConfig': u'AUTO', 'image_name': '-', 'OS-EXT-STS:power_state': 1, 'OS-EXT-SRV-ATTR:host': u'ubuntu', 'OS-EXT-SRV-ATTR:instance_name': u'instance-00000004', 'tenant_id': u'a0c581f7a88441ed84e9878fa9fc8e50', 'user_id': u'2f8c907029eb43e5ab98a55ac28c885e', 'flavor': {u'id': u'1', u'links': [{u'href': u'http://192.168.56.103:8774/a0c581f7a88441ed84e9878fa9fc8e50/flavors/1', u'rel': u'bookmark'}]}, 'OS-EXT-AZ:availability_zone': u'test_az', 'id': u'd5c1575d-8ac8-4921-802c-e9b121acd82e', 'metadata': {}}>]"
2014-11-06 10:31:32,437 - my_logger - DEBUG - openstack_dashboard.dashboards.project.instances.views - Instance amount: 1, Instances: "[<Server: {'status': u'ACTIVE', 'OS-EXT-STS:task_state': None, 'addresses': {u'public': [{u'OS-EXT-IPS-MAC:mac_addr': u'fa:16:3e:13:01:55', u'version': 4, u'addr': u'172.24.4.6', u'OS-EXT-IPS:type': u'fixed'}]}, 'name': u'from_vol_cirr_nova', 'links': [{u'href': u'http://192.168.56.103:8774/v2/a0c581f7a88441ed84e9878fa9fc8e50/servers/d5c1575d-8ac8-4921-802c-e9b121acd82e', u'rel': u'self'}, {u'href': u'http://192.168.56.103:8774/a0c581f7a88441ed84e9878fa9fc8e50/servers/d5c1575d-8ac8-4921-802c-e9b121acd82e', u'rel': u'bookmark'}], 'created': u'2014-11-06T10:30:03Z', 'key_name': None, 'image': u'', 'OS-DCF:diskConfig': u'AUTO', 'image_name': '-', 'OS-EXT-STS:power_state': 1, 'OS-EXT-SRV-ATTR:host': u'ubuntu', 'OS-EXT-SRV-ATTR:instance_name': u'instance-00000004', 'tenant_id': u'a0c581f7a88441ed84e9878fa9fc8e50', 'user_id': u'2f8c907029eb43e5ab98a55ac28c885e', 'flavor': {u'id': u'1', u'links': [{u'href': u'http://192.168.56.103:8774/a0c581f7a88441ed84e9878fa9fc8e50/flavors/1', u'rel': u'bookmark'}]}, 'OS-EXT-AZ:availability_zone': u'test_az', 'id': u'd5c1575d-8ac8-4921-802c-e9b121acd82e', 'metadata': {}}>]"
Searching for 'availability_zone' in the log above shows that its value is inconsistent. At the beginning, when the 'task_state' is 'scheduling', the availability zone is 'nova' (the default one). Later, once the instance is 'ACTIVE' and has a host, the availability zone changes to 'test_az' (the newly-created one).
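To make the comparison easier than reading the full Server repr, a small helper can pull out just the relevant fields from each debug line. This is only a sketch: the function name summarize and the regular expressions are illustrative assumptions based on the log format shown above, and the sample lines are abbreviated excerpts of those log entries.

```python
import re

# Patterns match the fields as they appear in the Server repr logged above.
AZ_PATTERN = re.compile(r"'OS-EXT-AZ:availability_zone': u?'([^']*)'")
TASK_PATTERN = re.compile(r"'OS-EXT-STS:task_state': (?:u?'([^']*)'|None)")


def summarize(log_line):
    """Return (timestamp, task_state, availability_zone) for one debug line."""
    timestamp = log_line[:23]  # e.g. "2014-11-06 10:30:05,103"
    task = TASK_PATTERN.search(log_line)
    az = AZ_PATTERN.search(log_line)
    return (
        timestamp,
        task.group(1) if task else None,  # None when task_state is None
        az.group(1) if az else None,
    )


# Abbreviated excerpts of the first two log lines above ("..." marks cuts).
lines = [
    "2014-11-06 10:30:05,103 - my_logger - DEBUG - ... "
    "'OS-EXT-STS:task_state': u'scheduling', ... "
    "'OS-EXT-AZ:availability_zone': u'nova', ...",
    "2014-11-06 10:31:02,037 - my_logger - DEBUG - ... "
    "'OS-EXT-STS:task_state': None, ... "
    "'OS-EXT-AZ:availability_zone': u'test_az', ...",
]

for line in lines:
    print(summarize(line))
```

For the same instance, the first line yields 'scheduling' with 'nova', and the second yields no task state with 'test_az'.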
Based on some added debug logs, the most likely root cause is in the function "get_host_availability_zone()" in the module "nova.availability_zones".
In particular, in this function, if the variable "host" is None, then the returned "aggregates" and "metadata" will also be None. Consequently, the final retrieved value of "az" will be the default one of "nova" that is hard-coded in the module.
While a new VM instance is being launched, this function is called multiple times. On the first call, the passed value of "host" is always None (and the task_state is SCHEDULING). On subsequent calls, however, "host" is set correctly (no longer None) and the correct availability zone is returned.
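The behaviour described above can be modelled with a simplified sketch. This is not Nova's actual code: the function name get_host_availability_zone_sketch, the AGGREGATE_METADATA dictionary, and the DEFAULT_AZ constant are hypothetical stand-ins that only mimic the fallback to the default zone when "host" is None. The host name 'ubuntu' and the zone 'test_az' are taken from the log above.

```python
DEFAULT_AZ = "nova"  # stand-in for the default availability zone name

# Hypothetical stand-in for the aggregate lookup: host name -> metadata.
AGGREGATE_METADATA = {"ubuntu": {"availability_zone": "test_az"}}


def get_host_availability_zone_sketch(host):
    """Simplified model of the fallback described above, not Nova's code."""
    if host is None:
        # Without a host, no aggregate metadata can be looked up.
        metadata = None
    else:
        metadata = AGGREGATE_METADATA.get(host)

    if metadata and "availability_zone" in metadata:
        return metadata["availability_zone"]
    # No metadata found: fall back to the default zone.
    return DEFAULT_AZ


# First call during scheduling: the instance has no host yet.
print(get_host_availability_zone_sketch(None))
# Later calls: the instance has been placed on the host.
print(get_host_availability_zone_sketch("ubuntu"))
```

Under these assumptions, the first call returns the default zone and the later call returns the aggregate's zone, which matches the change from 'nova' to 'test_az' seen in the log.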
So, the question is: why is the "host" value passed as None when the task_state is SCHEDULING?
Further investigation is being carried out.
Horizon is not at fault here, but it is affected.
Nova is the one at fault.