test_list_servers_by_admin_with_all_tenants fails with InstanceNotFound trying to lazy-load flavor

Bug #1487570 reported by Matt Riedemann
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Matt Riedemann

Bug Description

http://logs.openstack.org/70/215170/1/check/gate-tempest-dsvm-nova-v21-full/3fdc0d6/console.html#_2015-08-21_16_04_53_513

2015-08-21 16:04:53.514 | Captured traceback:
2015-08-21 16:04:53.514 | ~~~~~~~~~~~~~~~~~~~
2015-08-21 16:04:53.514 | Traceback (most recent call last):
2015-08-21 16:04:53.514 | File "tempest/api/compute/admin/test_servers.py", line 81, in test_list_servers_by_admin_with_all_tenants
2015-08-21 16:04:53.514 | body = self.client.list_servers(detail=True, **params)
2015-08-21 16:04:53.514 | File "tempest/services/compute/json/servers_client.py", line 159, in list_servers
2015-08-21 16:04:53.514 | resp, body = self.get(url)
2015-08-21 16:04:53.514 | File "/opt/stack/new/tempest/.tox/full/local/lib/python2.7/site-packages/tempest_lib/common/rest_client.py", line 271, in get
2015-08-21 16:04:53.514 | return self.request('GET', url, extra_headers, headers)
2015-08-21 16:04:53.514 | File "/opt/stack/new/tempest/.tox/full/local/lib/python2.7/site-packages/tempest_lib/common/rest_client.py", line 643, in request
2015-08-21 16:04:53.515 | resp, resp_body)
2015-08-21 16:04:53.515 | File "/opt/stack/new/tempest/.tox/full/local/lib/python2.7/site-packages/tempest_lib/common/rest_client.py", line 754, in _error_checker
2015-08-21 16:04:53.515 | raise exceptions.ServerFault(resp_body, message=message)
2015-08-21 16:04:53.515 | tempest_lib.exceptions.ServerFault: Got server fault
2015-08-21 16:04:53.515 | Details: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
2015-08-21 16:04:53.515 | <class 'nova.exception.InstanceNotFound'>

There is a trace in the n-api logs when trying to lazy-load a flavor on an instance:

http://logs.openstack.org/70/215170/1/check/gate-tempest-dsvm-nova-v21-full/3fdc0d6/logs/screen-n-api.txt.gz?level=TRACE#_2015-08-21_15_39_06_148

2015-08-21 15:39:06.148 ERROR nova.api.openstack.extensions [req-5eca1fa9-7948-4a3d-bc80-7e84441bb74e tempest-ServersAdminTestJSON-973647583 tempest-ServersAdminTestJSON-692147874] Unexpected exception in API method
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions Traceback (most recent call last):
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/api/openstack/extensions.py", line 478, in wrapped
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions return f(*args, **kwargs)
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/api/openstack/compute/servers.py", line 264, in detail
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions servers = self._get_servers(req, is_detail=True)
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/api/openstack/compute/servers.py", line 389, in _get_servers
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions response = self._view_builder.detail(req, instance_list)
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/api/openstack/compute/views/servers.py", line 126, in detail
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions return self._list_view(self.show, request, instances, coll_name)
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/api/openstack/compute/views/servers.py", line 138, in _list_view
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions server_list = [func(request, server)["server"] for server in servers]
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/api/openstack/compute/views/servers.py", line 266, in show
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions "flavor": self._get_flavor(request, instance),
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/api/openstack/compute/views/servers.py", line 198, in _get_flavor
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions instance_type = instance.get_flavor()
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/objects/instance.py", line 890, in get_flavor
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions return getattr(self, attr)
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 65, in getter
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions self.obj_load_attr(name)
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/objects/instance.py", line 880, in obj_load_attr
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions self._load_flavor()
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/objects/instance.py", line 811, in _load_flavor
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions expected_attrs=['flavor', 'system_metadata'])
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 169, in wrapper
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions result = fn(cls, context, *args, **kwargs)
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/objects/instance.py", line 435, in get_by_uuid
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions use_slave=use_slave)
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/db/api.py", line 644, in instance_get_by_uuid
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions columns_to_join, use_slave=use_slave)
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 230, in wrapper
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions return f(*args, **kwargs)
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 1722, in instance_get_by_uuid
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions columns_to_join=columns_to_join, use_slave=use_slave)
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 1734, in _instance_get_by_uuid
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions raise exception.InstanceNotFound(instance_id=uuid)
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions InstanceNotFound: Instance b4025d7a-c2ef-4234-a6ee-28edf966bfe7 could not be found.
2015-08-21 15:39:06.148 22838 ERROR nova.api.openstack.extensions

So between the time that we've retrieved the flavor from the database and the time we try to lazy-load the flavor, the instance is deleted and this fails.

We could pre-load the flavor in this case, or read_deleted='yes' when we're lazy-loading a flavor.

Revision history for this message
Matt Riedemann (mriedem) wrote :
Changed in nova:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
Matt Riedemann (mriedem) wrote :

OK, so test_list_servers_by_admin_with_all_tenants is just a test where the admin user lists servers for all tenants. Given the concurrent nature of tempest runs, there could be other tests deleting an instance after the initial list call is made by the test_list_servers_by_admin_with_all_tenants test but before the instance is actually deleted, that's why when we lazy-load the flavor the instance is gone and it fails.

So we could:

1. pre-load expected_attrs when listing details for instances if doing --all-tenants
2. handle InstanceNotFound when looping over the instances pulling details and we're doing --all-tenants (or if context.user_id != instance.user_id)

Changed in nova:
status: Confirmed → Triaged
Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
Matt Riedemann (mriedem) wrote :

(1:12:23 PM) mriedem: superdan: ah, ok, so it's test_list_servers_by_admin_with_all_tenants that's failing,
(1:12:33 PM) mriedem: which is a test where the admin just lists servers for all tenants
(1:12:38 PM) mriedem: so yeah definitely a chance for a race there
(1:12:46 PM) mriedem: given multi worker tempest
(1:12:54 PM) mriedem: one test is tearing down an instance after test_list_servers_by_admin_with_all_tenants started it's query
(1:13:18 PM) mriedem: so if --all-tenants is true when listing servers, we should be lazy-loading the stuff that will be viewed
(1:14:30 PM) mriedem: or we just handle InstanceNotFound in that case
(1:27:48 PM) mriedem: seems better to handle the InstanceNotFound in the all-tenants list case, otherwise we have to know to pre-load flavor, info_cache and metadata since that's what the detail view uses
(1:28:00 PM) mriedem: and would tightly couple the api extension code to the view builder, which is gross
(1:28:39 PM) mriedem: or we read_deleted=yes when lazy-loading, as noted, but if we want to remove soft delete then it seems we wouldn't want to use read_deleted=yes in more places if we can help it

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/215774

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/215859

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Matt Riedemann (<email address hidden>) on branch: master
Review: https://review.openstack.org/215774
Reason: I like this approach better:

https://review.openstack.org/#/c/215859/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/215859
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2e68b2298e94a15d1282c0fb46804b9efa6c8b3a
Submitter: Jenkins
Branch: master

commit 2e68b2298e94a15d1282c0fb46804b9efa6c8b3a
Author: Matt Riedemann <email address hidden>
Date: Fri Aug 21 18:44:52 2015 -0700

    Pre-load expected attrs that the view builder needs for server details

    There is a race between getting the instances from the database and
    when the view builder shows the details. If an instance is deleted
    before lazy-loading some info, like the flavor, then the view
    builder fails with an InstanceNotFound and the request returns a 500
    in the v2.1 case and a 404 in the v2 case.

    We can have the view builder tell the API code what attributes it
    expects when it gets the instance from the database so that we don't
    have to do any lazy loading in the view builder.

    Change-Id: Ie03ab415baaa9e88c423a1c60a40d53b4cc464c4
    Closes-Bug: #1487570

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → liberty-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: liberty-3 → 12.0.0
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.