tempest doesn't always check to make sure deleting servers worked

Bug #1372696 reported by Joe Gordon
This bug affects 1 person
Affects | Status  | Importance | Assigned to | Milestone
tempest | Invalid | Undecided  | Joe Gordon  |

Bug Description

tempest doesn't always check to make sure deleting servers worked

For example, tempest/api/compute/images/test_images_negative.py:test_create_image_from_stopped_server calls: self.addCleanup(self.servers_client.delete_server, server['id'])

This means that it's easier for delete-server bugs to go unnoticed, and it will also cause tempest to leak resources.

However, the nova command to delete a server is asynchronous, and tempest/services/compute/json/servers_client.py:delete_server doesn't wait until the server is done deleting. There is a separate method that does that, tempest/services/compute/json/servers_client.py:wait_for_server_termination, but it would be nice if there were a single function that did both: delete_server_and_wait_for_termination or something like that.
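A minimal sketch of what such a combined helper could look like. FakeServersClient is a hypothetical in-memory stand-in, not tempest's real servers_client; only the method names delete_server and wait_for_server_termination come from the report, and their signatures here (including timeout and interval) are assumptions.

```python
import time


class FakeServersClient:
    """Hypothetical stand-in for tempest's servers_client (not the real API)."""

    def __init__(self, polls_until_gone=2):
        self._servers = {}  # server id -> polls left before it vanishes
        self._polls_until_gone = polls_until_gone

    def create_server(self, server_id):
        self._servers[server_id] = None  # None means "no delete requested"

    def delete_server(self, server_id):
        # Asynchronous: only marks the server as deleting, then returns.
        self._servers[server_id] = self._polls_until_gone

    def server_exists(self, server_id):
        if server_id not in self._servers:
            return False
        remaining = self._servers[server_id]
        if remaining is not None:
            if remaining <= 0:
                del self._servers[server_id]
                return False
            # Each poll moves the fake deletion one step closer to done.
            self._servers[server_id] = remaining - 1
        return True

    def wait_for_server_termination(self, server_id, timeout=5.0, interval=0.01):
        deadline = time.monotonic() + timeout
        while self.server_exists(server_id):
            if time.monotonic() > deadline:
                raise TimeoutError("server %s was not deleted in time" % server_id)
            time.sleep(interval)


def delete_server_and_wait_for_termination(client, server_id):
    """Request the delete, then block until the server is really gone."""
    client.delete_server(server_id)
    client.wait_for_server_termination(server_id)
```

With a helper like this, a test could register self.addCleanup(delete_server_and_wait_for_termination, client, server['id']) so that a delete which never completes surfaces as a timeout instead of passing silently.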

Found while investigating https://bugs.launchpad.net/nova/+bug/1372670 (using http://logs.openstack.org/43/116443/7/check/check-tempest-dsvm-full/6ccbfa9/logs/tempest.txt.gz)

Joe Gordon (jogo) wrote :

The fix should be straightforward code-wise, but I'm not sure what the preferred way of fixing this is. Where should the new function live? Do we want non-nova tests to fail if a nova delete fails?

description: updated
Matthew Treinish (treinish) wrote :

So the test referenced does wait for the delete, but as a part of its tearDownClass. The test uses the common create_server call from the base test class to create the server:

http://git.openstack.org/cgit/openstack/tempest/tree/tempest/api/compute/images/test_images_negative.py#n69

which appends the server to the list of servers to be torn down during tearDownClass:

http://git.openstack.org/cgit/openstack/tempest/tree/tempest/api/compute/base.py#n255

and

http://git.openstack.org/cgit/openstack/tempest/tree/tempest/api/compute/base.py#n143

So I'm not sure where the bug here is exactly. Delete will be called twice on the server, once because of addCleanup and once because of clear_servers(), but that shouldn't be an issue because clear_servers() is written to be fault tolerant.
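The ordering described here can be reproduced with plain unittest: functions registered with addCleanup run right after each test, and tearDownClass runs once the class's tests are finished, so the second delete finds nothing left. The sketch below is only an illustration; FakeServersClient, FakeNotFound, and the inline try/except standing in for clear_servers() are hypothetical, not tempest code.

```python
import unittest


class FakeNotFound(Exception):
    """Hypothetical stand-in for the 404 a real client raises."""


class FakeServersClient:
    """Hypothetical in-memory client holding a single server."""

    def __init__(self):
        self.servers = {"server-1"}

    def delete_server(self, server_id):
        if server_id not in self.servers:
            raise FakeNotFound(server_id)
        self.servers.remove(server_id)


def run_example():
    calls = []
    client = FakeServersClient()

    class ExampleTest(unittest.TestCase):
        @classmethod
        def tearDownClass(cls):
            # Mimics a fault-tolerant clear_servers(): a missing server is fine.
            try:
                client.delete_server("server-1")
                calls.append("tearDownClass: deleted")
            except FakeNotFound:
                calls.append("tearDownClass: already gone")

        def test_something(self):
            self.addCleanup(self.delete_from_cleanup)

        def delete_from_cleanup(self):
            client.delete_server("server-1")
            calls.append("addCleanup: deleted")

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return calls, result
```

Calling run_example() returns the recorded call order together with the unittest result, which makes it easy to see that the cleanup delete happens first and the class-level delete is the tolerated duplicate.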

Although, looking at the test code, there probably is an issue with the clear_servers() method: it is designed not to hang up on an exception, but we're also not logging one if we hit it. It's probably worth improving the logic in clear_servers() to log exceptions on errors other than 404s before moving on to the next server, so we have a record that the delete failed. I don't think we necessarily want to fail here; if the delete was an important part of what was being tested, it should be done explicitly in the test method.
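A rough sketch of that logging behaviour, under stated assumptions: this is not tempest's actual clear_servers(), FakeNotFound stands in for whatever exception the real client raises on a 404, and returning the list of failed ids is an addition made only so the example can be checked.

```python
import logging

LOG = logging.getLogger(__name__)


class FakeNotFound(Exception):
    """Hypothetical stand-in for a 404 from the compute API."""


class FakeServersClient:
    """Hypothetical client whose deletes fail as configured."""

    def __init__(self, outcomes):
        self._outcomes = outcomes  # server id -> exception to raise, or None

    def delete_server(self, server_id):
        error = self._outcomes.get(server_id)
        if error is not None:
            raise error


def clear_servers(client, servers):
    """Delete every server; log unexpected errors instead of raising them."""
    failed = []
    for server in servers:
        try:
            client.delete_server(server["id"])
        except FakeNotFound:
            # Already deleted, e.g. by an earlier addCleanup: nothing to log.
            pass
        except Exception:
            # Keep a record of the failure, then move on to the next server.
            LOG.exception("Deleting server %s failed", server["id"])
            failed.append(server["id"])
    return failed
```

Treating the 404 case separately keeps the log quiet for the expected double delete, while any other failure leaves a traceback behind without failing the run.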

Changed in tempest:
status: New → Incomplete
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/123555

OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/123562

Joe Gordon (jogo)
Changed in tempest:
assignee: nobody → Joe Gordon (jogo)
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tempest (master)

Reviewed: https://review.openstack.org/123555
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=0c33579fe94712f84e4641ae9f1ea66033144454
Submitter: Jenkins
Branch: master

commit 0c33579fe94712f84e4641ae9f1ea66033144454
Author: Joe Gordon <email address hidden>
Date: Tue Sep 23 12:36:11 2014 -0700

    Log deletion errors in clean servers

    Although we don't want to cause tempest to fail on error while deleting
    a server in tearDownClass as these types of issues are hard to debug. We
    should at least log the issues instead of hiding them completely.

    Change-Id: Id0c172e247ecfcd4eec85bafd4787e33944cee1a
    Related-Bug: #1372696

Yaroslav Lobankov (ylobankov) wrote :

The server delete is called twice: in addCleanup and in clear_servers(). The method clear_servers() has a check to wait for server termination (http://paste.openstack.org/show/328140/). So it looks like this bug should be "Invalid" rather than "Incomplete".

Changed in tempest:
status: Incomplete → Invalid