test_resource_group functional test times out for delete

Bug #1617130 reported by Rabi Mishra
This bug affects 2 people
Affects: OpenStack Heat
Status: New
Importance: Medium
Assigned to: Rabi Mishra

Bug Description

test_resource_group.ResourceGroupErrorResourceTest.test_fail fails to reach the expected status and times out [1].

[1] http://logs.openstack.org/31/360831/2/check/gate-heat-dsvm-functional-orig-mysql-lbaasv2/660a7bb/console.html#_2016-08-26_01_36_02_760206

2016-08-26 01:36:02.750391 | 2016-08-26 01:36:02.749 | Captured traceback:
2016-08-26 01:36:02.753642 | 2016-08-26 01:36:02.752 | ~~~~~~~~~~~~~~~~~~~
2016-08-26 01:36:02.756755 | 2016-08-26 01:36:02.755 | Traceback (most recent call last):
2016-08-26 01:36:02.760206 | 2016-08-26 01:36:02.758 | File "/opt/stack/new/heat/heat_integrationtests/functional/test_resource_group.py", line 505, in test_fail
2016-08-26 01:36:02.762890 | 2016-08-26 01:36:02.762 | success_on_not_found=True)
2016-08-26 01:36:02.765993 | 2016-08-26 01:36:02.764 | File "/opt/stack/new/heat/heat_integrationtests/common/test.py", line 349, in _wait_for_stack_status
2016-08-26 01:36:02.769014 | 2016-08-26 01:36:02.768 | raise exceptions.TimeoutException(message)
2016-08-26 01:36:02.772426 | 2016-08-26 01:36:02.771 | heat_integrationtests.common.exceptions.TimeoutException: Request timed out
2016-08-26 01:36:02.776086 | 2016-08-26 01:36:02.774 | Details: Stack ResourceGroupErrorResourceTest-1839635946/10d52b1c-b371-498a-886d-6d3e8ca7b90f failed to reach DELETE_COMPLETE status within the required time (1200 s).
2016-08-26 01:36:02.779161 | 2016-08-26 01:36:02.778 |
2016-08-26 01:37:18.315074 | 2016-08-26 01:37:18.314 | {5} heat_integrationtests.functional.test_software_config.ParallelDeploymentsTest.test_deployments_metadata [75.567784s] ... ok
2016-08-26 01:37:52.820041 | 2016-08-26 01:37:52.819 | {5} heat_integrationtests.functional.test_software_config.ZaqarSignalTransportTest.test_signal_queues [34.504027s] ... ok
2016-08-26 01:37:58.904195 | 2016-08-26 01:37:58.903 | {5} heat_integrationtests.functional.test_stack_outputs.StackOutputsTest.test_outputs [6.082737s] ... ok
2016-08-26 01:38:17.927925 | 2016-08-26 01:38:17.927 | {5} heat_integrationtests.functional.test_template_resource.TemplateResourceSuspendResumeTest.test_suspend_resume [19.022570s] ... ok
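
For readers unfamiliar with the helper named in the traceback, the timeout comes from a poll-until-status loop. The following is a minimal sketch of that pattern, not Heat's actual `_wait_for_stack_status`: the names `wait_for_status`, `get_status`, `NotFound`, `clock`, and `sleep` are hypothetical and only illustrate how a 1200 s limit and `success_on_not_found=True` would interact.

```python
import time


class TimeoutException(Exception):
    """Raised when the expected status is not reached in time."""


class NotFound(Exception):
    """Hypothetical stand-in for a 'stack not found' API error."""


def wait_for_status(get_status, expected, timeout=1200.0, interval=1.0,
                    success_on_not_found=False,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until it returns `expected` or `timeout` elapses."""
    deadline = clock() + timeout
    while True:
        try:
            status = get_status()
        except NotFound:
            # A deleted stack may vanish entirely; treat that as success
            # only if the caller asked for it.
            if success_on_not_found:
                return
            raise
        if status == expected:
            return
        if clock() >= deadline:
            raise TimeoutException(
                "failed to reach %s status within the required time (%s s)"
                % (expected, timeout))
        sleep(interval)
```

In this model a delete that never starts looks the same to the caller as a delete that is slow: the status simply never changes, and the loop runs until the deadline.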

Tags: gate-failure
Rabi Mishra (rabi)
description: updated
Rabi Mishra (rabi) wrote :

This seems to have been happening (for other tests too) since last week, though there are no obvious errors in the logs.

Logstash query:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%20%5C%22DELETE_COMPLETE%20status%20within%20the%20required%20time%20(1200%20s)%5C%22
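
The query is percent-encoded inside the URL fragment, which makes it hard to read. As a small convenience sketch (the variable names are illustrative), the search string can be recovered with the Python standard library:

```python
from urllib.parse import unquote, urlsplit

url = (
    "http://logstash.openstack.org/#dashboard/file/logstash.json"
    "?query=message%3A%20%5C%22DELETE_COMPLETE%20status%20within%20"
    "the%20required%20time%20(1200%20s)%5C%22"
)

# Everything after '#' is the fragment; the dashboard keeps its query there.
fragment = urlsplit(url).fragment
encoded_query = fragment.split("?query=", 1)[1]

# Decode the percent-escapes to get the Logstash search string.
query = unquote(encoded_query)
print(query)
```

The decoded string is a `message:` search for the phrase "DELETE_COMPLETE status within the required time (1200 s)", i.e. it matches the timeout message shown in the traceback above.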

Zane Bitter (zaneb) wrote :

Looking at the log http://logs.openstack.org/31/360831/2/check/gate-heat-dsvm-functional-orig-mysql-lbaasv2/660a7bb/logs/screen-h-eng.txt.gz the sequence of events is roughly as follows:

- The TestResources in both nested stacks (member 0 and member 1) fail as expected.
- The ResourceGroup notices that member 1 has failed first, and tries to cancel member 0.
- Member 0 has in fact already failed, but the ResourceGroup hasn't noticed yet.
- The ResourceGroup and then the main stack go to FAILED, and the test starts a delete.
- We get an RPC message to delete member 0 but then never hear from it again in the logs.

It's tempting to think that the cancel might somehow be in a race that ends up cancelling the delete. However, (a) the cancellation of the ResourceGroup create and all the calls to put the cancel message in the queues are synchronous, so they ought to be guaranteed to happen before the delete starts; and (b) the delete operation does not listen on a queue for a cancel message anyway.
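
The reasoning in (a) and (b) can be illustrated with a toy model. This is only a sketch under stated assumptions, not Heat's engine code: `CANCEL`, `run_create`, and `run_delete` are hypothetical names, and a plain `queue.Queue` stands in for the per-stack message queue.

```python
import queue

CANCEL = "cancel"  # hypothetical message value


def run_create(msg_queue, steps):
    """Create-like operation: checks the queue before each step."""
    done = 0
    for _ in range(steps):
        try:
            if msg_queue.get_nowait() == CANCEL:
                return ("CREATE", "FAILED", done)
        except queue.Empty:
            pass
        done += 1
    return ("CREATE", "COMPLETE", done)


def run_delete(steps):
    """Delete-like operation: never reads the queue."""
    return ("DELETE", "COMPLETE", steps)


q = queue.Queue()
q.put(CANCEL)  # (a) synchronous: enqueued before this call returns
create_result = run_create(q, steps=3)
delete_result = run_delete(steps=3)  # (b) unaffected by any cancel message
print(create_result, delete_result)
```

In this model the cancel is always delivered before the delete begins, and the delete has no code path that could observe it, which is why a cancel/delete race does not explain the hang.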

So somehow the delete is failing after receiving the RPC message ("Deleting stack" msg in service.py) but before getting to "Stack DELETE IN_PROGRESS" (at the start of the Stack.delete() thread), and nothing is being logged.

Zane Bitter (zaneb)
tags: added: gate-failure
Rico Lin (rico-lin)
Changed in heat:
milestone: none → no-priority-tag-bugs