tempest.api.orchestration.stacks.test_neutron_resources.NeutronResourcesTestJSON failed to reach CREATE_COMPLETE status within the required time

Bug #1288970 reported by Steve Baker
This bug affects 2 people
Affects         Status        Importance  Assigned to  Milestone
OpenStack Heat  Fix Released  High        Steve Baker
tempest         Fix Released  Undecided   Steve Baker

Bug Description

This is an intermittent error which happens for ~20% of heat-slow runs and is likely to be related to failure of orchestrated resources rather than heat itself.

2014-03-06 19:34:36.985 | setUpClass (tempest.api.orchestration.stacks.test_neutron_resources.NeutronResourcesTestJSON)
2014-03-06 19:34:36.985 | setUpClass (tempest.api.orchestration.stacks.test_neutron_resources.NeutronResourcesTestJSON) ... FAIL
2014-03-06 19:37:56.452 | tempest.api.orchestration.stacks.test_server_cfn_init.ServerCfnInitTestJSON.test_can_log_into_created_server[slow]
2014-03-06 19:37:56.452 | tempest.api.orchestration.stacks.test_server_cfn_init.ServerCfnInitTestJSON.test_can_log_into_created_server[slow] ... ok
2014-03-06 19:39:15.598 | tempest.api.orchestration.stacks.test_server_cfn_init.ServerCfnInitTestJSON.test_stack_wait_condition_data[slow]
2014-03-06 19:39:15.599 | tempest.api.orchestration.stacks.test_server_cfn_init.ServerCfnInitTestJSON.test_stack_wait_condition_data[slow] ... ok
2014-03-06 19:39:24.591 | tempest.scenario.orchestration.test_autoscaling.AutoScalingTest.test_scale_up_then_down[compute,orchestration,slow]
2014-03-06 19:39:24.591 | tempest.scenario.orchestration.test_autoscaling.AutoScalingTest.test_scale_up_then_down[compute,orchestration,slow] ... skipped u'Skipped until Bug: 1257575 is resolved.'
2014-03-06 19:39:24.691 |
2014-03-06 19:39:24.691 | process-returncode
2014-03-06 19:39:24.692 | process-returncode ... FAIL
2014-03-06 19:39:24.734 |
2014-03-06 19:39:24.734 | ======================================================================
2014-03-06 19:39:24.734 | FAIL: setUpClass (tempest.api.orchestration.stacks.test_neutron_resources.NeutronResourcesTestJSON)
2014-03-06 19:39:24.735 | setUpClass (tempest.api.orchestration.stacks.test_neutron_resources.NeutronResourcesTestJSON)
2014-03-06 19:39:24.735 | ----------------------------------------------------------------------
2014-03-06 19:39:24.735 | _StringException: Traceback (most recent call last):
2014-03-06 19:39:24.735 | File "tempest/api/orchestration/stacks/test_neutron_resources.py", line 144, in setUpClass
2014-03-06 19:39:24.735 | raise e
2014-03-06 19:39:24.735 | TimeoutException: Request timed out
2014-03-06 19:39:24.735 | Details: Stack heat-1118778629 failed to reach CREATE_COMPLETE status within the required time (300 s).

Tags: gate-failure
Steve Baker (steve-stevebaker) wrote :

Here is an example of a typical failure:
http://logs.openstack.org/86/78286/2/check/check-tempest-dsvm-neutron-heat-slow/ee145bf/

Because there was a failure, the boot log of the orchestrated server is included in the output (search for tempest/api/orchestration/stacks/test_neutron_resources.py:143):
http://logs.openstack.org/86/78286/2/check/check-tempest-dsvm-neutron-heat-slow/ee145bf/logs/tempest.txt.gz

The timeout for this entire test is 300s, yet here the time for booting alone is 244s.

I believe this test would become significantly more reliable if the timeout were raised, perhaps to 600s.
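
To illustrate why the margin matters, here is a minimal sketch of a polling wait with a build timeout. It is not the tempest implementation: the function name, the `get_status` callback, and the injectable `clock`/`sleep` parameters are all hypothetical, chosen only so the behaviour can be exercised without a real stack.

```python
import time


class StackTimeout(Exception):
    """Raised when the stack does not reach the wanted status in time."""


def wait_for_stack_status(get_status, wanted="CREATE_COMPLETE",
                          build_timeout=300, build_interval=1,
                          clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until it returns `wanted` or build_timeout expires.

    get_status is a hypothetical callable returning the current stack
    status string; clock and sleep are injectable for testing.
    Returns the elapsed time in seconds on success.
    """
    start = clock()
    while True:
        status = get_status()
        if status == wanted:
            return clock() - start
        if status.endswith("_FAILED"):
            # Fail fast instead of waiting out the full timeout.
            raise RuntimeError("stack entered %s" % status)
        if clock() - start >= build_timeout:
            raise StackTimeout(
                "failed to reach %s within the required time (%s s)"
                % (wanted, build_timeout))
        sleep(build_interval)
```

With a 244s boot, anything else in the stack that takes more than 56s pushes the total past a 300s budget, whereas a 600s budget leaves room for slow gate nodes.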

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tempest (master)

Fix proposed to branch: master
Review: https://review.openstack.org/78756

Changed in tempest:
assignee: nobody → Steve Baker (steve-stevebaker)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tempest (master)

Reviewed: https://review.openstack.org/78756
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=27f02430864443c127dbf4a09a62497c60fd0aa4
Submitter: Jenkins
Branch: master

commit 27f02430864443c127dbf4a09a62497c60fd0aa4
Author: Steve Baker <email address hidden>
Date: Fri Mar 7 09:47:32 2014 +1300

    Raise orchestration build_timeout to 600 seconds

    Observed boot time for a single server has been around 250 seconds
    so a build_timeout default of 300 would explain why ~20% of heat-slow
    jobs are failing with stack timeout errors.

    Closes-Bug: #1288970
    Change-Id: I4feb1b89acf8db0e164468d0471aff71ff5c6a77

Changed in tempest:
status: In Progress → Fix Released
Attila Fazekas (afazekas) wrote :

The ERROR message below is visible in logs/screen-h-eng.txt.

Logstash:
message: "DB error resource with id" AND NOT build_status:"FAILURE": 0 hits (no non-failed jobs)
message: "DB error resource with id" AND build_status:"FAILURE": 85 hits

http://logs.openstack.org/19/82519/1/gate/gate-tempest-dsvm-neutron-heat-slow/e26bec1/logs/screen-h-eng.txt.gz?level=WARNING#_2014-04-02_17_21_10_558

Zane Bitter (zaneb)
Changed in heat:
assignee: nobody → Steve Baker (steve-stevebaker)
Angus Salkeld (asalkeld)
tags: added: gate-failure
Angus Salkeld (asalkeld)
Changed in heat:
importance: Undecided → High
status: New → Triaged
Angus Salkeld (asalkeld) wrote :

<stevebaker> asalkeld: that can be closed. that test uses cirros as of last week

Changed in heat:
status: Triaged → Fix Committed
Thierry Carrez (ttx)
Changed in heat:
milestone: none → kilo-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in heat:
milestone: kilo-rc1 → 2015.1.0