tempest test_aggregate_add_host_create_server_with_az fails with "server failed to build and is ERROR status"

Bug #1217163 reported by John Griffith
This bug affects 6 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Matthew Treinish

Bug Description

2013-08-27 01:59:04.280 | ======================================================================
2013-08-27 01:59:04.280 | FAIL: tempest.api.compute.admin.test_aggregates.AggregatesAdminTestJSON.test_aggregate_add_host_create_server_with_az[gate]
2013-08-27 01:59:04.281 | tempest.api.compute.admin.test_aggregates.AggregatesAdminTestJSON.test_aggregate_add_host_create_server_with_az[gate]
2013-08-27 01:59:04.281 | ----------------------------------------------------------------------
2013-08-27 01:59:04.281 | _StringException: Empty attachments:
2013-08-27 01:59:04.282 | stderr
2013-08-27 01:59:04.282 | stdout
2013-08-27 01:59:04.282 |
2013-08-27 01:59:04.282 | Traceback (most recent call last):
2013-08-27 01:59:04.283 | File "tempest/openstack/common/lockutils.py", line 246, in inner
2013-08-27 01:59:04.283 | return f(*args, **kwargs)
2013-08-27 01:59:04.283 | File "tempest/api/compute/admin/test_aggregates.py", line 218, in test_aggregate_add_host_create_server_with_az
2013-08-27 01:59:04.284 | servers_client.wait_for_server_status(server['id'], 'ACTIVE')
2013-08-27 01:59:04.284 | File "tempest/services/compute/json/servers_client.py", line 165, in wait_for_server_status
2013-08-27 01:59:04.284 | raise exceptions.BuildErrorException(server_id=server_id)
2013-08-27 01:59:04.285 | BuildErrorException: Server 9175e4e3-3186-410a-931f-d03cdbf2c818 failed to build and is in ERROR status
2013-08-27 01:59:04.285 |
2013-08-27 01:59:04.285 |
2013-08-27 01:59:04.285 | ======================================================================
2013-08-27 01:59:04.286 | FAIL: process-returncode
2013-08-27 01:59:04.286 | process-returncode
2013-08-27 01:59:04.286 | ----------------------------------------------------------------------
2013-08-27 01:59:04.287 | _StringException: Binary content:
2013-08-27 01:59:04.287 | traceback (test/plain; charset="utf8")

Matthew Treinish (treinish) wrote :

Looking at the logs from some of the rechecks on this bug, this looks like a nova-network issue. See:

http://logs.openstack.org/68/43868/1/check/gate-tempest-devstack-vm-postgres-full/37db98a/logs/screen-n-net.txt.gz#_2013-08-27_12_05_34_349

and

http://logs.openstack.org/68/43868/1/check/gate-tempest-devstack-vm-postgres-full/37db98a/logs/screen-n-cpu.txt.gz#_2013-08-27_12_05_34_411

I don't think any changes will be needed on the tempest side, but I'll wait until there is a fix on the nova side.

Also, I believe I've seen this error on other tests, so I don't think there is anything aggregates-specific about it.

Changed in tempest:
status: New → Triaged
importance: Undecided → Critical
assignee: nobody → Matthew Treinish (treinish)
Changed in nova:
status: New → Confirmed
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/43929

Changed in nova:
assignee: nobody → Matthew Treinish (treinish)
status: Confirmed → In Progress
Changed in nova:
milestone: none → havana-3
Changed in tempest:
milestone: none → havana-3
Joe Gordon (jogo)
summary: - tempet test_aggregates fails
+ tempest test_aggregates fails
Mark McLoughlin (markmc)
summary: - tempest test_aggregates fails
+ tempest test_aggregate_add_host_create_server_with_az fails with "server
+ failed to build and is ERROR status"
Changed in nova:
assignee: Matthew Treinish (treinish) → Matt Riedemann (mriedem)
Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → Matthew Treinish (treinish)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/43929
Committed: http://github.com/openstack/nova/commit/7baa5265e36d76e48fdcb50b9e9a6edc45c0475e
Submitter: Jenkins
Branch: master

commit 7baa5265e36d76e48fdcb50b9e9a6edc45c0475e
Author: Matthew Treinish <email address hidden>
Date: Tue Aug 27 12:01:57 2013 -0400

    Fix race when running initialize_gateway_device()

    If multiple calls that result in initialize_gateway_device() being
    run occur at roughly the same time then there is a race between the
    ip route commands being run at the same time. This will cause
    instances to go into an error state. This commit adds a global lock
    to the initialize_gateway_device() method to prevent it from being
    run at the same time to avoid this issue.

    The race condition is not directly testable in unit tests because it
    requires a multithreaded environment to run
    initialize_gateway_device() at the same time. It was uncovered
    with tempest in parallel.

    Fixes bug 1217163

    Change-Id: Ib750381636d1341062928d0abc8d3518e327935e
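
The approach described in the commit message (a single global lock so that only one caller runs initialize_gateway_device() at a time) can be illustrated with a small standalone sketch. This is not the code from the commit: it uses only the Python standard library, the synchronized() decorator and the lock name "lock_gateway" are hypothetical, and FakeRoutingTable is an assumed stand-in for the host routing table that the real function manipulates with ip route commands. The lock here is in-process only, which may differ from what nova actually uses.

```python
import functools
import threading
import time

# Registry of named locks; hypothetical helper, not nova's implementation.
_named_locks = {}
_named_locks_guard = threading.Lock()


def synchronized(name):
    """Serialize every call to functions decorated with the same name."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Look up (or create) the shared lock for this name.
            with _named_locks_guard:
                lock = _named_locks.setdefault(name, threading.Lock())
            with lock:
                return func(*args, **kwargs)
        return wrapper
    return decorator


class FakeRoutingTable:
    """Assumed stand-in for the host routing table; not nova code."""

    def __init__(self):
        self.routes = []
        self.active = 0
        self.max_active = 0
        self._stats_lock = threading.Lock()

    def move_routes(self, dev):
        # Track how many callers are inside this method at once.
        with self._stats_lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        saved = list(self.routes)      # read the current routes
        time.sleep(0.001)              # window where a second caller could interfere
        self.routes = saved + [dev]    # write the routes back with the new device
        with self._stats_lock:
            self.active -= 1


TABLE = FakeRoutingTable()


@synchronized("lock_gateway")  # hypothetical lock name
def initialize_gateway_device(dev):
    """Read-modify-write on shared state, run under the global lock."""
    TABLE.move_routes(dev)


def run_in_parallel(count):
    """Call initialize_gateway_device() from `count` threads at once."""
    threads = [
        threading.Thread(target=initialize_gateway_device, args=("br%d" % i,))
        for i in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return TABLE
```

Calling run_in_parallel(8) and then inspecting TABLE.max_active shows whether two callers ever overlapped. Removing the decorator lets the read-modify-write steps interleave, so some devices can be lost from TABLE.routes, which is the same kind of race the commit message describes.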

Changed in nova:
status: In Progress → Fix Committed
no longer affects: tempest
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: havana-3 → 2013.2