Nova compute guest still stuck in BUILD state after 400s

Bug #1181567 reported by Eric Harney on 2013-05-18
This bug affects 3 people
Affects                    Importance   Assigned to
OpenStack Compute (nova)   Undecided    Unassigned
tempest                    Undecided    Unassigned

Bug Description

Failure occurred on https://review.openstack.org/#/c/29591/2

http://logs.openstack.org/29591/2/gate/gate-tempest-devstack-vm-quantum/23189/console.html.gz

2013-05-17 22:54:02.079 | ======================================================================
2013-05-17 22:54:02.079 | ERROR: test suite for <class 'tempest.tests.compute.servers.test_create_server.ServersTestXML'>
2013-05-17 22:54:02.079 | ----------------------------------------------------------------------
2013-05-17 22:54:02.079 | Traceback (most recent call last):
2013-05-17 22:54:02.079 | File "/usr/lib/python2.7/dist-packages/nose/suite.py", line 208, in run
2013-05-17 22:54:02.080 | self.setUp()
2013-05-17 22:54:02.080 | File "/usr/lib/python2.7/dist-packages/nose/suite.py", line 291, in setUp
2013-05-17 22:54:02.080 | self.setupContext(ancestor)
2013-05-17 22:54:02.080 | File "/usr/lib/python2.7/dist-packages/nose/suite.py", line 314, in setupContext
2013-05-17 22:54:02.080 | try_run(context, names)
2013-05-17 22:54:02.080 | File "/usr/lib/python2.7/dist-packages/nose/util.py", line 478, in try_run
2013-05-17 22:54:02.080 | return func()
2013-05-17 22:54:02.080 | File "/opt/stack/new/tempest/tempest/tests/compute/servers/test_create_server.py", line 57, in setUpClass
2013-05-17 22:54:02.080 | cls.client.wait_for_server_status(cls.server_initial['id'], 'ACTIVE')
2013-05-17 22:54:02.081 | File "/opt/stack/new/tempest/tempest/services/compute/xml/servers_client.py", line 311, in wait_for_server_status
2013-05-17 22:54:02.081 | raise exceptions.TimeoutException(message)
2013-05-17 22:54:02.081 | TimeoutException: Request timed out
2013-05-17 22:54:02.081 | Details: Request timed out
2013-05-17 22:54:02.081 | Details: Server 87c1dc14-44b1-406f-a7a0-c41876dc9111 failed to reach ACTIVE status within the required time (400 s). Current status: BUILD.

Jeremy Stanley (fungi) wrote :

I didn't see any obvious open bugs this duplicates in tempest, but hopefully the QA crowd will have a little more insight on it.

affects: openstack-ci → tempest
Dolph Mathews (dolph) wrote :

Slightly different backtrace from http://logs.openstack.org/31489/1/gate/gate-tempest-devstack-vm-quantum/27053/console.html, but I believe it to be a similar timeout:

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/suite.py", line 208, in run
    self.setUp()
  File "/usr/lib/python2.7/dist-packages/nose/suite.py", line 291, in setUp
    self.setupContext(ancestor)
  File "/usr/lib/python2.7/dist-packages/nose/suite.py", line 314, in setupContext
    try_run(context, names)
  File "/usr/lib/python2.7/dist-packages/nose/util.py", line 478, in try_run
    return func()
  File "/opt/stack/new/tempest/tempest/api/compute/servers/test_server_addresses.py", line 31, in setUpClass
    resp, cls.server = cls.create_server(wait_until='ACTIVE')
  File "/opt/stack/new/tempest/tempest/api/compute/base.py", line 195, in create_server
    server['id'], kwargs['wait_until'])
  File "/opt/stack/new/tempest/tempest/services/compute/json/servers_client.py", line 165, in wait_for_server_status
    raise exceptions.BuildErrorException(server_id=server_id)
BuildErrorException: Server %(server_id)s failed to build and is in ERROR status
Details: Server 22b47ce3-f05a-42e9-94b2-d13159b226b0 failed to build and is in ERROR status
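For readers unfamiliar with the helper at the bottom of both tracebacks, here is a minimal sketch of a polling wait that produces these two outcomes. It is not tempest's actual implementation: the exception classes are stand-ins, and the `get_status`, `interval`, `clock` and `sleep` parameters are assumptions added so the sketch is self-contained (Python 3).

```python
import time


class TimeoutException(Exception):
    """Stand-in for tempest's exceptions.TimeoutException."""


class BuildErrorException(Exception):
    """Stand-in for tempest's exceptions.BuildErrorException."""


def wait_for_server_status(get_status, server_id, wanted,
                           timeout=400, interval=1,
                           clock=time.monotonic, sleep=time.sleep):
    """Poll get_status(server_id) until it returns `wanted`.

    Raises BuildErrorException as soon as the server reports ERROR,
    and TimeoutException if `timeout` seconds pass first.
    """
    start = clock()
    while True:
        status = get_status(server_id)
        if status == wanted:
            return status
        if status == 'ERROR':
            # Second traceback: the build failed outright.
            raise BuildErrorException(
                'Server %s failed to build and is in ERROR status'
                % server_id)
        if clock() - start >= timeout:
            # First traceback: the server never left BUILD.
            raise TimeoutException(
                'Server %s failed to reach %s status within the required '
                'time (%s s). Current status: %s.'
                % (server_id, wanted, timeout, status))
        sleep(interval)
```

Under this reading, a server that stays in BUILD for the whole window ends in the timeout branch, while one that flips to ERROR fails immediately, which is why the two reports show different exceptions.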

summary: - tempest: test_create_server timeout
+ tempest: test_create_server / wait_for_server_status timeout
Hans Lindgren (hanlind) wrote :

My previous comment referencing bug 1185834 only applied to comment #2.

AFAICT the originally reported bug is indeed unique. There is no KeyError reported in the compute log of that one and the state of the instance is BUILD, not ERROR as in the case of bug 1185834.

If one follows the failing spawn request (req-1cec748a-a4ec-42f6-a877-864605a1d688), it looks like it never finishes. There is no trace or anything else unusual, except that the request id never shows up in the log again after this final row:

2013-05-17 22:41:06.500 DEBUG nova.openstack.common.processutils [req-1cec748a-a4ec-42f6-a877-864605a1d688 demo demo] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf qemu-nbd -c /dev/nbd3 /opt/stack/data/nova/instances/87c1dc14-44b1-406f-a7a0-c41876dc9111/disk execute /opt/stack/new/nova/nova/openstack/common/processutils.py:142
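A rough sketch of how one might repeat this check on another run: scan a compute log for a request id and report the last line that mentions it. The function name and the log path in the usage note are hypothetical; only the request id format is taken from the log line above.

```python
def last_line_for_request(lines, request_id):
    """Return the last log line mentioning request_id, or None if absent."""
    last = None
    for line in lines:
        if request_id in line:
            last = line.rstrip('\n')
    return last
```

Usage, assuming the compute log has been downloaded to a local file (the path is a placeholder):

```python
with open('screen-n-cpu.txt') as log:
    print(last_line_for_request(
        log, 'req-1cec748a-a4ec-42f6-a877-864605a1d688'))
```

If the last line is a subprocess invocation with no matching result or follow-up, as here, that suggests the spawn stalled at that command.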

Attila Fazekas (afazekas) wrote :

Does any similar issue happen in a non-Neutron job?

Changed in tempest:
status: New → Incomplete
Mate Lakat (mate-lakat) wrote :

I think I hit this bug here:

    https://review.openstack.org/39360

Changed in nova:
status: New → Incomplete
status: Incomplete → Opinion
status: Opinion → New
tags: added: testing
Sean Dague (sdague) wrote :

Not actually a tempest bug; this is a race in Nova.

Changed in tempest:
status: Incomplete → Invalid
summary: - tempest: test_create_server / wait_for_server_status timeout
+ Nova compute guest still stuck in BUILD state after 400s