test_admin_delete_servers_of_others failure due to unexpected task state

Bug #1277027 reported by Steven Hardy
14
This bug affects 3 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Undecided
Unassigned
tempest
Invalid
High
Unassigned

Bug Description

I couldn't find an existing bug for this, apologies if it's a dupe, looks like a nova bug:

https://review.openstack.org/#/c/70717/
http://logs.openstack.org/17/70717/1/gate/gate-tempest-dsvm-full/9adaf90/console.html

2014-02-06 08:04:16.350 | 2014-02-06 07:48:25,729 Response Body: {"server": {"status": "ERROR", "os-access-ips:access_ip_v6": "", "updated": "2014-02-06T07:48:25Z", "os-access-ips:access_ip_v4": "", "addresses": {}, "links": [{"href": "http://127.0.0.1:8774/v3/servers/1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd", "rel": "self"}, {"href": "http://127.0.0.1:8774/servers/1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd", "rel": "bookmark"}], "os-extended-status:task_state": null, "key_name": null, "image": {"id": "7e649e07-3cea-4d95-90b2-2bbea7fce698", "links": [{"href": "http://23.253.79.233:9292/images/7e649e07-3cea-4d95-90b2-2bbea7fce698", "rel": "bookmark"}]}, "os-pci:pci_devices": [], "os-extended-availability-zone:availability_zone": "nova", "os-extended-status:power_state": 0, "os-config-drive:config_drive": "", "host_id": "10f0dc42e72572ed6d30e8dc32b41edc1d41a3dacda6571c5aeabe6e", "flavor": {"id": "42", "links": [{"href": "http://127.0.0.1:8774/flavors/42", "rel": "bookmark"}]}, "id": "1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd", "security_groups": [{"name": "default"}], "user_id": "f2262ed0a64c43359867456cfbccc153", "name": "ServersAdminV3Test-instance-1705049062", "created": "2014-02-06T07:48:21Z", "tenant_id": "8e932e11471a469e85a30195b2198f63", "os-extended-status:vm_state": "error", "os-server-usage:launched_at": null, "os-extended-volumes:volumes_attached": [], "os-server-usage:terminated_at": null, "os-extended-status:locked_by": null, "fault": {"message": "No valid host was found. ", "code": 500, "created": "2014-02-06T07:48:25Z"}, "metadata": {}}}
2014-02-06 08:04:16.350 | }}}
2014-02-06 08:04:16.350 |
2014-02-06 08:04:16.350 | Traceback (most recent call last):
2014-02-06 08:04:16.350 | File "tempest/api/compute/v3/admin/test_servers.py", line 85, in test_admin_delete_servers_of_others
2014-02-06 08:04:16.350 | self.servers_client.wait_for_server_termination(server['id'])
2014-02-06 08:04:16.351 | File "tempest/services/compute/v3/json/servers_client.py", line 179, in wait_for_server_termination
2014-02-06 08:04:16.351 | raise exceptions.BuildErrorException(server_id=server_id)
2014-02-06 08:04:16.351 | BuildErrorException: Server 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd failed to build and is in ERROR status
2014-02-06 08:04:16.351 |
2014-02-06 08:04:16.351 |
2014-02-06 08:04:16.351 | ======================================================================
2014-02-06 08:04:16.351 | FAIL: process-returncode
2014-02-06 08:04:16.351 | process-returncode
2014-02-06 08:04:16.351 | ----------------------------------------------------------------------
2014-02-06 08:04:16.352 | _StringException: Binary content:
2014-02-06 08:04:16.352 | traceback (test/plain; charset="utf8")
2014-02-06 08:04:16.352 |
2014-02-06 08:04:16.352 |
2014-02-06 08:04:16.352 | ----------------------------------------------------------------------
2014-02-06 08:04:16.352 | Ran 2101 tests in 2350.793s
2014-02-06 08:04:16.353 |
2014-02-06 08:04:16.353 | FAILED (failures=2, skipped=130)
2014-02-06 08:04:16.353 | ERROR: InvocationError: '/bin/bash tools/pretty_tox.sh (?!.*\\[.*\\bslow\\b.*\\])(^tempest\\.(api|scenario|thirdparty|cli)) --concurrency=2'
2014-02-06 08:04:16.354 | ___________________________________ summary ____________________________________
2014-02-06 08:04:16.354 | ERROR: full: commands failed
2014-02-06 08:04:16.463 | Checking logs...
2014-02-06 08:04:16.562 | Log File: n-net
2014-02-06 08:04:16.562 | 2014-02-06 07:34:40.598 30086 ERROR oslo.messaging._executors.base [-] Exception during message handling
2014-02-06 08:04:16.562 |
2014-02-06 08:04:16.563 | 2014-02-06 07:34:40.601 30086 ERROR oslo.messaging._drivers.common [-] Returning exception Instance fcfcbace-c4dd-4214-957f-b01e0b47fcf4 could not be found.
2014-02-06 08:04:16.563 |
2014-02-06 08:04:16.563 | 2014-02-06 07:34:40.601 30086 ERROR oslo.messaging._drivers.common [-] ['Traceback (most recent call last):\n', ' File "/opt/stack/new/oslo.messaging/oslo/messaging/_executors/base.py", line 36, in _dispatch\n incoming.reply(self.callback(incoming.ctxt, incoming.message))\n', ' File "/opt/stack/new/oslo.messaging/oslo/messaging/rpc/dispatcher.py", line 134, in __call__\n return self._dispatch(endpoint, method, ctxt, args)\n', ' File "/opt/stack/new/oslo.messaging/oslo/messaging/rpc/dispatcher.py", line 104, in _dispatch\n result = getattr(endpoint, method)(ctxt, **new_args)\n', ' File "/opt/stack/new/nova/nova/network/floating_ips.py", line 117, in allocate_for_instance\n **kwargs)\n', ' File "/opt/stack/new/nova/nova/network/manager.py", line 521, in allocate_for_instance\n host)\n', ' File "/opt/stack/new/oslo.messaging/oslo/messaging/rpc/server.py", line 153, in inner\n return func(*args, **kwargs)\n', ' File "/opt/stack/new/nova/nova/network/manager.py", line 579, in get_instance_nw_info\n instance_uuid, use_slave=use_slave)\n', ' File "/opt/stack/new/nova/nova/objects/base.py", line 110, in wrapper\n args, kwargs)\n', ' File "/opt/stack/new/nova/nova/conductor/rpcapi.py", line 470, in object_class_action\n objver=objver, args=args, kwargs=kwargs)\n', ' File "/opt/stack/new/oslo.messaging/oslo/messaging/rpc/client.py", line 150, in call\n wait_for_reply=True, timeout=timeout)\n', ' File "/opt/stack/new/oslo.messaging/oslo/messaging/transport.py", line 87, in _send\n timeout=timeout)\n', ' File "/opt/stack/new/oslo.messaging/oslo/messaging/_drivers/amqpdriver.py", line 390, in send\n return self._send(target, ctxt, message, wait_for_reply, timeout)\n', ' File "/opt/stack/new/oslo.messaging/oslo/messaging/_drivers/amqpdriver.py", line 383, in _send\n raise result\n', 'InstanceNotFound_Remote: Instance fcfcbace-c4dd-4214-957f-b01e0b47fcf4 could not be found.\nTraceback (most recent call last):\n\n File "/opt/stack/new/nova/nova/conductor/manager.py", line 572, in _object_dispatch\n return getattr(target, method)(context, *args, **kwargs)\n\n File "/opt/stack/new/nova/nova/objects/base.py", line 112, in wrapper\n result = fn(cls, context, *args, **kwargs)\n\n File "/opt/stack/new/nova/nova/objects/virtual_interface.py", line 98, in get_by_instance_uuid\n use_slave=use_slave)\n\n File "/opt/stack/new/nova/nova/db/api.py", line 594, in virtual_interface_get_by_instance\n use_slave=use_slave)\n\n File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 128, in wrapper\n return f(*args, **kwargs)\n\n File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 140, in wrapper\n instance_get_by_uuid(context, instance_uuid)\n\n File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 128, in wrapper\n return f(*args, **kwargs)\n\n File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 1701, in instance_get_by_uuid\n columns_to_join=columns_to_join, use_slave=use_slave)\n\n File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 1713, in _instance_get_by_uuid\n raise exception.InstanceNotFound(instance_id=uuid)\n\nInstanceNotFound: Instance fcfcbace-c4dd-4214-957f-b01e0b47fcf4 could not be found.\n\n']
2014-02-06 08:04:16.564 |
2014-02-06 08:04:16.957 | Log File: n-cond
2014-02-06 08:04:16.957 | 2014-02-06 07:35:54.076 30662 ERROR oslo.messaging._executors.base [-] Exception during message handling
2014-02-06 08:04:16.957 |
2014-02-06 08:04:16.958 | 2014-02-06 07:35:54.077 30662 ERROR oslo.messaging._drivers.common [-] Returning exception Unexpected task state: expecting [u'scheduling', None] but the actual state is deleting to caller
2014-02-06 08:04:16.958 |
2014-02-06 08:04:16.958 | 2014-02-06 07:35:54.077 30662 ERROR oslo.messaging._drivers.common [-] ['Traceback (most recent call last):\n', ' File "/opt/stack/new/oslo.messaging/oslo/messaging/_executors/base.py", line 36, in _dispatch\n incoming.reply(self.callback(incoming.ctxt, incoming.message))\n', ' File "/opt/stack/new/oslo.messaging/oslo/messaging/rpc/dispatcher.py", line 134, in __call__\n return self._dispatch(endpoint, method, ctxt, args)\n', ' File "/opt/stack/new/oslo.messaging/oslo/messaging/rpc/dispatcher.py", line 104, in _dispatch\n result = getattr(endpoint, method)(ctxt, **new_args)\n', ' File "/opt/stack/new/oslo.messaging/oslo/messaging/rpc/server.py", line 153, in inner\n return func(*args, **kwargs)\n', ' File "/opt/stack/new/nova/nova/conductor/manager.py", line 129, in instance_update\n context, instance_uuid, updates)\n', ' File "/opt/stack/new/nova/nova/db/api.py", line 740, in instance_update_and_get_original\n columns_to_join=columns_to_join)\n', ' File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 128, in wrapper\n return f(*args, **kwargs)\n', ' File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 2215, in instance_update_and_get_original\n columns_to_join=columns_to_join)\n', ' File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 2263, in _instance_update\n actual=actual_state, expected=expected)\n', "UnexpectedDeletingTaskStateError: Unexpected task state: expecting [u'scheduling', None] but the actual state is deleting\n"]
2014-02-06 08:04:16.958 |
2014-02-06 08:04:17.064 | 2014-02-06 07:40:36.034 30669 ERROR oslo.messaging._executors.base [-] Exception during message handling

Tags: testing
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/71709

Revision history for this message
Masayuki Igawa (igawa) wrote :

I think https://bugs.launchpad.net/tempest/+bug/1061167 is also related to this bug.

Revision history for this message
John Garbutt (johngarbutt) wrote :

Assigning to Igawa who submitted the patch.

Changed in nova:
importance: Undecided → High
status: New → Triaged
assignee: nobody → Masayuki Igawa (igawa)
Revision history for this message
John Garbutt (johngarbutt) wrote :

its a tempest issue I guess, but it affects our gate.

tags: added: gate
tags: added: testing
removed: gate
Revision history for this message
Matt Riedemann (mriedem) wrote :
Download full text (3.9 KiB)

The error in the scheduler log is here:

http://logs.openstack.org/17/70717/1/gate/gate-tempest-dsvm-full/9adaf90/logs/screen-n-sch.txt.gz#_2014-02-06_07_48_25_557

The error in the compute log is here:

http://logs.openstack.org/17/70717/1/gate/gate-tempest-dsvm-full/9adaf90/logs/screen-n-cpu.txt.gz#_2014-02-06_07_48_25_544

2014-02-06 07:48:25.544 ERROR nova.compute.manager [req-87a44fcf-fc65-4d3f-8c09-56553d895159 ServersAdminV3Test-929511318 ServersAdminV3Test-156703162] [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] Error: [Errno 2] No such file or directory: '/opt/stack/data/nova/instances/1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd/libvirt.xml'
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] Traceback (most recent call last):
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] File "/opt/stack/new/nova/nova/compute/manager.py", line 1059, in _build_instance
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] set_access_ip=set_access_ip)
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] File "/opt/stack/new/nova/nova/compute/manager.py", line 351, in decorated_function
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] return function(self, context, *args, **kwargs)
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] File "/opt/stack/new/nova/nova/compute/manager.py", line 1467, in _spawn
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] LOG.exception(_('Instance failed to spawn'), instance=instance)
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] File "/opt/stack/new/nova/nova/openstack/common/excutils.py", line 68, in __exit__
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] six.reraise(self.type_, self.value, self.tb)
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] File "/opt/stack/new/nova/nova/compute/manager.py", line 1464, in _spawn
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] block_device_info)
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 2209, in spawn
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] write_to_disk=True)
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 3306, in to_xml
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [instance: 1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd] libvirt_utils.write_to_file(xml_path, xml)
2014-02-06 07:48:25.544 30045 TRACE nova.compute.manager [inst...

Read more...

Changed in nova:
status: Triaged → Invalid
importance: High → Undecided
Matt Riedemann (mriedem)
Changed in tempest:
status: New → In Progress
assignee: nobody → Masayuki Igawa (igawa)
importance: Undecided → High
Changed in nova:
assignee: Masayuki Igawa (igawa) → nobody
Revision history for this message
Attila Fazekas (afazekas) wrote :

The server must be always delete-able (unless it is lock-ed).
One of the reason of removing the whitebox tests was, the servers become always delectable.

Changed in nova:
status: Invalid → New
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/74944

Revision history for this message
Ken'ichi Ohmichi (oomichi) wrote :

I investigated http://logs.openstack.org/17/70717/1/gate/gate-tempest-dsvm-full/9adaf90/logs/

The scenario is the following:

07:48:22.016 (nova-cpu) Starting instance...
07:48:22.245 (nova-api) receives "DELETE a server" request
07:48:24.599 (nova-cpu) Deleting instance files /opt/stack/data/nova/instances/1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd
07:48:24.602 (nova-cpu) Deletion of /opt/stack/data/nova/instances/1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd complete
07:48:24.630 (nova-cpu) Instance failed to spawn: IOError: [Errno 2] No such file or directory: '/opt/stack/data/nova/instances/1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd/libvirt.xml'
07:48:25.633 (nova-api) receives "GET a server" request
07:48:25.728 (nova-api) returns "ERROR" server status due to 07:48:24.630 server creation error.
07:48:25,729 (tempest) The test failed because of detecting "ERROR" server status.

So this problem is because both "create a server" and "delete a server" run at the same time.

Revision history for this message
Ken'ichi Ohmichi (oomichi) wrote :

This bug has been fixed already.

When this problem happens, the semaphore names of "create a server" and "delete a server" are different like the following:

* 07:48:22.012 (nova-cpu) Got semaphore "1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd" lock
* 07:48:22.013 (nova-cpu) Got semaphore / lock "do_run_instance"
* 07:48:22.016 (nova-cpu) Starting instance...
* 07:48:22.442 (nova-cpu) Got semaphore "1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd.delete" lock
* 07:48:22.442 (nova-cpu) Got semaphore / lock "do_terminate_instance"
* 07:48:24.599 (nova-cpu) Deleting instance files /opt/stack/data/nova/instances/1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd

The one of "create a server" is "1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd",
and the one of "delete a server" is "1cbd2e01-d2f3-45a0-baee-1d16df3dc7fd.delete".

So the difference made race condition and this problem happened.

https://review.openstack.org/#/c/71363/ has fixed this problem already.

Changed in nova:
status: New → Fix Committed
Changed in nova:
milestone: none → icehouse-3
Revision history for this message
Ken'ichi Ohmichi (oomichi) wrote :

This is not a tempest bug.

Changed in tempest:
assignee: Masayuki Igawa (igawa) → nobody
status: In Progress → Invalid
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: icehouse-3 → 2014.1
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.