spawn failed with "libvirtError: internal error: received hangup / error event on socket" in the gate

Bug #1451506 reported by Matt Riedemann
This bug affects 8 people
Affects: OpenStack Compute (nova)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

It looks like the connection to libvirt was temporarily dropped, which caused the spawn failure:

http://logs.openstack.org/80/170780/11/gate/gate-tempest-dsvm-postgres-full/b2e3fd4/logs/screen-n-cpu.txt.gz?level=TRACE#_2015-05-04_16_49_01_948

2015-05-04 16:49:01.948 ERROR nova.compute.manager [req-e9676551-d3ed-4f85-ab2d-34ce9e1b1446 TestServerAdvancedOps-1668477466 TestServerAdvancedOps-559926363] [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] Instance failed to spawn
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] Traceback (most recent call last):
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] File "/opt/stack/new/nova/nova/compute/manager.py", line 2475, in _build_resources
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] yield resources
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] File "/opt/stack/new/nova/nova/compute/manager.py", line 2347, in _build_and_run_instance
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] block_device_info=block_device_info)
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 2355, in spawn
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] block_device_info=block_device_info)
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 4393, in _create_domain_and_network
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] power_on=power_on)
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 4324, in _create_domain
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] LOG.error(err)
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] six.reraise(self.type_, self.value, self.tb)
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 4308, in _create_domain
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] domain = self._conn.defineXML(xml)
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 183, in doit
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] result = proxy_call(self._autowrap, f, *args, **kwargs)
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 141, in proxy_call
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] rv = execute(f, *args, **kwargs)
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 122, in execute
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] six.reraise(c, e, tb)
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 80, in tworker
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] rv = meth(*args, **kwargs)
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 3263, in defineXML
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] if ret is None:raise libvirtError('virDomainDefineXML() failed', conn=self)
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96] libvirtError: internal error: received hangup / error event on socket
2015-05-04 16:49:01.948 13039 TRACE nova.compute.manager [instance: 2d8249a5-df5a-4c8b-b41b-1c07f27ddb96]

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwibGlidmlydEVycm9yOiBpbnRlcm5hbCBlcnJvcjogcmVjZWl2ZWQgaGFuZ3VwIC8gZXJyb3IgZXZlbnQgb24gc29ja2V0XCIgQU5EIHRhZ3M6XCJzY3JlZW4tbi1jcHUudHh0XCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjE0MzA3NjA1MjQ5OTcsIm1vZGUiOiIiLCJhbmFseXplX2ZpZWxkIjoiIn0=

56 hits in 7 days, in both the check and gate queues, all failures.
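
For readers who cannot open the logstash link: the part of the URL after "#" appears to be base64-encoded JSON describing the search. The following is a minimal sketch for decoding it, assuming that encoding; the helper name decode_logstash_fragment is made up for illustration and is not part of any OpenStack tool.

```python
import base64
import json

# The logstash URL from the bug description, split only for readability.
LOGSTASH_URL = (
    "http://logstash.openstack.org/#"
    "eyJzZWFyY2giOiJtZXNzYWdlOlwibGlidmlydEVycm9yOiBpbnRlcm5hbCBlcnJvcjog"
    "cmVjZWl2ZWQgaGFuZ3VwIC8gZXJyb3IgZXZlbnQgb24gc29ja2V0XCIgQU5EIHRhZ3M6"
    "XCJzY3JlZW4tbi1jcHUudHh0XCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVm"
    "cmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2lu"
    "dGVydmFsIjowfSwic3RhbXAiOjE0MzA3NjA1MjQ5OTcsIm1vZGUiOiIiLCJhbmFseXpl"
    "X2ZpZWxkIjoiIn0="
)


def decode_logstash_fragment(url):
    """Return the query dict stored in the URL fragment (hypothetical helper)."""
    fragment = url.split("#", 1)[1]
    # Restore any base64 padding that may have been trimmed from the URL.
    fragment += "=" * (-len(fragment) % 4)
    return json.loads(base64.b64decode(fragment).decode("utf-8"))


query = decode_logstash_fragment(LOGSTASH_URL)
print(query["search"])
# The timeframe is stored as a string holding a number of seconds.
print(int(query["timeframe"]) // 86400, "day window")
```

The decoded "search" field is the message string from the traceback restricted to the screen-n-cpu.txt tag, and the timeframe corresponds to the 7-day window quoted above.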

Tags: libvirt
Matt Riedemann (mriedem)
Changed in nova:
status: New → Confirmed
Davanum Srinivas (DIMS) (dims-v) wrote :

Looks like the libvirt keepalive is timing out:

2015-05-04 16:49:01.912+0000: 5567: warning : virKeepAliveTimerInternal:140 : No response from client 0x7f896af0d200 after 5 keepalive messages in 30 seconds
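
Note that this warning is stamped 16:49:01.912, about 36 ms before the spawn failure at 16:49:01.948 in the traceback above. To look for the same correlation in other runs, something like the sketch below could pull the keepalive details out of such a line. It assumes the line format shown here (presumably from the libvirtd log); the regex and the function name parse_keepalive_timeout are illustrative, not taken from nova or libvirt.

```python
import re

# Pattern written against the single warning quoted in this bug; other
# libvirt versions may word the message differently.
KEEPALIVE_RE = re.compile(
    r"virKeepAliveTimerInternal:\d+ : No response from client "
    r"(?P<client>0x[0-9a-f]+) after (?P<count>\d+) keepalive messages "
    r"in (?P<seconds>\d+) seconds"
)


def parse_keepalive_timeout(line):
    """Return client, count and seconds for a keepalive timeout line, else None."""
    match = KEEPALIVE_RE.search(line)
    if match is None:
        return None
    return {
        "client": match.group("client"),
        "count": int(match.group("count")),
        "seconds": int(match.group("seconds")),
    }


SAMPLE = (
    "2015-05-04 16:49:01.912+0000: 5567: warning : "
    "virKeepAliveTimerInternal:140 : No response from client "
    "0x7f896af0d200 after 5 keepalive messages in 30 seconds"
)

print(parse_keepalive_timeout(SAMPLE))
```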

melanie witt (melwitt) wrote :

I found that this message in n-cpu.log can also appear as:

"2015-05-06 15:07:54.158 6767 TRACE oslo_messaging.rpc.dispatcher NovaException: Error from libvirt while looking up instance-00000029: [Error Code 1] internal error: received hangup / error event on socket"

as seen in another job [1]. I'm going to try updating the elastic-recheck (e-r) query.

[1] http://logs.openstack.org/03/166703/17/check/gate-tempest-dsvm-neutron-src-python-novaclient/302fd71/logs/screen-n-cpu.txt.gz?level=TRACE#_2015-05-06_15_07_54_158
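
Both variants share the same trailing libvirt error text, so a query keyed on that common substring would catch either one. A rough sketch of that idea in Python follows; the names HANGUP_MARKER and is_hangup_failure are made up for illustration, and this is not the actual elastic-recheck query.

```python
# Substring common to both log messages quoted in this bug.
HANGUP_MARKER = "internal error: received hangup / error event on socket"


def is_hangup_failure(log_line):
    """Return True if the line carries the libvirt socket hangup error."""
    return HANGUP_MARKER in log_line


# The two message variants reported so far.
SPAWN_VARIANT = (
    "libvirtError: internal error: received hangup / error event on socket"
)
LOOKUP_VARIANT = (
    "NovaException: Error from libvirt while looking up instance-00000029: "
    "[Error Code 1] internal error: received hangup / error event on socket"
)

for line in (SPAWN_VARIANT, LOOKUP_VARIANT):
    print(is_hangup_failure(line))
```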

melanie witt (melwitt) wrote :
John Garbutt (johngarbutt) wrote :

It doesn't look like we see this in the gate any more. Marking as Incomplete (so that it will eventually expire) because we don't have any repro steps for this right now.

Changed in nova:
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired