Instance creation fails with libvirtError: Unable to create tap device: Device or resource busy

Bug #1515768 reported by Sripriya
This bug affects 6 people
Affects                   Status        Importance  Assigned to      Milestone
OpenStack Compute (nova)  Fix Released  High        yong sheng gong
  Nominated for Liberty by Markus Zoeller (markus_z)
tacker                    Fix Released  Critical    yong sheng gong
nova (Ubuntu)             Fix Released  Undecided   Unassigned
  Xenial                  Fix Released  Undecided   Unassigned

Bug Description

Summary:
This issue is observed frequently on the Jenkins gate and is also reproducible in a local setup.

Steps:
Initiate 3 stack create requests at once in a script:

heat stack-create -f /home/stack/template_file stack1
heat stack-create -f /home/stack/template_file stack2
heat stack-create -f /home/stack/template_file stack3

using the following HOT file:
http://paste.openstack.org/show/479920/
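
For anyone trying to reproduce this locally, the "at once" part of the steps above could be scripted roughly as follows. This is only a sketch: the `heat stack-create` invocation and the template path are taken from the report, while the helper names `build_commands` and `launch_all` are made up for illustration, and it assumes the `heat` CLI is on the PATH with credentials already sourced.

```python
import subprocess

TEMPLATE = "/home/stack/template_file"  # template path as given in the report


def build_commands(template, count=3):
    """Return one `heat stack-create` argv per stack (stack1..stackN)."""
    return [
        ["heat", "stack-create", "-f", template, f"stack{i}"]
        for i in range(1, count + 1)
    ]


def launch_all(commands):
    """Start every command without waiting, then collect the exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]
```

Calling `launch_all(build_commands(TEMPLATE))` starts all three requests before waiting on any of them, which is closer to "at once" than running the commands one after another.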

One of the stack creations fails with CreateFailed: Resource Create Failed: Conflict: Resources. vdu3: Port Is Still In Use.

The nova logs show duplicate bridges created for one of the servers. libvirt then rejects the generated qemu XML with libvirtError: Unable to create tap device tapd3a3d9e9-5d: Device or resource busy. See timestamp 2015-11-25 23:03:14.940 in n-cpu.log.

Attaching the relevant n-cpu.log, q-svc.log and h-eng.log.
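
One way to confirm the duplication from a guest definition is to look for a tap device name that appears on more than one interface. The sketch below assumes the usual libvirt domain XML layout (`<devices>/<interface>/<target dev="...">`); the sample XML is a heavily trimmed, hypothetical stand-in, and only the tap name `tapd3a3d9e9-5d` comes from the log line quoted above.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_tap_devices(domain_xml):
    """Return target device names that appear on more than one <interface>."""
    root = ET.fromstring(domain_xml)
    names = [
        target.get("dev")
        for target in root.findall("./devices/interface/target")
        if target.get("dev")
    ]
    return sorted(name for name, n in Counter(names).items() if n > 1)


# Hypothetical, trimmed domain XML; "tap-hypothetical-2" is an invented name.
SAMPLE_XML = """
<domain type="kvm">
  <devices>
    <interface type="bridge">
      <target dev="tapd3a3d9e9-5d"/>
    </interface>
    <interface type="bridge">
      <target dev="tapd3a3d9e9-5d"/>
    </interface>
    <interface type="bridge">
      <target dev="tap-hypothetical-2"/>
    </interface>
  </devices>
</domain>
"""

print(duplicate_tap_devices(SAMPLE_XML))
```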

Observation:
The 1st network interface of the nova instance is a Neutron port resource defined in the HOT template.
Nova sends a PUT request to update that port. It also sends 2 POST requests to create the ports for the 2nd and 3rd network interfaces.
Neutron processes the PUT request and sends a network-changed event while nova is still waiting for the POST responses for the other 2 ports.
If the network-changed event is received before the response to the 3rd port's POST, the refresh_cache lock is acquired by the nova service.
Nova queries Neutron for the port information, updates the cache and releases the lock.
By then the POST requests have completed; nova acquires the cache lock again and requests the network info once more. The cache is therefore updated twice and ends up containing a duplicate set of ports.
Network VIFs are built for all 6 ports and the qemu XML is built from them.
libvirt rejects the duplicate bridges in the XML with "Device or resource busy".
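
The sequence above can be modelled in a few lines. This is a simplified illustration, not nova code and not the fix that was eventually merged: the lock, the function names and the port ids are all hypothetical. It shows that holding the lock correctly is not enough if each holder adds the ports Neutron reports instead of reconciling them by port id.

```python
import threading

cache_lock = threading.Lock()  # stands in for the per-instance cache lock


def refresh_appending(cache, ports_from_neutron):
    """Problematic pattern: every refresh adds whatever Neutron reports."""
    with cache_lock:
        cache.extend(ports_from_neutron)


def refresh_by_port_id(cache, ports_from_neutron):
    """Safer pattern: keep at most one entry per port id, preserving order."""
    with cache_lock:
        seen = set(cache)
        for port_id in ports_from_neutron:
            if port_id not in seen:
                cache.append(port_id)
                seen.add(port_id)


PORTS = ["port-1", "port-2", "port-3"]  # hypothetical port ids

buggy, deduped = [], []
for refresh, cache in ((refresh_appending, buggy), (refresh_by_port_id, deduped)):
    refresh(cache, PORTS)  # triggered by the network-changed event
    refresh(cache, PORTS)  # triggered after the POST responses return

print(len(buggy), len(deduped))
```

With the appending variant the cache ends up with 6 entries for 3 ports, matching the 6 VIFs described above; the keyed variant stays at 3.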

Version and environment:
Devstack Master

Sripriya (sseetha)
description: updated
Gary Kotton (garyk)
tags: added: gate-failure
Sripriya (sseetha)
description: updated
Changed in nova:
assignee: nobody → yong sheng gong (gongysh)
Changed in tacker:
assignee: nobody → yong sheng gong (gongysh)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/252824

Changed in nova:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tacker (master)

Fix proposed to branch: master
Review: https://review.openstack.org/252826

Changed in tacker:
status: New → In Progress
Armando Migliaccio (armando-migliaccio) wrote :

From the analysis above, I gather that not much can be done on the Neutron side, especially if the failure mode is caused by the specific use case mentioned.

no longer affects: neutron
Changed in tacker:
importance: Undecided → Critical
Alex Xu (xuhj)
Changed in nova:
importance: Undecided → High
Alex Xu (xuhj)
tags: added: network
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/258221

Rakesh H S (rh-s) wrote :

We are hitting this issue with Heat convergence.
(In convergence, Heat has multiple workers, so independent resources are created in parallel.)
When creating 8 nova servers, we consistently hit this issue in Mitaka.

Regards,
Rakesh

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/liberty)

Change abandoned by Matt Riedemann (<email address hidden>) on branch: stable/liberty
Review: https://review.openstack.org/258221
Reason: Restore/re-propose this when the change on master is approved.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by gongysh (gong.yongsheng@99cloud.net) on branch: master
Review: https://review.openstack.org/252824

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tacker (master)

Change abandoned by gongysh (gong.yongsheng@99cloud.net) on branch: master
Review: https://review.openstack.org/252826

Sripriya (sseetha) wrote :

Moving this bug to fix-released manually: it was fixed separately in the nova project by a parallel patch, https://review.openstack.org/#/c/252565/, which resolved the original Tacker issue.

Changed in tacker:
status: In Progress → Fix Committed
Markus Zoeller (markus_z) (mzoeller) wrote :

According to the abandoned review [1] (from the last Nova assignee,
yong sheng gong), this was fixed by patch [2]. Patch [2] didn't
mention this bug in its commit message, which is why this bug report
looks like it is still open. I'm closing it manually.

References:
[1] https://review.openstack.org/#/c/252824/
[2] https://review.openstack.org/#/c/252565/

Changed in nova:
status: In Progress → Fix Released
tags: added: liberty-backport-potential
Ante Karamatić (ivoks)
no longer affects: heat (Ubuntu)
Angus Salkeld (asalkeld)
no longer affects: heat
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nova (Ubuntu):
status: New → Confirmed
Corey Bryant (corey.bryant) wrote :

Also marking this as fix-released for Ubuntu since the fix is available in nova 13.0.0+.

Changed in nova (Ubuntu):
status: Confirmed → Fix Released
Changed in nova (Ubuntu Xenial):
status: New → Fix Released
Changed in tacker:
status: Fix Committed → Fix Released