nova libvirtError: Unable to add bridge brqxxx-xx port tapxxx-xx: Device or resource busy
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
neutron | | Medium | Andreas Scheuring | |
Kilo | | Undecided | Unassigned | |
Bug Description
Hello. My OpenStack version is 2013.1.5 (Grizzly), the plugin is linuxbridge, the OS is Ubuntu 12.04.3, and libvirt-bin is '1.1.1-0ubuntu8.9'.
When I launch three instances, two of them spawn successfully and one fails to spawn.
I checked the nova-compute log and found the following errors.
The line worth noting is:
"libvirtError: Unable to add bridge brq233a5889-2e port tap3f81c08a-39: Device or resource busy"
Has anybody else run into the same problem?
2014-04-24 14:41:58.499 ERROR nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
libvirtError: Unable to add bridge brq233a5889-2e port tap3f81c08a-39: Device or resource busy
2014-04-24 14:41:58.499 60306 TRACE nova.compute.
2014-04-24 14:41:58.680 AUDIT nova.compute.
description: | updated |
tags: | added: libvirt |
Changed in nova: | |
status: | New → Confirmed |
importance: | Undecided → Medium |
description: | updated |
Changed in nova: | |
assignee: | jiji (zzww-666) → nobody |
tags: | added: icehouse juno kilo |
Minho Ban (mhban) wrote : | #2 |
I've been hitting this error for a few weeks. I found that it is caused by libvirt racing with the Neutron agent (in my case the linuxbridge agent). As far as I know, this error happens with the OVS agent as well, not only with LB.
When Nova creates a VM, libvirt creates the port and adds it to the bridge. If Neutron (see below) succeeds in adding the interface to the bridge just before libvirt tries to, libvirt's attempt fails.
[neutron/
def add_tap_
"""Add tap interface.
If a VIF has been plugged into a network, this function will
add the corresponding tap device to the relevant bridge.
"""
-------
# Check if device needs to be added to bridge
if not tap_device_
data = {'tap_device_name': tap_device_name,
if utils.execute(
else:
data = {'tap_device_name': tap_device_name,
return True
libvirt then fails with 'EBUSY', returned from the brctl command, because the interface is already in that bridge.
I have no idea why Neutron keeps polling the status of bridge interfaces and tries to add an interface whenever it is not yet in the bridge.
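The race described above can be sketched as a toy simulation. This is only an illustration: `FakeBridge`, `agent_scan` and `libvirt_plug` are hypothetical names, not Neutron, Nova or libvirt code, and the model simply assumes that adding a device that is already a bridge port fails with EBUSY.

```python
import errno


class FakeBridge:
    """Toy stand-in for a Linux bridge (hypothetical, not a real API)."""

    def __init__(self, name):
        self.name = name
        self.ports = set()

    def add_port(self, tap):
        # Assumption: adding a device that is already a port fails with EBUSY.
        if tap in self.ports:
            raise OSError(errno.EBUSY, "Device or resource busy")
        self.ports.add(tap)


def agent_scan(bridge, tap):
    """Hypothetical agent loop step: plug the tap if it is not in the bridge."""
    if tap not in bridge.ports:
        bridge.add_port(tap)
        return True
    return False


def libvirt_plug(bridge, tap):
    """Hypothetical libvirt step: add the tap without checking first."""
    try:
        bridge.add_port(tap)
        return "ok"
    except OSError as exc:
        return "Unable to add bridge %s port %s: %s" % (
            bridge.name, tap, exc.strerror)


bridge = FakeBridge("brq233a5889-2e")
# The agent happens to scan first and plugs the tap device itself.
agent_won = agent_scan(bridge, "tap3f81c08a-39")
# libvirt then attempts the same operation and hits EBUSY.
result = libvirt_plug(bridge, "tap3f81c08a-39")
```

If the two calls run in the opposite order, `libvirt_plug` returns "ok" and `agent_scan` does nothing, which is why the failure only shows up intermittently.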
Minho Ban (mhban) wrote : | #3 |
Correction: there have been no reports of this failure with OVS. It seems this issue happens only with the LinuxBridge agent.
Ihar Hrachyshka (ihar-hrachyshka) wrote : | #4 |
I agree that the LB agent should not do nova's work, which covers everything related to tap + qbr plumbing. If the agent loop fires right in the middle of nova plugging the tap into qbr, it may introduce the race condition you describe.
Changed in neutron: | |
assignee: | nobody → Ihar Hrachyshka (ihar-hrachyshka) |
Changed in neutron: | |
status: | New → In Progress |
Sean M. Collins (scollins) wrote : | #6 |
Changed in neutron: | |
status: | In Progress → Fix Committed |
status: | Fix Committed → In Progress |
tags: | added: linuxbridge-gate-parity |
Changed in neutron: | |
importance: | Undecided → High |
tags: | removed: grizzly icehouse |
tags: | removed: libvirt |
Sean M. Collins (scollins) wrote : | #7 |
Changed in neutron: | |
assignee: | Ihar Hrachyshka (ihar-hrachyshka) → Sean M. Collins (scollins) |
Changed in neutron: | |
assignee: | Sean M. Collins (scollins) → Darragh O'Reilly (darragh-oreilly) |
Changed in neutron: | |
assignee: | Darragh O'Reilly (darragh-oreilly) → Sean M. Collins (scollins) |
The query should be updated:
build_name:
but I can't seem to get any occurrence as of today.
[1] is touted to be the one fixing the issue, but I can't seem to see it happening in the gate.
Changed in neutron: | |
importance: | High → Medium |
Is there a patch targeting Nova? If not, we should not target the project.
no longer affects: | nova |
Changed in neutron: | |
assignee: | Sean M. Collins (scollins) → Ihar Hrachyshka (ihar-hrachyshka) |
tags: |
added: usability removed: linuxbridge-gate-parity |
Ihar Hrachyshka (ihar-hrachyshka) wrote : | #11 |
How is it a usability issue?..
Changed in neutron: | |
assignee: | Ihar Hrachyshka (ihar-hrachyshka) → Andreas Scheuring (andreas-scheuring) |
Fix proposed to branch: master
Review: https:/
Change abandoned by Andreas Scheuring (<email address hidden>) on branch: master
Review: https:/
Reason: Will be merged into patchset https:/
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit f42ea67995537c7
Author: Andreas Scheuring <email address hidden>
Date: Wed Nov 11 14:03:08 2015 +0100
lb: avoid doing nova VIF work plumbing tap to qbr
neutron should rely on nova doing the job instead of trying to 'fix' it.
'Fixing' it introduces race conditions between lb agent and nova VIF
driver. Particularly, lb agent can scan for new tap devices in the
middle of nova plumbing qbr-tap setup, and attempt to do it on its own.
So if agent is more lucky to plug the tap device into the bridge, nova
may fail to do the same, getting the following error:
libvirtError: Unable to add bridge brqxxx-xx port tapxxx-xx: Device or
resource busy
This also requires a change in how the port admin_state_up is implemented
by setting the tap device's link state instead of moving it in or out
of the bridge.
Co-Authored-By: Sean M. Collins <email address hidden>
Co-Authored-By: Darragh O'Reilly <email address hidden>
Co-Authored-By: Andreas Scheuring <email address hidden>
Change-Id: I02971103407b4e
Closes-Bug: #1312016
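As a rough sketch of the last point in the commit message, admin_state_up can be mapped to the tap device's link state rather than to bridge membership. The helper name `admin_state_command` is hypothetical and this is not the code from the patch; it only builds the standard iproute2 `ip link set dev <device> up|down` command line without running it.

```python
def admin_state_command(tap_device_name, admin_state_up):
    """Build an `ip link` command that sets the tap device's link state.

    Hypothetical helper for illustration; it returns the argument list
    and does not execute anything.
    """
    state = "up" if admin_state_up else "down"
    return ["ip", "link", "set", "dev", tap_device_name, state]


# Disabling the port leaves the tap in the bridge and only takes the link down.
cmd_down = admin_state_command("tap3f81c08a-39", False)
cmd_up = admin_state_command("tap3f81c08a-39", True)
```

With this approach the agent never has to add the tap to the bridge or remove it, so it no longer competes with nova for that step.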
Changed in neutron: | |
status: | In Progress → Fix Committed |
Jesse Pretorius (jesse-pretorius) wrote : | #15 |
Can we expect to see this backported to Liberty?
Changed in neutron: | |
status: | Fix Committed → Fix Released |
@Jesse: Is there an actual need for this backport? If so, we could give it a try. This was not a totally breaking issue, and backporting it may carry some risk. So if there's no actual need, I would personally avoid it.
Ihar Hrachyshka (ihar-hrachyshka) wrote : | #17 |
If you think it's worth a backport, please provide rationale and set liberty-
tags: | removed: juno kilo nova |
This issue was fixed in the openstack/neutron 8.0.0.0b2 development milestone.
Fix proposed to branch: stable/liberty
Review: https:/
Fix proposed to branch: stable/kilo
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/kilo
commit a8b300ac6d489b9
Author: Andreas Scheuring <email address hidden>
Date: Wed Nov 11 14:03:08 2015 +0100
lb: avoid doing nova VIF work plumbing tap to qbr
neutron should rely on nova doing the job instead of trying to 'fix' it.
'Fixing' it introduces race conditions between lb agent and nova VIF
driver. Particularly, lb agent can scan for new tap devices in the
middle of nova plumbing qbr-tap setup, and attempt to do it on its own.
So if agent is more lucky to plug the tap device into the bridge, nova
may fail to do the same, getting the following error:
libvirtError: Unable to add bridge brqxxx-xx port tapxxx-xx: Device or
resource busy
This also requires a change in how the port admin_state_up is implemented
by setting the tap device's link state instead of moving it in or out
of the bridge.
Conflicts:
neutron/
neutron/
neutron/
Co-Authored-By: Sean M. Collins <email address hidden>
Co-Authored-By: Darragh O'Reilly <email address hidden>
Co-Authored-By: Andreas Scheuring <email address hidden>
Closes-Bug: #1312016
(cherry picked from commit f42ea67995537c7
(cherry picked from commit eb61b837f70906a
===
Also squashed the following follow up fix:
lb: Correct String formatting to get rid of logged ValueError
The following error is caused by a missing String formatting in the
linuxbridge agent:
"ValueError: unsupported format character 'a' (0x61) at index 90
Logged from file linuxbridge_
In addition a duplicated word in the log text has been fixed.
Change-Id: I587f1165fc7084
(cherry picked from commit 1f86d8687b2781f
===
Also squashed in:
Only ensure admin state on ports that exist
The linux bridge agent was calling ensure_port_admin state
unconditionally on ports in treat_devices_
This would cause it to throw an error on interfaces that
didn't exist so it would restart the entire processing loop.
If another port was being updated in the same loop before this
one, that port would experience a port status life-cycle of
DOWN-
This causes the bug below because the first active transition will
cause Nova to boot the VM. At this point tempest tests expect the
ports that belong to the VM to be in the ACTIVE state so it filters
Neutron port list calls with "status=ACTIVE". Therefore...
tags: | added: in-stable-kilo |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/liberty
commit 4cb90623193bd68
Author: Andreas Scheuring <email address hidden>
Date: Wed Nov 11 14:03:08 2015 +0100
lb: avoid doing nova VIF work plumbing tap to qbr
neutron should rely on nova doing the job instead of trying to 'fix' it.
'Fixing' it introduces race conditions between lb agent and nova VIF
driver. Particularly, lb agent can scan for new tap devices in the
middle of nova plumbing qbr-tap setup, and attempt to do it on its own.
So if agent is more lucky to plug the tap device into the bridge, nova
may fail to do the same, getting the following error:
libvirtError: Unable to add bridge brqxxx-xx port tapxxx-xx: Device or
resource busy
This also requires a change in how the port admin_state_up is implemented
by setting the tap device's link state instead of moving it in or out
of the bridge.
Conflicts:
neutron/
neutron/
neutron/
Co-Authored-By: Sean M. Collins <email address hidden>
Co-Authored-By: Darragh O'Reilly <email address hidden>
Co-Authored-By: Andreas Scheuring <email address hidden>
Closes-Bug: #1312016
(cherry picked from commit f42ea67995537c7
===
Also squashed in the following follow up fix:
lb: Correct String formatting to get rid of logged ValueError
The following error is caused by a missing String formatting in the
linuxbridge agent:
"ValueError: unsupported format character 'a' (0x61) at index 90
Logged from file linuxbridge_
In addition a duplicated word in the log text has been fixed.
Change-Id: I587f1165fc7084
(cherry picked from commit 1f86d8687b2781f
===
Also squashed in:
Only ensure admin state on ports that exist
The linux bridge agent was calling ensure_port_admin state
unconditionally on ports in treat_devices_
This would cause it to throw an error on interfaces that
didn't exist so it would restart the entire processing loop.
If another port was being updated in the same loop before this
one, that port would experience a port status life-cycle of
DOWN-
This causes the bug below because the first active transition will
cause Nova to boot the VM. At this point tempest tests expect the
ports that belong to the VM to be in the ACTIVE state so it filters
Neutron port list calls with "status=ACTIVE". Therefore tempest would
not get any ports back and assume there was some...
tags: | added: in-stable-liberty |
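The "missing String formatting" follow-up fix squashed into the commits above concerns a class of error that is easy to reproduce. The sketch below is an assumption-laden illustration: the template text and the `render` helper are hypothetical, and the actual log message from the agent is not shown in this report. It uses a `%(name)` placeholder with the trailing `s` omitted, so the following characters are parsed as the conversion type. (In Python 3, `%a` is itself a valid conversion, so the sketch triggers the error with a different character than the one in the original log.)

```python
data = {"tap_device_name": "tap0", "bridge_name": "brq0"}


def render(template, values):
    """Apply %-formatting and report a ValueError instead of raising it."""
    try:
        return template % values
    except ValueError as exc:
        return "ValueError: %s" % exc


# Broken: "%(tap_device_name)" lacks its conversion type, so the space is
# read as a flag and the "t" of "to" as an (unsupported) conversion.
broken = render("Adding %(tap_device_name) to bridge %(bridge_name)s", data)

# Fixed: every placeholder ends with an explicit conversion type.
fixed = render("Adding %(tap_device_name)s to bridge %(bridge_name)s", data)
```

Here `broken` holds the reported "unsupported format character" message, while `fixed` holds the fully formatted string.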
This issue was fixed in the openstack/neutron 2015.1.4 release.
This issue was fixed in the openstack/neutron 7.1.0 release.
This is giving me a headache. This bug seriously affects my ability to create instances.