OVN transaction could not be completed due to a race condition
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu Cloud Archive | Fix Released | Undecided | Unassigned |
Ussuri | Fix Released | High | Unassigned |
Victoria | Fix Released | High | Unassigned |
neutron | Fix Released | High | Unassigned |
neutron (Ubuntu) | Fix Released | Undecided | Unassigned |
Focal | Fix Released | High | Unassigned |
Bug Description
When executing the test "test_connectiv
a race condition:
networking_
Bugzilla reference: https:/
===== Ubuntu SRU Details =====
[Impact]
See bug description.
[Test Case]
Deploy openstack with OVN. Run the test_connectivi
[Where problems could occur]
The existing bug could still occur if the assumption behind the fix, that specifying the port type avoids it, is not correct. Presumably this is not the case, but it is a theoretical possibility for where problems could occur. All of these patches have already landed in the corresponding upstream branches.
Changed in neutron: | |
assignee: | nobody → Arnau Verdaguer (averdagu) |
description: | updated |
Changed in neutron: | |
status: | New → In Progress |
information type: | Public → Public Security |
information type: | Public Security → Public |
tags: | added: ovn |
Changed in neutron: | |
status: | New → In Progress |
Changed in neutron (Ubuntu): | |
status: | New → Fix Released |
Changed in neutron (Ubuntu Focal): | |
status: | New → Triaged |
importance: | Undecided → High |
Changed in cloud-archive: | |
status: | New → Fix Released |
description: | updated |
tags: | added: verification-done |
The race condition is triggered when the port (test_ap{1|2}_wan_port) is added to the router.
This happens because, while the OSP CLI command is being handled, a LogicalSwitchPortUpdateUpEvent
is received. That event triggers a new worker that operates on the same port as the worker handling
the OSP CLI command.
Another problem is that during the course of the request the port status flaps:
unknown -> UP -> DOWN -> UP
This keeps both workers modifying the same port for longer, increasing
the chances of hitting the race condition error.
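To picture why two workers touching the same port can make a transaction fail, here is a minimal toy model in Python. It is only a sketch: `ToyPortRow`, `ConflictError` and the version check are hypothetical and are not neutron, ovsdbapp or OVN code. It assumes that a commit is rejected when the row changed after the worker read it.

```python
class ConflictError(Exception):
    """Raised when a commit is based on a stale view of the row."""


class ToyPortRow:
    """Hypothetical stand-in for a port row; not an OVN or neutron class."""

    def __init__(self, name):
        self.name = name
        self.version = 0
        self.columns = {"up": None}

    def read(self):
        # A worker takes a snapshot: the version it saw plus a copy of the columns.
        return self.version, dict(self.columns)

    def commit(self, seen_version, changes):
        # Assumption: the commit only succeeds if nobody committed since the snapshot.
        if seen_version != self.version:
            raise ConflictError(
                f"{self.name}: row changed (saw v{seen_version}, now v{self.version})"
            )
        self.columns.update(changes)
        self.version += 1


def simulate_two_workers():
    port = ToyPortRow("test_ap1_wan_port")
    # Worker A handles the CLI request; worker B is started by the status event.
    version_a, _ = port.read()
    version_b, _ = port.read()
    port.commit(version_b, {"up": True})  # worker B commits first
    try:
        # Worker A now commits against a stale snapshot (illustrative change only).
        port.commit(version_a, {"type": "router"})
    except ConflictError as exc:
        return f"worker A failed: {exc}"
    return "worker A succeeded"


result = simulate_two_workers()
print(result)
```

In this toy model, the longer both workers hold snapshots of the same row, the more likely one of them commits against stale data, which is why the status flapping makes the error easier to hit.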
The flapping is triggered because, in the update_port function, the _nb_idl.set_lswitch_port
call does not include the port type in the command. When this command reaches the nbdb, the port
type is not "router", so the status is set to DOWN (and later to UP again).
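The flapping described above can be illustrated with a small toy simulation in Python. It is a sketch under stated assumptions, not the actual neutron or OVN logic: `apply_update` and `status_history` are hypothetical helpers, the `enabled` column is only illustrative, and the rule "the status stays UP only when the type is router" is a simplification of the behaviour described in this comment.

```python
def apply_update(row, columns):
    """Apply a column update to a toy port row and recompute its status."""
    updated = {**row, **columns}
    # Toy rule (assumption): the port is UP only if its type is "router".
    updated["up"] = updated["type"] == "router"
    return updated


def status_history(first_update):
    """Return the sequence of statuses seen while attaching the port."""
    row = {"type": "", "up": None}            # status unknown
    history = [row["up"]]
    row = {**row, "up": True}                 # port reported UP
    history.append(row["up"])
    row = apply_update(row, first_update)     # update sent by the toy update_port
    history.append(row["up"])
    row = apply_update(row, {"type": "router"})  # type eventually set
    history.append(row["up"])
    return history


# Update that omits the port type: the status drops to DOWN before recovering.
without_type = status_history({"enabled": True})
# Update that carries the port type: the status never leaves UP.
with_type = status_history({"enabled": True, "type": "router"})

print(without_type)  # [None, True, False, True]
print(with_type)     # [None, True, True, True]
```

Under these assumptions, including the type in the same update removes the intermediate DOWN state, which is the window in which the second worker gets triggered.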