openvswitch agent eating CPU, time spent in ip_conntrack.py
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu Cloud Archive | Fix Released | High | Unassigned |
Queens | Fix Released | High | Unassigned |
neutron | Fix Released | High | Brian Haley |
neutron (Ubuntu) | Fix Released | High | Unassigned |
Bionic | Fix Released | High | Unassigned |
Cosmic | Fix Released | High | Unassigned |
Bug Description
We just ran into a case where the openvswitch agent (local dev devstack, current master branch) eats 100% of CPU time.
Pyflame profiling shows the time being largely spent in neutron.
https:/
The code around this line is:
while True:
The documentation of eventlet.spawn_n says: "The same as spawn(), but it’s not possible to know how the function terminated (i.e. no return value or exceptions). This makes execution faster. See spawn_n for more details." I suspect that GreenPool.spawn_n may behave similarly.
It seems plausible that spawn_n is returning very quickly because of some error, so that all the time is then spent spinning in a short-circuited while loop.
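To make the suspected failure mode concrete, here is a minimal stdlib-only sketch. It is not the neutron code and does not use eventlet: `fire_and_forget`, `failing_worker`, `spinning_loop` and `blocking_loop` are hypothetical names, and the bounded `for` loop stands in for the `while True:` so the example terminates. It only illustrates the assumption that a spawn call which hides errors lets the enclosing loop run without ever blocking.

```python
import queue


def fire_and_forget(func, *args):
    """Hypothetical stand-in for a spawn_n-style call.

    Runs func and discards both its return value and any exception,
    so the caller cannot tell that the worker failed.
    """
    try:
        func(*args)
    except Exception:
        pass  # the failure is invisible to the caller


def failing_worker(work_queue):
    """Hypothetical worker that dies before it ever waits on the queue."""
    raise RuntimeError("worker failed immediately")


def spinning_loop(work_queue, max_iterations):
    """Loop with nothing in it that blocks: every pass returns at once."""
    passes = 0
    for _ in range(max_iterations):  # stands in for `while True:`
        fire_and_forget(failing_worker, work_queue)
        passes += 1
    return passes


def blocking_loop(work_queue, max_iterations, timeout=0.01):
    """Loop that waits on the queue, so an empty queue costs no CPU spin."""
    handled = 0
    for _ in range(max_iterations):
        try:
            work_queue.get(timeout=timeout)
        except queue.Empty:
            break  # nothing to do: stop instead of spinning
        handled += 1
    return handled
```

With an empty queue, `spinning_loop` still completes every pass it is allowed, which is the pattern that would show up as 100% CPU in an unbounded loop, while `blocking_loop` does no work at all.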
SRU details for Ubuntu:
-------
[Impact]
We're cherry-picking a single bug-fix patch here from the upstream stable/queens branch as there is not currently an upstream stable point release available that includes this fix. We'd like to make sure all of our supported customers have access to this fix as there is a significant performance hit without it.
[Test Case]
The following SRU process was followed:
https:/
In order to avoid regression of existing consumers, the OpenStack team will run their continuous integration test against the packages that are in -proposed. A successful run of all available tests will be required before the proposed packages can be let into -updates.
The OpenStack team will be in charge of attaching the output summary of the executed tests. The OpenStack team members will not mark ‘verification-done’ until this has happened.
[Regression Potential]
In order to mitigate the regression potential, the results of the aforementioned tests are attached to this bug.
description: | updated |
Changed in neutron (Ubuntu): | |
status: | New → Triaged |
importance: | Undecided → High |
Changed in neutron (Ubuntu Bionic): | |
importance: | Undecided → High |
status: | New → Triaged |
tags: | added: neutron-proactive-backport-potential |
tags: | removed: neutron-proactive-backport-potential |
Changed in cloud-archive: | |
status: | Fix Committed → Fix Released |
The bug is in the code introduced by https://review.openstack.org/#/c/537654 .
I believe the importance of this bug should perhaps be raised.