Idle rpc traffic with a large number of instances causes failures
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu Cloud Archive | Fix Released | Undecided | Unassigned |
neutron | Fix Released | Undecided | Unassigned |
Bug Description
OpenStack Juno (Neutron ML2+OVS/
500 compute node cloud, running 4.5k active instances (can't get it any further right now).
As the number of instances in the cloud increases, the idle load on the neutron-server hosts (4 of them, each with 4 cores/8 threads and a suitable *_worker configuration) increases from nothing to 30. The DB call get_port_and_sgs is being serviced around 10 times per second on each server at this point. Other things are also happening; I've attached the last 1000 lines of the server log with debug enabled.
The result is that it's no longer possible to create new instances, as the RPC calls and API threads just don't get onto the CPU, resulting in VIF plugging timeouts on compute nodes and ERROR'ed instances.
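For anyone wanting to reproduce the per-server get_port_and_sgs rate quoted above, a rough sketch of how it could be estimated from a debug log follows. It assumes that each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and that the relevant debug lines contain the string `get_port_and_sgs`; both are assumptions about the log, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Assumption: each log line begins with a timestamp such as
# "2014-10-23 10:22:14.123"; only the whole-second part is captured.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def calls_per_second(lines, needle="get_port_and_sgs"):
    """Count lines containing `needle`, grouped by whole-second timestamp."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:  # lines without a leading timestamp are skipped
            counts[match.group(1)] += 1
    return counts


def mean_rate(counts):
    """Average matching lines per second, over seconds with at least one match."""
    return sum(counts.values()) / len(counts) if counts else 0.0
```

Passing an open log file to `calls_per_second` and the result to `mean_rate` gives an approximate calls-per-second figure. Seconds with no matching lines are not counted, so this overestimates the rate when the calls are bursty.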
ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: neutron-common 1:2014.
ProcVersionSign
Uname: Linux 3.13.0-35-generic x86_64
ApportVersion: 2.14.1-0ubuntu3.5
Architecture: amd64
CrashDB:
{
}
Date: Thu Oct 23 10:22:14 2014
PackageArchitec
SourcePackage: neutron
UpgradeStatus: No upgrade log present (probably fresh install)
modified.
Changed in neutron:
status: New → Confirmed
Changed in neutron:
status: Incomplete → Fix Released
Some other details: l2population driver and neutron security groups are enabled.