Comment 4 for bug 1500615

Nikola Đipanov (ndipanov) wrote :

After reading the IRC logs from when the bug was reported, I noticed Matt's observation that the time the resource semaphore was held kept increasing until it dropped - this can indeed be caused by having too few conductor workers. COMPUTE_RESOURCE_SEMAPHORE is process-wide in nova-compute, and a non-trivial amount of DB querying goes on while it is held. A storm of requests means even more traffic going to the conductor, which in turn slows down the ongoing requests that hold the lock, since they compete for the same pool of conductor workers to run their DB queries.
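
To make the reasoning above concrete, here is a rough back-of-the-envelope model, not a measurement and not nova code. It assumes every DB query made while the lock is held has to wait behind whatever calls are already queued at the conductor, and that the conductor serves them in "waves" of one call per worker. The function name, its parameters and the numbers are all made up for illustration.

```python
import math


def estimated_hold_time_ms(queries_per_hold, db_call_ms,
                           competing_calls, conductor_workers):
    """Estimate how long the lock is held, in milliseconds.

    Hypothetical model: each query issued while holding the lock queues
    behind `competing_calls` other conductor calls, and the conductor
    processes the queue `conductor_workers` calls at a time.
    """
    # Number of "waves" the conductor needs before our call completes.
    waves = math.ceil((competing_calls + 1) / conductor_workers)
    return queries_per_hold * db_call_ms * waves


if __name__ == "__main__":
    # Illustrative numbers only: 5 queries per hold, 20 ms per DB call.
    for workers in (2, 8):
        for competing in (0, 10, 50):
            hold = estimated_hold_time_ms(5, 20, competing, workers)
            print(f"workers={workers} competing={competing} hold={hold} ms")
```

In this toy model the hold time grows with the number of competing conductor calls and shrinks as workers are added, which matches the pattern Matt saw. The real behaviour depends on things the model ignores (RPC overhead, DB contention, greenthread scheduling), so it is only a sanity check on the argument.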

It is hard to tell without actually profiling the code (and probably adding even more probes, such as the number of threads and DB requests per conductor worker), but too few conductor workers is definitely a good guess as to where the problem might be, so increasing their number is worth trying.