OpenStack Neutron API Charm

Bug #1999671
Comment #9

Bas de Bruijne (basdbruijne) wrote on 2023-08-17: Re: [jammy][yoga] Tempest times out reaching neutron-api

I did some more investigation on our systems. It turns out that while memory does run low, the actual bottleneck is the CPU. I noticed a spike to 100% CPU usage every 30 minutes, and htop showed that it is caused by the landscape-package-reporter process.

The landscape-client charm reports the packages on all 13 machines (12 lxd machines + the baremetal machine itself) at the same time every 30 minutes, which results in a CPU overload that lasts about 2 minutes. During that window the neutron server responds too slowly to requests, which causes tempest to fail.

Sure enough, after removing the landscape-client charm, all tempest tests passed without errors. I also realize now that we only started seeing this bug frequently after we added landscape-client to all our OpenStack SKUs.

I'm moving this bug to the landscape client. I think it would be very helpful if the various checks that landscape runs were staggered across the different machines, rather than all running at the same time. I did find an option in the landscape UI to stagger the package updates, but I couldn't find a similar option for the package-reporter.
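
To illustrate the kind of staggering meant here, the following is a minimal sketch that gives each machine a stable offset inside the 30-minute interval, derived from a hash of its name. This is not how landscape-client schedules its work, and it is not an existing option; the function names, the machine names, and the choice of hashing the machine name are all assumptions made for the example.

```python
import hashlib


def stagger_offset(machine_id: str, interval_s: int = 1800) -> int:
    """Return a stable per-machine delay in the range [0, interval_s).

    Hypothetical helper: hashing the machine name spreads machines
    roughly evenly over the interval without any coordination.
    """
    digest = hashlib.sha256(machine_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % interval_s


def next_run(now_s: int, machine_id: str, interval_s: int = 1800) -> int:
    """Return the first scheduled run time (in seconds) that is >= now_s."""
    offset = stagger_offset(machine_id, interval_s)
    cycles, phase = divmod(now_s, interval_s)
    if phase > offset:
        # This cycle's slot has already passed; use the next cycle.
        cycles += 1
    return cycles * interval_s + offset


if __name__ == "__main__":
    # Hypothetical machine names: one baremetal host plus 12 lxd containers.
    machines = ["host-0"] + [f"host-0-lxd-{i}" for i in range(12)]
    for name in machines:
        print(name, stagger_offset(name))
```

With a scheme like this, each machine would still report every 30 minutes, but the 13 reporters would start at different points in the interval instead of all at once, so their CPU load is less likely to overlap.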