nova-compute service running IronicDriver may leak memory

Bug #1949051 reported by Simon Li
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Description
===========
We run the nova-compute service with IronicDriver in a Kubernetes cluster as a StatefulSet pod with a 1 GiB memory limit. It is the only service in the pod.
There are about 40 nodes in our test environment. Most of them have instances and are in the active provision state.
Some nodes fail to connect over IPMI, so their power status cannot be obtained.
After about 12 hours, the memory limit is exceeded and the pod is restarted.

Steps to reproduce
==================
No action is needed; the growth happens while the service is idle.
Note the following:
1. The more nodes there are, the faster the memory grows and the sooner the limit is exceeded.
2. Even with only one node, the memory limit is eventually exceeded, but it takes much longer.
3. In our environment, memory grows roughly every 10 minutes, so we suspect a periodic task, possibly `_sync_power_states`.
4. I am not sure whether the IPMI connection failures have any impact.
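
The roughly 10-minute growth pattern noted in item 3 could be checked by sampling the resident memory of the nova-compute process at a fixed interval and looking at the per-interval deltas. The following is a minimal sketch, assuming a Linux pod where `/proc/<pid>/status` is readable. All helper names are hypothetical and not part of nova.

```python
import time
from pathlib import Path


def parse_vm_rss_kib(status_text):
    """Return the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like: "VmRSS:      123456 kB"
            return int(line.split()[1])
    raise ValueError("VmRSS not found")


def read_rss_kib(pid):
    """Read the current resident set size of a process (Linux only)."""
    return parse_vm_rss_kib(Path(f"/proc/{pid}/status").read_text())


def growth_between_samples(samples):
    """Given [(timestamp, rss_kib), ...], return the RSS delta per interval."""
    return [later[1] - earlier[1] for earlier, later in zip(samples, samples[1:])]


def sample_rss(pid, interval_s=60, count=30):
    """Collect `count` RSS samples, sleeping `interval_s` seconds after each."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), read_rss_kib(pid)))
        time.sleep(interval_s)
    return samples
```

If the deltas returned by `growth_between_samples` are near zero most of the time and jump about every ten samples at a 60-second interval, that would be consistent with a periodic task being involved, although it would not identify which one.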

Expected result
===============
The pod's memory usage should remain stable when we are not performing operations on nodes or instances.

Actual result
=============
Memory keeps increasing until the limit is exceeded and the pod is restarted.

Environment
===========
OpenStack version
   - nova: 22.0.1
   - ironic: 16.0.1

Logs & Configs
==============

Tags: ironic
Sylvain Bauza (sylvain-bauza) wrote :

Which exact service is consuming the memory, as far as you can see: nova-compute or one of the ironic services?

Do you have any memory usage tracking?

Please mark the bug as New once you reply so we can see your answer.

Changed in nova:
status: New → Incomplete
tags: added: ironic
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired