nova-api uses too much memory after long uptime

Bug #1427688 reported by Roman Podoliaka
This bug affects 1 person
Affects             Status     Importance  Assigned to  Milestone
Mirantis OpenStack  Invalid    Medium      MOS Nova
  6.0.x             Won't Fix  Medium      MOS Nova
  6.1.x             Won't Fix  Medium      MOS Nova
  7.0.x             Won't Fix  Medium      MOS Nova
  8.0.x             Invalid    Medium      MOS Nova

Bug Description

From MOS Infra CI

On the controller node after 34 days of uptime:

top - 11:43:45 up 34 days, 21:41, 2 users, load average: 2.38, 3.56, 3.72
Tasks: 845 total, 2 running, 843 sleeping, 0 stopped, 0 zombie
Cpu(s): 10.6%us, 0.7%sy, 0.0%ni, 88.4%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 65940016k total, 64784240k used, 1155776k free, 145544k buffers
Swap: 31999996k total, 17341100k used, 14658896k free, 8080880k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13160 nova 20 0 3172m 2.8g 2676 S 2 4.5 993:39.88 nova-api
13166 nova 20 0 2966m 2.7g 2672 S 2 4.3 979:59.12 nova-api
13169 nova 20 0 3012m 2.7g 2672 S 2 4.2 983:36.91 nova-api
13184 nova 20 0 2898m 2.7g 2668 S 2 4.2 984:39.07 nova-api
13159 nova 20 0 2793m 2.4g 2664 S 2 3.8 983:05.25 nova-api
13168 nova 20 0 2842m 1.9g 2664 S 2 3.0 981:30.06 nova-api
13177 nova 20 0 2191m 1.4g 2664 S 2 2.2 978:50.91 nova-api
13153 nova 20 0 2221m 1.4g 2728 S 2 2.2 980:20.91 nova-api
13157 nova 20 0 1388m 1.2g 2668 S 2 1.9 1002:05 nova-api
13145 nova 20 0 2214m 1.1g 2668 S 2 1.8 979:10.51 nova-api
25936 rabbitmq 20 0 4931m 1.1g 2080 S 2 1.8 3064:59 beam.smp
13165 nova 20 0 1996m 1.0g 2664 S 2 1.6 982:27.46 nova-api
13156 nova 20 0 1648m 1.0g 2724 S 2 1.6 987:07.03 nova-api
13167 nova 20 0 1645m 1.0g 2664 S 2 1.6 991:49.58 nova-api
13148 nova 20 0 1363m 1.0g 2668 S 2 1.6 981:30.84 nova-api
13164 nova 20 0 1678m 943m 2664 S 3 1.5 1003:21 nova-api
13174 nova 20 0 1647m 827m 2664 S 2 1.3 955:25.43 nova-api
13793 cinder 20 0 1905m 771m 4352 S 3 1.2 1240:34 cinder-volume
24564 mysql 20 0 26.8g 763m 4196 S 2 1.2 844:29.72 mysqld
13152 nova 20 0 1756m 726m 2664 S 2 1.1 984:03.76 nova-api
 5189 root 20 0 902m 694m 3648 S 0 1.1 58:19.69 ceph-mon
13155 nova 20 0 1423m 658m 2724 S 2 1.0 985:27.61 nova-api
13158 nova 20 0 871m 652m 2664 S 2 1.0 975:48.03 nova-api
13171 nova 20 0 1879m 566m 2668 S 2 0.9 981:59.60 nova-api
13149 nova 20 0 1701m 540m 2668 S 2 0.8 927:43.85 nova-api
13170 nova 20 0 675m 464m 2668 S 2 0.7 989:38.72 nova-api
13146 nova 20 0 797m 463m 2668 S 2 0.7 978:44.29 nova-api
13140 nova 20 0 671m 459m 2668 S 2 0.7 982:09.39 nova-api
13144 nova 20 0 873m 393m 2672 S 2 0.6 995:36.77 nova-api
13179 nova 20 0 1600m 331m 2664 S 2 0.5 977:21.92 nova-api
13143 nova 20 0 873m 282m 2724 S 2 0.4 993:28.76 nova-api
 3969 root 10 -10 2256m 276m 6420 S 1 0.4 534:46.48 ovs-vswitchd

During this time nova-api has been actively used for running/deleting VMs (CI jobs).
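To put a number on the listing above, the RES column can be totalled per command. The following is a minimal sketch, assuming top's default column layout and that bare RES values are in KiB; the helper names `res_to_kib` and `resident_by_command` are made up for illustration, and only three lines of the listing are used as sample input.

```python
from collections import defaultdict

# Multipliers to KiB; assumption: top prints bare RES numbers in KiB.
_UNITS = {"k": 1, "m": 1024, "g": 1024 ** 2, "t": 1024 ** 3}


def res_to_kib(field):
    """Convert a top RES field such as '2.8g', '943m' or '2676' to KiB."""
    suffix = field[-1].lower()
    if suffix in _UNITS:
        return float(field[:-1]) * _UNITS[suffix]
    return float(field)


def resident_by_command(top_lines):
    """Sum RES per COMMAND over the process lines of a top listing."""
    totals = defaultdict(float)
    for line in top_lines:
        cols = line.split()
        # Layout: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        # Skip the header, summary lines and anything too short.
        if len(cols) < 12 or not cols[0].isdigit():
            continue
        totals[cols[11]] += res_to_kib(cols[5])
    return dict(totals)


# Three lines copied from the listing above, used as sample input.
sample = [
    "13160 nova 20 0 3172m 2.8g 2676 S 2 4.5 993:39.88 nova-api",
    "13164 nova 20 0 1678m 943m 2664 S 3 1.5 1003:21 nova-api",
    "25936 rabbitmq 20 0 4931m 1.1g 2080 S 2 1.8 3064:59 beam.smp",
]

for command, kib in sorted(resident_by_command(sample).items()):
    print(f"{command}: {kib / 1024 ** 2:.2f} GiB")
```

Feeding the full listing through `resident_by_command` gives the combined resident size of all nova-api workers. Since top rounds RES to one decimal place, the total is only approximate.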

description: updated
Marian Horban (mhorban)
Changed in mos:
assignee: MOS Nova (mos-nova) → Marian Horban (mhorban)
Changed in mos:
status: Confirmed → In Progress
Marian Horban (mhorban) wrote :

Script for reproducing the issue.

Marian Horban (mhorban) wrote :

Tried to reproduce the issue with the attached scripts, without luck.

Marian Horban (mhorban) wrote :
Roman Podoliaka (rpodolyaka) wrote :

Unfortunately, we haven't managed to reproduce this locally yet.

The problem is that we need a patched environment with introspection tools (like objgraph) in place, which means we can't do this on a production environment (like MOS Infra).

At the very least, we need a repro on Ubuntu 14.04, where CPython is built with debug symbols (-g, rather than -g0 as on precise), so that we can use gdb/pyringe to inject code into a running nova-api process.

That being said, this must be a rare case and is not a blocker for cutting the 6.1 release.
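For illustration, the kind of introspection meant above boils down to counting live objects by type at two points in time and seeing which types grew. The following is a minimal sketch using only the standard library `gc` module, as a stand-in for the richer reports a tool like objgraph gives. The `Leaky` class and the `_cache` list are hypothetical and only simulate a leak; nothing here is taken from nova-api.

```python
import gc
from collections import Counter


def type_counts():
    """Count live, GC-tracked objects by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, limit=10):
    """Return (type name, delta) pairs for types whose count increased."""
    deltas = {name: after[name] - before.get(name, 0) for name in after}
    grown = [(name, delta) for name, delta in deltas.items() if delta > 0]
    return sorted(grown, key=lambda item: item[1], reverse=True)[:limit]


class Leaky:
    """Hypothetical stand-in for whatever object is being leaked."""


_cache = []  # module-level reference that keeps the instances alive

before = type_counts()
_cache.extend(Leaky() for _ in range(1000))  # simulate the leak
after = type_counts()

for name, delta in growth(before, after):
    print(f"{name:20} +{delta}")
```

In a real investigation the two snapshots would be taken inside the running service, some time apart, which is why a patched or injectable process is needed.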

tags: added: release-notes
Changed in mos:
milestone: 6.1 → 7.0
tags: added: release-notes-done
removed: release-notes
Roman Podoliaka (rpodolyaka) wrote :

We haven't seen this for a while, so this should have a lower priority, IMO.

Roman Podoliaka (rpodolyaka) wrote :

We are no longer fixing Medium bugs in 7.0, so moving this to 8.0.

Changed in mos:
status: Confirmed → Won't Fix
Roman Podoliaka (rpodolyaka) wrote :

As I mentioned in #5, we haven't seen this for a while now. I suggest we move it to Incomplete and close it if it isn't reproduced again in 8.0.

Roman Podoliaka (rpodolyaka) wrote :

Haven't seen a repro on 8.0.
