Quantum test suite leaks memory like a sieve

Bug #1065276 reported by Kevin L. Mitchell on 2012-10-10
This bug affects 3 people
Affects            Importance   Assigned to
neutron            High         Iryoung Jeong
Folsom             High         Gary Kotton
quantum (Ubuntu)   Undecided    Unassigned
Quantal            Undecided    Unassigned

Bug Description

I have at least one test (and possibly others; can't tell for sure) which is failing because the Quantum test suite seems to leak memory like a sieve. The machine I'm testing Quantum on has 1G physical and 1G swap, and the exact failure is a "Cannot allocate memory" exception from os.fork(). I ran the test suite and kept an eye on it with top, and within a few minutes, all physical memory was in use by the Quantum test suite, with roughly half of my swap also consumed.
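Watching top works, but the same growth can be logged from inside the test process. The following is a minimal sketch, not part of the Quantum tree: it assumes a Linux host (it reads /proc/self/statm), and both `current_rss_bytes` and `RssLoggingTestCase` are hypothetical names introduced only for illustration.

```python
import os
import unittest


def current_rss_bytes():
    """Return this process's resident set size in bytes (Linux only)."""
    with open("/proc/self/statm") as statm:
        # The second field of statm is the number of resident pages.
        resident_pages = int(statm.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


class RssLoggingTestCase(unittest.TestCase):
    """Hypothetical base class that prints the RSS change of each test."""

    def setUp(self):
        super(RssLoggingTestCase, self).setUp()
        self._rss_before = current_rss_bytes()

    def tearDown(self):
        delta = current_rss_bytes() - self._rss_before
        print("%s: RSS changed by %d KiB" % (self.id(), delta // 1024))
        super(RssLoggingTestCase, self).tearDown()
```

Tests that inherit from such a base class would print one line each, which makes it easier to see whether memory climbs steadily or jumps in particular test modules.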

dan wendlandt (danwent) wrote :

Hi Kevin,

some team members have already been investigating mem leaks in the unit tests and think they are due to sqlalchemy. A recent change was pushed to use an updated version of sqlalchemy, which they believe will address the issue, though I have not confirmed it myself: https://review.openstack.org/#/c/14135/

Would be good to get your feedback as to whether this is the same issue you are seeing.

Kevin L. Mitchell (klmitch) wrote :

The version of Quantum I'm running tests against right now is the one that I pulled this morning, and it looks like that patch merged on the 8th, so I would say that there's still a significant leak somewhere…

Thanks. I'll try and take a look at this after summit.
Thanks
Gary

Gary Kotton (garyk) on 2012-10-22
Changed in quantum:
status: New → Confirmed
David Ripton (dripton) wrote :

Data point:

run_tests.py, on Fedora 17 with 8 GB, takes 115s and top shows a peak RES of 2.2 GB.

After changing all instances of ":memory:" to "/tmp/quantum-unit-tests-db", run_tests.py takes 155s and RES still grows to 1.7 GB. The database file was 112 kB at the end of the test run.

dan wendlandt (danwent) on 2012-10-29
Changed in quantum:
milestone: none → grizzly-1
importance: Undecided → High
dan wendlandt (danwent) on 2012-10-29
Changed in quantum:
assignee: nobody → Gary Kotton (garyk)
dan wendlandt (danwent) on 2012-11-13
Changed in quantum:
assignee: Gary Kotton (garyk) → Mark McClain (markmcclain)
status: Confirmed → In Progress

Fix proposed to branch: master
Review: https://review.openstack.org/16151

Changed in quantum:
assignee: Mark McClain (markmcclain) → Iryoung Jeong (iryoung)
Iryoung Jeong (iryoung) wrote :

Hello,

First, I'd like to say "I'm sorry" if I bothered anyone (someone was already assigned to this bug).

For me, run_tests failed with an out-of-memory error (I only have a VM with 1G of memory for testing quantum :(

So, after a few hours of investigation, I found several leak points; fixing them lets run_tests complete successfully with 1G of memory.

With this patch, the final memory consumption looks like this (on Ubuntu Precise 64-bit):

UID    PID  PPID  C     SZ    RSS PSR STIME TTY   TIME     CMD
xxxx 15613 15607 98 230434 850328   0 15:39 pts/4 00:06:59 python .//run_tests.py

Definitely, it looks like there are more leaks (RSS is still over 800MB), but this can be a starting point for others who'd like to dig further.
(IMHO, these are just tests, so if the production code doesn't leak, this amount of leaked memory is acceptable... for me :)

Hi,
Thank you for trying to address this. I have taken a look and the patch
does not address the memory utilization. If you run top whilst the tests
are running you will see how the memory is "chewed up".
Thanks
Gary

dan wendlandt (danwent) wrote :

thanks iryoung!

I actually do see a significant reduction in the amount that is leaked on my system. Without this patch I see about 1.4 GB at the end of the run, whereas with the patch, I see it top out at about 0.9 GB. Getting below the 1 GB barrier seems like quite a useful step in the right direction to me. Gary, are you able to confirm a reduction?

Gary Kotton (garyk) wrote :

I did not see a reduction in the memory usage. Whilst running the unit tests I ran top in a separate window. Every second it grows and grows. In my opinion the patch is good. It just does not fix the problem that is described above.
Thanks
Gary

dan wendlandt (danwent) wrote :

I see this as well, it's just that the memory grows a bit more slowly than without the change.

Can you measure the peak memory that you reach when this patch is and is not applied? When it is NOT applied, I see 1.4 GB. When it is applied, I see 0.9 GB, which seems like a step in the right direction.
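One way to get a comparable peak figure without watching top is to let the kernel report it after the run. This is only a sketch under stated assumptions: it needs a Unix host with the standard `resource` module, `peak_child_rss_kib` is a hypothetical helper name, and the script name `peak_rss.py` in the comment is made up for illustration.

```python
import resource
import subprocess
import sys


def peak_child_rss_kib(argv):
    """Run argv to completion and return the peak RSS of child processes.

    ru_maxrss is reported in kilobytes on Linux (bytes on macOS). It is the
    maximum over all children waited for so far, so run one command per
    interpreter session when comparing two code versions.
    """
    subprocess.call(argv)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    # Example invocation:  python peak_rss.py python run_tests.py
    if len(sys.argv) > 1:
        print("peak RSS: %d KiB" % peak_child_rss_kib(sys.argv[1:]))
```

Running it once on a checkout with the patch and once without gives two numbers that can be compared directly, instead of eyeballing a moving display.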

Gary Kotton (garyk) on 2012-11-20
Changed in quantum:
milestone: grizzly-1 → grizzly-2
assignee: Iryoung Jeong (iryoung) → Mark McClain (markmcclain)

Fix proposed to branch: master
Review: https://review.openstack.org/16735

Changed in quantum:
assignee: Mark McClain (markmcclain) → Iryoung Jeong (iryoung)
Iryoung Jeong (iryoung) wrote :

One more try :)

I found another spot that was eating memory, so I took the liberty of uploading a new review.

With the new patch, the unit tests can run in under 250M RES.

It looks like there's still room for improvement, but I think sharing early is better.

Cool.

Please note that jenkins gave -1.

Looks like trivial pep8 things -
http://logs.openstack.org/16735/2/check/gate-quantum-pep8/2637/console.html

Thanks
Gary

Gary Kotton (garyk) wrote :

Hi,
I have taken a look at the patch. Great work. I just ran tests with the
code in the teardown. This saved a ton of memory.
If I were you, I'd add this in the first patch. In an additional one I'd
add the changes for the networks, subnets etc.
It will certainly help speed up the approval process.
Thanks
Gary

Reviewed: https://review.openstack.org/16735
Committed: http://github.com/openstack/quantum/commit/8e94da49675310dae2e94a860e515ce5ac16f33f
Submitter: Jenkins
Branch: master

commit 8e94da49675310dae2e94a860e515ce5ac16f33f
Author: Iryoung Jeong <email address hidden>
Date: Thu Nov 22 12:58:47 2012 +0900

    Updates tearDown() to release instance objects

    This change fixes the bug by releasing the objects of the instance
    of class QuantumDbPluginV2TestCase. Removing unnecessary objects
    explicitly reduces the memory required by unit tests.

    Fixes bug 1065276

    Change-Id: Ia003a7718e1aedc4e4c8fb02b723f4a511ebc319
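
The effect described in this commit message can be illustrated with a small, self-contained sketch. It is not the actual patch: `FakePlugin`, `LeakyTestCase`, `ReleasingTestCase` and `fixture_survives` are hypothetical names, and the sketch assumes the test runner keeps each TestCase instance alive until the end of the run, so anything still stored on `self` stays in memory.

```python
import gc
import unittest
import weakref


class FakePlugin(object):
    """Hypothetical stand-in for a heavyweight per-test fixture."""

    def __init__(self):
        self.payload = bytearray(1024 * 1024)  # 1 MiB held per instance


class LeakyTestCase(unittest.TestCase):
    def setUp(self):
        self.plugin = FakePlugin()
        # Weak reference used only to observe whether the fixture is freed.
        self.plugin_ref = weakref.ref(self.plugin)

    def test_something(self):
        self.assertEqual(len(self.plugin.payload), 1024 * 1024)


class ReleasingTestCase(LeakyTestCase):
    def tearDown(self):
        # Drop the strong reference so the fixture can be freed even
        # though the TestCase instance itself is still referenced.
        self.plugin = None
        super(ReleasingTestCase, self).tearDown()


def fixture_survives(case_cls):
    """Run one test and report whether its fixture outlives tearDown()."""
    case = case_cls("test_something")
    result = unittest.TestResult()
    case.run(result)
    assert result.wasSuccessful()
    gc.collect()
    # `case` is still alive here, much as it would be inside a suite.
    return case.plugin_ref() is not None
```

With these definitions, `fixture_survives(LeakyTestCase)` reports that the fixture is still alive after the test, while `fixture_survives(ReleasingTestCase)` reports that it has been released. Multiplied across a large suite, that difference is the kind of saving the commit message refers to.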

Changed in quantum:
status: In Progress → Fix Committed
Gary Kotton (garyk) on 2012-11-26
tags: added: folsom-backport-potential

Reviewed: https://review.openstack.org/16890
Committed: http://github.com/openstack/quantum/commit/c60051ac6b95d0e146f81d04dba0367edc8b9f78
Submitter: Jenkins
Branch: stable/folsom

commit c60051ac6b95d0e146f81d04dba0367edc8b9f78
Author: Iryoung Jeong <email address hidden>
Date: Thu Nov 22 12:58:47 2012 +0900

    Updates tearDown() to release instance objects

    This change fixes the bug by releasing the objects of the instance
    of class QuantumDbPluginV2TestCase. Removing unnecessary objects
    explicitly reduces the memory required by unit tests.

    Fixes bug 1065276

    Change-Id: Ia003a7718e1aedc4e4c8fb02b723f4a511ebc319

tags: added: in-stable-folsom
Gary Kotton (garyk) on 2012-11-27
tags: removed: folsom-backport-potential
Changed in quantum (Ubuntu):
status: New → Fix Released
Changed in quantum (Ubuntu Quantal):
status: New → Confirmed

Hello Kevin, or anyone else affected,

Accepted quantum into quantal-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/quantum/2012.2.1-0ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in quantum (Ubuntu Quantal):
status: Confirmed → Fix Committed
tags: added: verification-needed
Thierry Carrez (ttx) on 2013-01-09
Changed in quantum:
status: Fix Committed → Fix Released
Mark McLoughlin (markmc) on 2013-01-22
tags: removed: in-stable-folsom
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package quantum - 2012.2.1-0ubuntu1

---------------
quantum (2012.2.1-0ubuntu1) quantal-proposed; urgency=low

  * Resynchronize with stable/folsom (1e774867) (LP: #1085255):
    - [aeabb42] There are routing problems when the dnsmasq port does not come
      first in the routing table (LP: #1083238)
    - [04aab72] Quantum linux bridge not optimized with libvirt (LP: #1078210)
    - [ca7fc10] getting quotas from database has severe performance implications
      (LP: #1075369)
    - [66605e8] failed to update an external network into non external network
      (LP: #1083387)
    - [c60051a] Quantum test suite leaks memory like a sieve (LP: #1065276)
    - [3179dfc] clear_db() does incomplete db teardown (LP: #1080988)
    - [c1e19d7] Unauthorized command: cat /proc/None/cmdline (LP: #1077651)
    - [af9e076] At times a instance will not receive an IP address from the DHCP
      agent (LP: #1081664)
    - [e0d1a7d] allow multiple floating-ip on single port if they use different
      fixed ips and/or external nets (LP: #1057844)
    - [8471d79] Delete port fails to gateway ip (LP: #1079980)
    - [aca8b4a] fixed_ip allocation which is not included within
      allocation_pools makes error when delete port or re-create port
      (LP: #1077292)
    - [eacc9d3] Mapping same bridge to different phyiscal networks succeed
      (LP: #1067669)
    - [51b4c82] python-quantum: not region aware (LP: #1080793)
    - [6f0a486] delete floatingip should be in one transaction to delete port
      (LP: #1080516)
    - [db6cda7] Remove qpid configuration variables no longer supported
    - [a112840] Allow NVP plugin to use per-tenant quota extension
    - [82b1a55] Quantum service does not restart after reboot (LP: #1073999)
    - [c01a839] There are some cases that L3 API with an invalid parameter
      returns 500. (LP: #1064765)
    - [26b383f] external network can be plugged also as internal network for one
      router (LP: #1053633)
    - [49f649c] There is a lot of cases that API with an invalid parameter
      returns 500. (LP: #1062046)
    - [4546a18] When create subnet, you con set up the value as cidr (the value
      isn't cidr form). (LP: #1067959)
    - [9ba453a] killfilter should handle updated/deleted executables
      (LP: #1073768)
    - [7c8a55c] a port which is not able to delete is made when floatingip
      create fails. (LP: #1064748)
    - [c9b84cf] Linux bridge port update causes exception (LP: #1072713)
    - [cb57932] I can't add interface to router, if there is another port in
      non-shared network of other tenant (LP: #1057558)
    - [574e278] Ryu plugin does not support Security Groups (LP: #1059393)
    - [607f486] tap device added to integration bridge without tag
      (LP: #1064070)
    - [21a0fdf] L3 agent external network flag (LP: #1056720)
    - [5cbaff4] router create with external_gateway_info fails with 500 always.
      (LP: #1064235)
    - [63b81f6] l3 db operations failed in multiple transactions (LP: #1070335)
    - [bff17fb] Ensure that the SqlSoup import is still supported.
    - [e091a29] l3_nat_agent was renamed to l3_agent
    - [9030969] remove default value of 'local_ip' of 10...

Changed in quantum (Ubuntu Quantal):
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2013-04-04
Changed in quantum:
milestone: grizzly-2 → 2013.1