neutron-ns-metadata-proxy uses ~25MB/router in production

Bug #1524916 reported by Miguel Angel Ajo
This bug affects 2 people
Affects    Status          Importance    Assigned to       Milestone
neutron    Fix Released    Medium        Daniel Alvarez
tripleo    Fix Released    High          Unassigned

Bug Description

[root@mac6cae8b61e442 memexplore]# ./memexplore.py all metadata-proxy | cut -c 1-67
25778 kB (pid 420) /usr/bin/python /bin/neutron-ns-metadata-proxy -
25774 kB (pid 1468) /usr/bin/python /bin/neutron-ns-metadata-proxy
25778 kB (pid 1472) /usr/bin/python /bin/neutron-ns-metadata-proxy
25770 kB (pid 1474) /usr/bin/python /bin/neutron-ns-metadata-proxy
26528 kB (pid 1489) /usr/bin/python /bin/neutron-ns-metadata-proxy
25778 kB (pid 1520) /usr/bin/python /bin/neutron-ns-metadata-proxy
25778 kB (pid 1738) /usr/bin/python /bin/neutron-ns-metadata-proxy
25774 kB (pid 1814) /usr/bin/python /bin/neutron-ns-metadata-proxy
25774 kB (pid 2024) /usr/bin/python /bin/neutron-ns-metadata-proxy
25774 kB (pid 3961) /usr/bin/python /bin/neutron-ns-metadata-proxy
25774 kB (pid 4076) /usr/bin/python /bin/neutron-ns-metadata-proxy
25770 kB (pid 4099) /usr/bin/python /bin/neutron-ns-metadata-proxy
[...]
25778 kB (pid 31386) /usr/bin/python /bin/neutron-ns-metadata-proxy
25778 kB (pid 31403) /usr/bin/python /bin/neutron-ns-metadata-proxy
25774 kB (pid 31416) /usr/bin/python /bin/neutron-ns-metadata-proxy
25778 kB (pid 31453) /usr/bin/python /bin/neutron-ns-metadata-proxy
25770 kB (pid 31483) /usr/bin/python /bin/neutron-ns-metadata-proxy
25770 kB (pid 31647) /usr/bin/python /bin/neutron-ns-metadata-proxy
25774 kB (pid 31743) /usr/bin/python /bin/neutron-ns-metadata-proxy

2,581,230 kB Total PSS

if we look explicitly at one of those processes we see:

# ./memexplore.py pss 24039
0 kB 7f97db981000-7f97dbb81000 ---p 0005f000 fd:00 4298776438 /usr/lib64/libpcre.so.1.2.0
0 kB 7f97dbb83000-7f97dbba4000 r-xp 00000000 fd:00 4298776486 /usr/lib64/libselinux.so.1
0 kB 7fff16ffe000-7fff17000000 r-xp 00000000 00:00 0 [vdso]
0 kB 7f97dacb5000-7f97dacd1000 r-xp 00000000 fd:00 4298779123 /usr/lib64/python2.7/lib-dynload/_io.so
0 kB 7f97d6a06000-7f97d6c05000 ---p 000b1000 fd:00 4298777149 /usr/lib64/libsqlite3.so.0.8.6
[...]
0 kB 7f97d813a000-7f97d8339000 ---p 0000b000 fd:00 4298779157 /usr/lib64/python2.7/lib-dynload/pyexpat.so
0 kB 7f97dbba4000-7f97dbda4000 ---p 00021000 fd:00 4298776486 /usr/lib64/libselinux.so.1
0 kB 7f97db4f7000-7f97db4fb000 r-xp 00000000 fd:00 4298779139 /usr/lib64/python2.7/lib-dynload/cStringIO.so
0 kB 7f97dc81e000-7f97dc81f000 rw-p 00000000 00:00 0
0 kB 7f97d8545000-7f97d8557000 r-xp 00000000 fd:00 4298779138 /usr/lib64/python2.7/lib-dynload/cPickle.so
0 kB 7f97d9fd3000-7f97d9fd7000 r-xp 00000000 fd:00 4298779165 /usr/lib64/python2.7/lib-dynload/timemodule.so
0 kB 7f97d99c4000-7f97d9bc3000 ---p 00002000 fd:00 4298779147 /usr/lib64/python2.7/lib-dynload/grpmodule.so
0 kB 7f97daedb000-7f97daede000 r-xp 00000000 fd:00 4298779121 /usr/lib64/python2.7/lib-dynload/_heapq.so
0 kB 7f97ddfd4000-7f97ddfd7000 r-xp 00000000 fd:00 4298779119 /usr/lib64/python2.7/lib-dynload/_functoolsmodule.so
0 kB 7f97d8b67000-7f97d8b78000 r-xp 00000000 fd:00 4298779141 /usr/lib64/python2.7/lib-dynload/datetime.so
0 kB 7f97d7631000-7f97d7635000 r-xp 00000000 fd:00 4298776496 /usr/lib64/libuuid.so.1.3.0
0 kB 7f97dd59e000-7f97dd5a6000 r-xp 00000000 fd:00 4298779132 /usr/lib64/python2.7/lib-dynload/_ssl.so
0 kB 7f97dbfc0000-7f97dbfc2000 rw-p 00000000 00:00 0
0 kB 7f97dd332000-7f97dd394000 r-xp 00000000 fd:00 4298776137 /usr/lib64/libssl.so.1.0.1e
0 kB 7f97d6e22000-7f97d7021000 ---p 00004000 fd:00 6442649369 /usr/lib64/python2.7/site-packages/sqlalchemy/cresultproxy.so
0 kB 7f97d95bb000-7f97d97ba000 ---p 0000b000 fd:00 4298779156 /usr/lib64/python2.7/lib-dynload/parsermodule.so
0 kB 7f97da3dd000-7f97da3e0000 r-xp 00000000 fd:00 4298779129 /usr/lib64/python2.7/lib-dynload/_randommodule.so
0 kB 7f97dddcf000-7f97dddd3000 r-xp 00000000 fd:00 4298779125 /usr/lib64/python2.7/lib-dynload/_localemodule.so
0 kB 7f97da7e5000-7f97da7ea000 r-xp 00000000 fd:00 4298779136 /usr/lib64/python2.7/lib-dynload/binascii.so
2 kB 7f97e490a000-7f97e4ac0000 r-xp 00000000 fd:00 4299921917 /usr/lib64/libc-2.17.so
3 kB 7f97d6955000-7f97d6a06000 r-xp 00000000 fd:00 4298777149 /usr/lib64/libsqlite3.so.0.8.6
4 kB 7f97d7428000-7f97d7429000 r--p 00002000 fd:00 6442649368 /usr/lib64/python2.7/site-packages/sqlalchemy/cprocessors.so
4 kB 7f97d7630000-7f97d7631000 rw-p 00006000 fd:00 4298779128 /usr/lib64/python2.7/lib-dynload/_multiprocessing.so
4 kB 7f97d95a8000-7f97d95a9000 r--p 00010000 fd:00 2147488545 [...]
/usr/lib64/python2.7/site-packages/OpenSSL/SSL.so
16 kB 7f97d7c58000-7f97d7c5c000 rw-p 0001a000 fd:00 4298779115 /usr/lib64/python2.7/lib-dynload/_ctypes.so
16 kB 7f97dd32e000-7f97dd332000 rw-p 00000000 00:00 0
16 kB 7f97dd9b6000-7f97dd9bb000 rw-p 0000f000 fd:00 4298779130 /usr/lib64/python2.7/lib-dynload/_socketmodule.so
16 kB 7f97dd593000-7f97dd597000 r--p 00061000 fd:00 4298776137 /usr/lib64/libssl.so.1.0.1e
16 kB 7f97e4cc0000-7f97e4cc4000 r--p 001b6000 fd:00 4299921917 /usr/lib64/libc-2.17.so
20 kB 7f97db2ea000-7f97db2ef000 rw-p 0000a000 fd:00 4298779149 /usr/lib64/python2.7/lib-dynload/itertoolsmodule.so
28 kB 7f97dd597000-7f97dd59e000 rw-p 00065000 fd:00 4298776137 /usr/lib64/libssl.so.1.0.1e
28 kB 7f97d95a9000-7f97d95b0000 rw-p 00011000 fd:00 2147488545 /usr/lib64/python2.7/site-packages/OpenSSL/crypto.so
40 kB 7f97daed1000-7f97daedb000 rw-p 0001c000 fd:00 4298779123 /usr/lib64/python2.7/lib-dynload/_io.so
48 kB 7f97dd322000-7f97dd32e000 rw-p 001d5000 fd:00 4298776134 /usr/lib64/libcrypto.so.1.0.1e
48 kB 7fff16e88000-7fff16ea9000 rw-p 00000000 00:00 0 [stack]
52 kB 7f97dccf3000-7f97dcd00000 r--p 000d0000 fd:00 4298778191 /usr/lib64/libkrb5.so.3.3
60 kB 7f97e59a7000-7f97e59b6000 rw-p 00000000 00:00 0
104 kB 7f97dd308000-7f97dd322000 r--p 001bb000 fd:00 4298776134 /usr/lib64/libcrypto.so.1.0.1e
156 kB 03b92000-03bd4000 rw-p 00000000 00:00 0 [heap]
220 kB 7f97e5969000-7f97e59a7000 rw-p 00179000 fd:00 4298778899 /usr/lib64/libpython2.7.so.1.0
532 kB 7f97e5b48000-7f97e5bcf000 rw-p 00000000 00:00 0
768 kB 7f97e5a54000-7f97e5b17000 rw-p 00000000 00:00 0
772 kB 7f97dabf4000-7f97dacb5000 rw-p 00000000 00:00 0
22192 kB 025d2000-03b92000 rw-p 00000000 00:00 0 [heap]
Total Pss: 25778 kB

The largest contributor by far is Python's heap (the tool doesn't show whose heap that is, but if I look manually it's /usr/bin/python2.7's heap).

For reference, a bare python waiting at the interactive prompt has a Pss of 2930kB, of which 984kB is the python heap.

memexplore can be found here: https://github.com/mangelajo/memexplore

Assaf Muller (amuller)
Changed in neutron:
status: New → Confirmed
tags: added: loadimpact
Changed in neutron:
importance: Undecided → High
Assaf Muller (amuller)
tags: added: l3-ipam-dhcp
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

Memory is shared, swapped, dirty etc. 66MB per 99 routers is not going to block ~7GB of RAM. The picture must be more complex than that.

Changed in neutron:
status: Confirmed → Incomplete
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

The source of OOM kill must be further identified, please provide more info about the environment; smaps should give you a more granular view of what's going on per process, and meminfo should also give us a sense of the overall memory usage.

description: updated
Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

Hi @armax, I will try to grab a more complete picture of this. I'm not very familiar with OOM kernel dumps, but I summed up all the entries in the total_vm + rss columns for all processes, and that is the amount of memory available in the machine at the time of the oom-kill.

Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

Ok, I educated myself a bit about my ignorance (plenty of demonstration in #3), and I believe the most reasonable parameter to look at is Pss, which is the Rss with each shared page weighted by the number of processes sharing it (so if you sum the Pss of all processes you may get the real memory usage).

I didn't find any reasonable tool, but I wrote a tiny one for the purpose. I'm updating the bug comments.
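As a rough illustration of the Pss approach described above, the sketch below sums the `Pss:` entries of a process's `/proc/<pid>/smaps`. It is not the memexplore code; the function names and the sample text are assumptions made for this example (the Size and Rss figures in the sample are invented).

```python
import re

# Matches lines such as "Pss:                 156 kB" in /proc/<pid>/smaps.
# Newer kernels also emit "Pss_Anon:" etc.; those are deliberately not matched.
_PSS_LINE = re.compile(r"^Pss:\s+(\d+)\s+kB$", re.MULTILINE)


def total_pss_kb(smaps_text):
    """Sum every Pss entry (in kB) found in the text of a smaps file."""
    return sum(int(kb) for kb in _PSS_LINE.findall(smaps_text))


def read_pss_kb(pid):
    """Read /proc/<pid>/smaps (Linux only) and return the total Pss in kB."""
    with open("/proc/%d/smaps" % pid) as f:
        return total_pss_kb(f.read())


# Hypothetical two-mapping excerpt, loosely modelled on the listing above.
SAMPLE = """\
025d2000-03b92000 rw-p 00000000 00:00 0                                  [heap]
Size:              22272 kB
Rss:               22192 kB
Pss:               22192 kB
7f97e490a000-7f97e4ac0000 r-xp 00000000 fd:00 4299921917 /usr/lib64/libc-2.17.so
Size:               1752 kB
Rss:                 400 kB
Pss:                   2 kB
"""

if __name__ == "__main__":
    print("%d kB" % total_pss_kb(SAMPLE))
```

Because each shared page is already divided among its users in the Pss figure, adding up `read_pss_kb(pid)` over all proxy processes should approximate their combined real footprint.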

It's half as bad as we calculated initially.

summary: - neutron-ns-metadata-proxy uses ~65MB/router in production, generating
- OOM situations
+ neutron-ns-metadata-proxy uses ~25MB/router in production
description: updated
Changed in neutron:
status: Incomplete → New
description: updated
Revision history for this message
Hong Hui Xiao (xiaohhui) wrote :

We have the same problem on a large-scale neutron node (about 1000 routers). In the end, we had to turn off metadata for the routers.

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

A thousand routers per node sounds pretty reasonable to me. You'd be just about running out of other resources before running out of memory.

Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

I must admit, I've always tended to share @armax's opinion here, but I've seen cloud use cases where network usage is very light for most tenants, and yet the resources have to be there, consuming memory.

Everything else we start in namespaces is very light; this is the only exception.

But I still wonder if the effort is worth it.

Revision history for this message
Carl Baldwin (carl-baldwin) wrote :

I tend to share @Armando and @Ajo's sentiment here. I have run nodes with 100s and close to 1000 routers. I recall that memory usage was a concern but it wasn't our top concern. But, I concede that others may have more lightly used routers and 25MB for a single instance of metadata proxy is unfortunate.

My question is this. Is this really a High importance bug? It has been around for a long time and hasn't seen a lot of complaint.

Revision history for this message
Assaf Muller (amuller) wrote :

I think it's a serious scale issue. 25 MBs x 500 routers amounts to 12.5 GBs of RAM, that's ridiculous.

Revision history for this message
Carl Baldwin (carl-baldwin) wrote :

If you've got 500 routers, I think it is safe to assume that you have at least 500 VMs behind them. If you compare 12.5 GB of RAM with the resources that those 500 VMs use in your infrastructure then it seems pretty small. I'm not saying that it isn't a waste of resources. I'm just questioning whether this really needs to be High importance. Are we looking at this with the right perspective?

Revision history for this message
Assaf Muller (amuller) wrote :

If you want to scale to 500 routers on a single machine, the amount of RAM used by the metadata proxy, of all things, should not be the limiting factor.

Revision history for this message
Carl Baldwin (carl-baldwin) wrote :

I don't disagree with you. I think 25GB is a bit ridiculous. I'd be happy if someone who felt strongly about this picked it up and fixed it.

I'm just arguing the relative importance of this bug. If you're going to run 500 routers on a single machine, I'm thinking you're not going to try doing it on an under-powered machine. Also, I think to be marked High, it should have more than just a couple of people complaining about it.

I have run a machine with well over 500 routers on it. I noticed the memory usage of the metadata proxy but it wasn't on the top of our issues and I didn't see an obvious solution. Higher on my list were things like how long it took to reboot the machine and rebuild all of those routers. Or, how long it took to reschedule the routers to another machine when it went belly up. There were also concerns about kernel lock contention, etc.

I hate to say "throw more memory at the problem", but it was hard to justify spending engineering time on it when a little bit more RAM was all that was needed. Some might not consider 12.5 GB a little bit more RAM, but if there are well over 500 VMs behind the routers then I have trouble imagining that it isn't.

Revision history for this message
Assaf Muller (amuller) wrote :

I think finding someone that can sit down and fix this is far more important than the 'importance' flag.

I think it's important. In the same breath, Miguel and I cannot commit to fix it at this time.

I think that starting a discussion of what a sane approach would look like is a good first step. If we can come up with a plan, maybe we could find someone to implement that plan.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/259398

Changed in neutron:
assignee: nobody → Oleg Bondarev (obondarev)
status: New → In Progress
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

I tried to give clarity on the Importance field for bugs in [1,2].

This is not High. There's a workaround to address this issue: either balance your nodes, add more nodes, add more memory, etc.

Let's not get confused by LP importance descriptions. The urgency of a fix is not necessarily a direct consequence of the severity of the issue. I may as well have a rare blocker bug that has minimal impact and thus doesn't force me to fix it promptly.

[1] http://docs.openstack.org/developer/neutron/policies/bugs.html#bug-screening-best-practices
[2] https://wiki.openstack.org/wiki/Bugs#Importance

Changed in neutron:
importance: High → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/259398
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

Do we have a resolution here?

Changed in neutron:
milestone: none → mitaka-rc1
Changed in neutron:
assignee: Oleg Bondarev (obondarev) → Brian Haley (brian-haley)
Changed in neutron:
milestone: mitaka-rc1 → newton-1
Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

I wonder how the OVN guys plan to offer metadata in a native l3 implementation.

After thinking of this issue for months, I cannot find a decent way to do this using existing tools, leveraging openflow, or iptables without the risk of exposing the host network to the tenant.

The most reasonable way to tackle this and help scalability IMHO is writing a tiny daemon in C/C++ (I estimate <1 month or so for an experienced C developer). It's just a matter of taking HTTP requests, augmenting them with X-Neutron-Router-ID or X-Neutron-Network-ID, and forwarding them to the unix socket & back while translating error responses. [1]

I'd be glad to take on that if my managers ( ;-) ) wanted to devote some of my time to it, and the community was happy to accept a C version of the metadata agent in tree (I'd prefer to avoid a separate project/repo for something of a magnitude of <1000LOC)

I have like >10 years of C/C++ writing experience, and I know about security and how modern exploits work, so my code (or code under my review) would be likely to pass any audit.

@russellb, @amuller, @* any thoughts on this?

Another alternative could be Go, as long as we can make our executable dynamically linked (as opposed to the statically linked Go standard, which would again hurt RSS/PSS memory ratios).

[1] https://github.com/openstack/neutron/blob/master/neutron/agent/metadata/namespace_proxy.py
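The "augmenting" step of the proxy described in this comment can be sketched in a few lines. This is only an illustration in Python, not the existing namespace_proxy.py code nor the proposed C daemon. The function name is hypothetical, and dropping a client-supplied copy of the header is an extra precaution assumed for this example.

```python
def add_neutron_header(raw_request, name, value):
    """Insert 'name: value' right after the request line of a raw HTTP/1.x request.

    raw_request must contain the complete header block (terminated by an
    empty line); anything after it is passed through untouched.
    """
    head, sep, body = raw_request.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete HTTP request head")
    lines = head.split(b"\r\n")
    # Assumption: drop any copy of the header sent by the instance itself,
    # so the ID always comes from the proxy.
    prefix = name.lower().encode("ascii") + b":"
    kept = [lines[0]] + [l for l in lines[1:] if not l.lower().startswith(prefix)]
    kept.insert(1, ("%s: %s" % (name, value)).encode("ascii"))
    return b"\r\n".join(kept) + sep + body


if __name__ == "__main__":
    # Example request; "router-1" is a made-up router ID.
    request = (b"GET /latest/meta-data/ HTTP/1.1\r\n"
               b"Host: 169.254.169.254\r\n\r\n")
    print(add_neutron_header(request, "X-Neutron-Router-ID", "router-1"))
```

A real proxy would then write the returned bytes to the metadata agent's unix socket and relay the response back to the instance.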

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :
Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

Ok, without going to the C extreme, an alternative could be a neutron-ns-metadata-proxy on a diet:

2930kB for a bare python suggests that if we cut down most external dependencies and stick to the stdlib (argparse, sockets, etc.) we could hit a 3-3.5MB target with python.

As per IRC discussion, that seems a reasonable alternative.

Thoughts? @obondarev, @halleyb, were you looking at those things?

Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

Sorry, I missed the work being done in https://review.openstack.org/#/c/259398/5, I posted some comments. Thanks @obondarev.

Revision history for this message
Doug Wiegley (dougwig) wrote :

nginx per namespace would be a stock package and much smaller (9k resident when i checked.)

Changed in neutron:
assignee: Brian Haley (brian-haley) → Doug Wiegley (dougwig)
tags: added: scale
Changed in neutron:
milestone: newton-1 → newton-2
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/259398
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

Do we have an update on the nginx approach?

tags: added: low-hanging-fruit
description: updated
Changed in neutron:
milestone: newton-2 → newton-3
Revision history for this message
Ethan Lynn (ethanlynn) wrote :

Any updates? The same problem occurs with keepalived_state_change.

Changed in neutron:
milestone: newton-3 → newton-rc1
Revision history for this message
Ihar Hrachyshka (ihar-hrachyshka) wrote :

Doug, any patches to switch to nginx?

Changed in neutron:
milestone: newton-rc1 → ocata-1
Assaf Muller (amuller)
Changed in neutron:
status: In Progress → Confirmed
assignee: Doug Wiegley (dougwig) → nobody
Changed in neutron:
assignee: nobody → Assaf Muller (amuller)
status: Confirmed → In Progress
Revision history for this message
Anton Aksola (aakso) wrote :

If somebody needs a quick workaround, here is our HAProxy based solution:
https://gist.github.com/aakso/1e9dab65edf0f392d413f125a7bdcf0c

The script is a drop-in replacement for neutron-ns-metadata-proxy

The RSS footprint seems to be 0.9-1M in our environment. Please test before using in prod.

Doug Wiegley (dougwig)
Changed in neutron:
assignee: Assaf Muller (amuller) → Doug Wiegley (dougwig)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/390565

Changed in neutron:
milestone: ocata-1 → ocata-2
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Oleg Bondarev (<email address hidden>) on branch: master
Review: https://review.openstack.org/259398
Reason: in favor of nginx approach https://review.openstack.org/#/c/390565/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/390565
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Changed in neutron:
milestone: ocata-2 → ocata-3
Changed in neutron:
milestone: ocata-3 → ocata-rc1
Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

Oh, I see we had some work done around using nginx as a replacement, and I didn't see the patches, yikes.

The other alternative, as in the example in [1], was a haproxy-based replacement.

haproxy is widely available and supported on all distributions, while nginx is not. On the other hand, I don't know if either of them has a technical advantage or disadvantage over the other.

Of course, whoever is doing the field work is using nginx and I'm perfectly fine with that; I wouldn't mind if nginx were more widely supported.

I will keep up with reviews if this restarts anytime.

[1] https://gist.github.com/aakso/1e9dab65edf0f392d413f125a7bdcf0c

Revision history for this message
Alejandro Comisario (alejandro-f) wrote :

We are hitting exactly the same problem on Ubuntu 14.04 with Mitaka packages.
Is it worthwhile to test the workaround from comment #25? This is tearing down our customer's cloud once a day.

Changed in neutron:
milestone: ocata-rc1 → pike-1
assignee: Doug Wiegley (dougwig) → nobody
status: In Progress → Confirmed
tags: removed: scale
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/431691

Changed in neutron:
assignee: nobody → Daniel Alvarez (dalvarezs)
status: Confirmed → In Progress
Revision history for this message
Jakub Libosvar (libosvar) wrote :

I added TripleO project here to not forget to update l3 and dhcp rootwrap filters after upgrade to Pike. I hope it's the right project, if there is a dedicated project that handles upgrades, please tell us :)

Revision history for this message
Daniel Alvarez (dalvarezs) wrote :

@Jakub, well noted :)
I think that filters are copied over on upgrade since rpm packages are installed [0] and filters are moved during installation as in the spec file [1]. However, it would be nice if someone else could confirm.

[0] https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/tripleo-packages.yaml#L46
[1] https://github.com/rdo-packages/neutron-distgit/blob/rpm-master/openstack-neutron.spec#L377

Revision history for this message
Miguel Angel Ajo (mangelajo) wrote : Re: [Bug 1524916] Re: neutron-ns-metadata-proxy uses ~25MB/router in production

Do we need to do that manually? Isn't it handled by the RPM update as
usual?

On Tue, Mar 7, 2017 at 10:00 AM, Jakub Libosvar <email address hidden> wrote:

> I added TripleO project here to not forget to update l3 and dhcp
> rootwrap filters after upgrade to Pike. I hope it's the right project,
> if there is a dedicated project that handles upgrades, please tell us :)
>
> ** Also affects: tripleo
> Importance: Undecided
> Status: New

Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

Right, I believe rpm update will handle it.

Changed in tripleo:
status: New → Triaged
importance: Undecided → High
milestone: none → pike-3
milestone: pike-3 → pike-2
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/431691
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=3b22541a2aa9a5b06e2bff256701dbe24554c17c
Submitter: Jenkins
Branch: master

commit 3b22541a2aa9a5b06e2bff256701dbe24554c17c
Author: Daniel Alvarez <email address hidden>
Date: Thu Feb 9 18:30:23 2017 +0000

    Switch ns-metadata-proxy to haproxy

    Due to the high memory footprint of current Python ns-metadata-proxy,
    it has to be replaced with a lighter process to avoid OOM conditions in
    large environments.

    This patch spawns haproxy through a process monitor using a pidfile.
    This allows tracking the process and respawn it if necessary as it was
    done before. Also, it implements an upgrade path which consists of
    detecting any running Python instance of ns-metadata-proxy and
    replacing them by haproxy. Therefore, upgrades will take place by
    simply restarting neutron-l3-agent and neutron-dhcp-agent.

    According to /proc/<pid>/smaps, memory footprint goes down from ~50MB
    to ~1.5MB.

    Also, haproxy is added to bindep in order to ensure that it's installed.

    UpgradeImpact

    Depends-On: I36a5531cacc21c0d4bb7f20d4bec6da65d04c262
    Depends-On: Ia37368a7ff38ea48c683a7bad76f87697e194b04

    Closes-Bug: #1524916
    Change-Id: I5a75cc582dca48defafb440207d10e2f7b4f218b
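
To make the haproxy approach in the commit message more concrete, here is a minimal sketch of how a per-namespace haproxy configuration could be rendered. It is not the template from the merged patch. The function name, timeout values, socket path, pidfile path, and the example port are assumptions for illustration. Only the overall idea (listen in the namespace, add the X-Neutron-*-ID header, forward to the metadata agent's unix socket, write a pidfile) comes from this thread.

```python
def render_haproxy_cfg(host, port, unix_socket_path, pidfile, res_type, res_id):
    """Return haproxy configuration text for one router or network namespace.

    res_type is "Router" or "Network" and selects which X-Neutron-*-ID
    header is added to every forwarded request.
    """
    if res_type not in ("Router", "Network"):
        raise ValueError("res_type must be 'Router' or 'Network'")
    return """\
global
    maxconn 1024
    pidfile {pidfile}
    daemon

defaults
    mode http
    timeout connect 30s
    timeout client 32s
    timeout server 32s

listen listener
    bind {host}:{port}
    server metadata unix@{unix_socket_path}
    http-request add-header X-Neutron-{res_type}-ID {res_id}
""".format(host=host, port=port, unix_socket_path=unix_socket_path,
           pidfile=pidfile, res_type=res_type, res_id=res_id)


if __name__ == "__main__":
    # All values below are made-up examples.
    print(render_haproxy_cfg("0.0.0.0", 9697, "/var/lib/neutron/metadata_proxy",
                             "/var/lib/neutron/external/pids/router-1.pid",
                             "Router", "router-1"))
```

The pidfile is what lets a process monitor track the haproxy instance and respawn it, as the commit message describes.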

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 11.0.0.0b1

This issue was fixed in the openstack/neutron 11.0.0.0b1 development milestone.

Changed in tripleo:
milestone: pike-2 → pike-3
Changed in tripleo:
milestone: pike-3 → pike-rc1
Revision history for this message
Ben Nemec (bnemec) wrote :

As others have noted, the rpm upgrade process should handle updating the rootwrap filters. The only exception would be if a user edited them after installation, but in that case they're responsible for merging in the updated ones themselves.

Changed in tripleo:
status: Triaged → Fix Released