Quantum-gateway charm should set mhash_entries=16000000 mphash_entries=16000000 on kernel command line

Bug #1376958 reported by Dave Chiluk
This bug affects 2 people
Affects                                     Status    Importance  Assigned to  Milestone
OpenStack Neutron Gateway Charm             Triaged   Medium      Unassigned
neutron-gateway (Juju Charms Collection)    Invalid   Medium      Unassigned

Bug Description

With the default-sized hashtables, creation and deletion of instances in their own namespaces becomes prohibitively expensive. Without this change, namespace creation and deletion can take minutes instead of milliseconds once a large number of namespaces (more than roughly 4000) already exist on the machine.

The relevant change landed as commit 742ceaba530995da02ef5e5ac32f1478d60efd35 in the ubuntu-trusty kernel tree:
http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-trusty.git;a=commitdiff;h=742ceaba530995da02ef5e5ac32f1478d60efd35

Please also see https://bugs.launchpad.net/linux/+bug/1328088, as its use case directly applies to this tunable.

As of that commit, mhash_entries and mphash_entries are settable kernel options. They size a hashtable whose required number of entries grows with the square of the number of namespaces on the machine: with 4000 network namespaces on the neutron gateway, 4000^2 = 16000000 entries are needed in the hashtable. The default was recently increased to 262144 entries (4 MB page size / sizeof(struct list_head)), which is only enough to store 512 network namespaces without collisions.
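
For illustration, here is the sizing arithmetic spelled out (a minimal sketch in Python; the constant and function names are mine, not part of any charm or kernel code):

    # Illustrative arithmetic only; assumes sizeof(struct list_head) == 16 on 64-bit.
    LIST_HEAD_SIZE = 16
    DEFAULT_ENTRIES = 4 * 1024 * 1024 // LIST_HEAD_SIZE   # 262144 entries by default

    def required_entries(expected_namespaces):
        # Entries needed to hold expected_namespaces^2 mounts without collisions.
        return expected_namespaces ** 2

    print(int(DEFAULT_ENTRIES ** 0.5))   # 512 namespaces fit in the default table
    print(required_entries(4000))        # 16000000, the value recommended here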

So it is the recommendation of this engineer that deployments with large numbers of network namespaces add "mhash_entries=16000000 mphash_entries=16000000" to the kernel command line of at least the neutron gateway nodes.

This would be enough to fit 4000 namespaces without collisions.
My suggestion is that this should be exposed as an expected-number-of-namespaces config option with a default value of 4000 for neutron nodes. The charm should then square that value, edit GRUB_CMDLINE_LINUX in /etc/default/grub accordingly, and run update-grub; a rough sketch follows below.
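
As a rough sketch of what the charm-side change could look like (hypothetical helper names and config handling; this is not existing charm code, and error handling is omitted):

    # Hypothetical sketch: derive kernel args from an expected-namespaces value,
    # append them to GRUB_CMDLINE_LINUX in /etc/default/grub, then run update-grub.
    import re
    import subprocess

    GRUB_DEFAULT = '/etc/default/grub'

    def render_kernel_args(expected_namespaces=4000):
        entries = expected_namespaces ** 2
        return 'mhash_entries={0} mphash_entries={0}'.format(entries)

    def update_grub_cmdline(extra_args):
        with open(GRUB_DEFAULT) as f:
            lines = f.readlines()
        for i, line in enumerate(lines):
            if line.startswith('GRUB_CMDLINE_LINUX=') and extra_args not in line:
                # Append the new args just before the closing quote.
                lines[i] = re.sub(r'"\s*$', ' ' + extra_args + '"', line.rstrip()) + '\n'
        with open(GRUB_DEFAULT, 'w') as f:
            f.writelines(lines)
        subprocess.check_call(['update-grub'])
        # A reboot is still required before the new command line takes effect.

    update_grub_cmdline(render_kernel_args(4000))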

Dave Chiluk (chiluk)
description: updated
description: updated
Revision history for this message
Dave Chiluk (chiluk) wrote :

It is also important to note that with "mhash_entries=16000000 mphash_entries=16000000" set, each of these tables will take 256 MB of space.
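
For reference, that figure follows directly from the entry count (again assuming sizeof(struct list_head) is 16 bytes on 64-bit):

    # 16,000,000 entries per table, 16 bytes each:
    print(16000000 * 16 / 1e6)   # 256.0 MB per table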

In order to avoid this large memory footprint, the way ip creates a new namespace via CLONE_NEWNS would have to be completely reworked, from userspace all the way down to the kernel.

tags: added: cts
tags: added: openstack
Revision history for this message
James Page (james-page) wrote :

The gateway charm is the only one which makes use of namespaces, so the scope makes sense to me; I'm assuming that this will require a reboot to take effect? If that is the case then we can configure this for now in the charm, but we can't effectively apply this change without causing hook execution.

As a workaround for now, I'd recommend applying this globally across all deployed servers by using MAAS to provide additional kernel parameters.

Changed in quantum-gateway (Juju Charms Collection):
importance: Undecided → High
status: New → Triaged
Revision history for this message
James Page (james-page) wrote :

*causing hook execution failures.

Juju is growing a reboot command for charms to use; we'll splice that in as and when it's released.

Changed in quantum-gateway (Juju Charms Collection):
importance: High → Medium
Ryan Beisner (1chb1n)
tags: added: uosci
Revision history for this message
Ryan Beisner (1chb1n) wrote :

I believe UOSCI may be seeing the symptoms of this, or something very similar.

Environment:
Tenant with 1 neutron router and 15 connected networks and subnets, with a somewhat high volume of short-lived instances (~100 to 200 built and torn down each day). Max concurrent instances in the tenant are generally 60 to 80, and they hit all networks roughly equally in rotation.

Observation:
After a month or so of nova booting and/or juju deploying around 200 instances per day, we start to see 'no network' issues in 7 to 12% of instance boot attempts. Once the issue arises, it is persistently present and debilitating, despite the deployment having worked flawlessly for a month or more beforehand. We have seen this cycle twice now, and hit it again today.

Impact:
We end up seeing false failures in our deployment testing, which requires a manual workaround once detected.

Workaround:
Delete the neutron nets and subnets, but not the router, and re-add the nets and subnets. Then it hums along happily again until some unknown point in the future, when we start to experience 'nonet' bootstrap failures again, possibly coinciding with this bug.

The common symptom / indicator is present in nova console-log for the affected instances:
    cloud-init-nonet[134.04]: gave up waiting for a network device

From the tenant perspective, log inspection has yielded no other useful indicators to me. Granted, my host log inspection has not been thorough on this to date, as we generally need to resolve the issue ASAP to restore services.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

I dug into this a bit more as we are living the issue today ;)

Please review and comment here regarding: http://paste.ubuntu.com/9239628/

Thank you.

James Page (james-page)
affects: quantum-gateway (Juju Charms Collection) → neutron-gateway (Juju Charms Collection)
James Page (james-page)
Changed in charm-neutron-gateway:
importance: Undecided → Medium
status: New → Triaged
Changed in neutron-gateway (Juju Charms Collection):
status: Triaged → Invalid