While spinning up VMs at large scale, some VMs get multiple IPs and multiple ports get created

Bug #1565105 reported by Manjeet Singh Bhatia
This bug affects 2 people
Affects    Status          Importance    Assigned to    Milestone
kolla      Fix Released    Critical      Steven Dake
Mitaka     Fix Released    Critical      Steven Dake

Bug Description

Logs from neutron_server are attached.

I am not getting any error logs in RabbitMQ.

http://paste.openstack.org/show/492785/

Revision history for this message
Manjeet Singh Bhatia (manjeet-s-bhatia) wrote :
Akihiro Motoki (amotoki) wrote :

The pasted logs are about: Failed to bind port (in neutron log) and TimeoutError: QueuePool limit (nova scheduler log), but the bug title is 'some vms get multiple ips and multiple ports get created'. I don't understand how these logs are related to the bug filed.
Could you explain more?

Changed in neutron:
status: New → Incomplete
Steven Dake (sdake) wrote :

TimeoutError: QueuePool limit of size 5 overflow 50 reached

Perhaps we should try raising the queue pool limit or the AMQP pool timeouts for all services in large-scale deployments. Can you try that?
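
To make the numbers in that error concrete, here is a minimal sketch of how a bounded pool behaves once `size + overflow` connections are checked out. `ToyPool`, `PoolTimeout` and `count_rejected` are hypothetical names for a toy model built on the standard library; this is not SQLAlchemy's actual implementation, and the 0.01 s timeout is chosen only to keep the example fast.

```python
import threading


class PoolTimeout(Exception):
    """Raised when no connection slot frees up within the timeout."""


class ToyPool:
    """Toy model of a bounded connection pool (assumption: not SQLAlchemy's code)."""

    def __init__(self, pool_size, max_overflow, timeout):
        # Hard ceiling on simultaneously checked-out connections.
        self.capacity = pool_size + max_overflow
        self.timeout = timeout
        self._slots = threading.BoundedSemaphore(self.capacity)

    def checkout(self):
        # Wait up to `timeout` seconds for a free slot, then give up.
        if not self._slots.acquire(timeout=self.timeout):
            raise PoolTimeout(
                f"pool limit of {self.capacity} connections reached, "
                f"timed out after {self.timeout}s"
            )

    def checkin(self):
        # Return a slot so that a waiting caller can proceed.
        self._slots.release()


def count_rejected(pool, requests):
    """Check out `requests` connections without returning any; count failures."""
    rejected = 0
    for _ in range(requests):
        try:
            pool.checkout()
        except PoolTimeout:
            rejected += 1
    return rejected


default_pool = ToyPool(pool_size=5, max_overflow=50, timeout=0.01)
print(default_pool.capacity)             # 55
print(count_rejected(default_pool, 60))  # 5
```

With size 5 and overflow 50, the 56th simultaneous checkout has to wait, and it fails if nothing is returned before the timeout. Raising either limit, or the timeout, moves that ceiling.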

Changed in kolla:
importance: Undecided → Critical
status: New → Incomplete
milestone: none → newton-1
status: Incomplete → Confirmed
Steven Dake (sdake) wrote :

The Neutron issue here is that when that error happens, Neutron creates more IPs and loads them into the VMs. This is incorrect behavior; instead, Neutron should fail gracefully on overload.

Changed in neutron:
status: Incomplete → Confirmed
Steven Dake (sdake)
Changed in kolla:
assignee: nobody → Steven Dake (sdake)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (master)

Fix proposed to branch: master
Review: https://review.openstack.org/302415

Changed in kolla:
status: Confirmed → In Progress
Narender (narender-soorineeda) wrote :

How can we recreate the issue?

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (master)

Reviewed: https://review.openstack.org/302415
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=316eee3680d0f934b8491367c6007ea16ec41906
Submitter: Jenkins
Branch: master

commit 316eee3680d0f934b8491367c6007ea16ec41906
Author: Steven Dake <email address hidden>
Date: Wed Apr 6 15:24:12 2016 -0400

    Increase max pool size so conductor doesn't implode

    When horizon is used to launch 2000 VMs, nova-conductor is very
    busy making database connections. All 55 database connections are
    in use, resulting in an inability to garbage collect database
    connections. Instead raise the max pool to 50 which will allow
    50 concurrent database connections and the max overflow to 1000
    which permits the database connections to finish the job at
    large nodecount scales.

    Closes-Bug: #1565105

    Change-Id: I26dc2f7fda8760197888a1d61fbc45dfada2dd06
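
The arithmetic behind the commit message can be sketched as follows. The option names `max_pool_size` and `max_overflow` in a `[database]` section are an assumption based on the oslo.db configuration options; the thread itself does not show the changed file. `max_connections` and `render_database_section` are hypothetical helpers for illustration only.

```python
import configparser
import io

# Pool settings taken from the error message (before) and the commit message (after).
BEFORE = {"max_pool_size": 5, "max_overflow": 50}
AFTER = {"max_pool_size": 50, "max_overflow": 1000}


def max_connections(settings):
    """Upper bound on simultaneous database connections for one pool."""
    return settings["max_pool_size"] + settings["max_overflow"]


def render_database_section(settings):
    """Render the settings as an INI [database] section (assumed option names)."""
    parser = configparser.ConfigParser()
    parser["database"] = {key: str(value) for key, value in settings.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


print(max_connections(BEFORE))  # 55
print(max_connections(AFTER))   # 1050
print(render_database_section(AFTER))
```

The old ceiling of 55 matches the "All 55 database connections" in the commit message; the new settings would raise the ceiling to 1050 per pool under the same reading.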

Changed in kolla:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/303677

Steven Dake (sdake) wrote :

Narender,

To recreate the issue, deploy OpenStack with the default settings on 64 nodes (13 TB RAM, 2600 cores) and launch 2000 VMs via Horizon. The problem happens when the VMs are launched via Horizon; it does not happen when they are launched via the nova CLI.

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (stable/mitaka)

Reviewed: https://review.openstack.org/303677
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=4c4e095b6787eed75a994359e395297bec1637d4
Submitter: Jenkins
Branch: stable/mitaka

commit 4c4e095b6787eed75a994359e395297bec1637d4
Author: Steven Dake <email address hidden>
Date: Wed Apr 6 15:24:12 2016 -0400

    Increase max pool size so conductor doesn't implode

    When horizon is used to launch 2000 VMs, nova-conductor is very
    busy making database connections. All 55 database connections are
    in use, resulting in an inability to garbage collect database
    connections. Instead raise the max pool to 50 which will allow
    50 concurrent database connections and the max overflow to 1000
    which permits the database connections to finish the job at
    large nodecount scales.

    Closes-Bug: #1565105

    Change-Id: I26dc2f7fda8760197888a1d61fbc45dfada2dd06
    (cherry picked from commit 316eee3680d0f934b8491367c6007ea16ec41906)

Davanum Srinivas (DIMS) (dims-v) wrote : Fix included in openstack/kolla 2.0.0.0rc4

This issue was fixed in the openstack/kolla 2.0.0.0rc4 release candidate.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/kolla 2.0.0

This issue was fixed in the openstack/kolla 2.0.0 release.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/kolla 1.1.0

This issue was fixed in the openstack/kolla 1.1.0 release.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/kolla 3.0.0.0b1

This issue was fixed in the openstack/kolla 3.0.0.0b1 development milestone.

Armando Migliaccio (armando-migliaccio) wrote :

This bug is > 180 days without activity. We are unsetting assignee and milestone and setting status to Incomplete in order to allow its expiry in 60 days.

If the bug is still valid, then update the bug status.

Changed in neutron:
status: Confirmed → Incomplete
no longer affects: neutron