Heavy cpu load seen when keepalived state change server gets wsgi_default_pool_size requests at same time

Bug #1581580 reported by venkata anil
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Medium
venkata anil

Bug Description

With wsgi_default_pool_size=100 [1], if the keepalived state change server receives 100 requests at the same time, processing them puts a heavy load on the CPU and makes the network node unresponsive. For each request, the keepalived state change server spawns a new metadata proxy process (i.e. neutron-ns-metadata-proxy). During the heavy CPU load, the "top" command shows many metadata proxy processes in the "running" state at the same time (see the attachment).

With wsgi_default_pool_size=8, the state change server spawns 8 metadata proxy processes at a time (the "top" command shows 8 metadata proxy processes in the "running" state at a time), the CPU load is lower, and metadata proxy processes are spawned for all requests (for example, 100) without failures, since backlog=KEEPALIVED_STATE_CHANGE_SERVER_BACKLOG, i.e. 4096.
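
The effect of the pool size can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses OS threads from the Python standard library instead of eventlet greenlets, a short sleep stands in for spawning a metadata proxy, and the name run_burst is hypothetical. It shows that a smaller pool caps how many handlers run at once while every queued request still completes.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_burst(pool_size, num_requests=100, work_seconds=0.01):
    """Handle a burst of requests with at most pool_size running at once.

    Returns (completed, peak), where peak is the highest number of
    handlers observed running simultaneously.
    """
    lock = threading.Lock()
    state = {"running": 0, "peak": 0, "completed": 0}

    def handle(_request_id):
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(work_seconds)  # stand-in for spawning a metadata proxy
        with lock:
            state["running"] -= 1
            state["completed"] += 1

    # Requests beyond pool_size wait in the executor's queue, which plays
    # the role of the socket backlog described in the report.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        list(pool.map(handle, range(num_requests)))
    return state["completed"], state["peak"]


if __name__ == "__main__":
    for size in (8, 100):
        completed, peak = run_burst(size)
        print(f"pool_size={size}: completed={completed}, peak={peak}")
```

With a pool of 8 the peak never exceeds 8, yet all 100 requests finish; with a pool of 100 the whole burst may run at once, which corresponds to the situation that overloaded the node.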

We can keep wsgi_default_pool_size=100 for the neutron API server and use a separate configuration option for UnixDomainWSGIServer (for example, CONF.unix_domain_wsgi_default_pool_size).

neutron/agent/linux/utils.py
class UnixDomainWSGIServer(wsgi.Server):

    def _run(self, application, socket):
        """Start a WSGI service in a new green thread."""
        logger = logging.getLogger('eventlet.wsgi.server')
        eventlet.wsgi.server(socket,
                             application,
                             max_size=CONF.unix_domain_wsgi_default_pool_size,
                             protocol=UnixDomainHttpProtocol,
                             log=logger)

[1] https://github.com/openstack/neutron/commit/9d573387f1e33ce85269d3ed9be501717eed4807

Revision history for this message
venkata anil (anil-venkata) wrote :
Changed in neutron:
assignee: nobody → venkata anil (anil-venkata)
Assaf Muller (amuller)
tags: added: loadimpact
Changed in neutron:
status: New → Confirmed
importance: Undecided → Medium
description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/317616

Changed in neutron:
status: Confirmed → In Progress
Henry Gessau (gessau)
summary: Heavy cpu load seen when keepalived state change server gets
- wsi_default_pool_size requests at same time
+ wsgi_default_pool_size requests at same time
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/317616
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=70ea188f5d87c45fb60ace8b8405274e5f6dd489
Submitter: Jenkins
Branch: master

commit 70ea188f5d87c45fb60ace8b8405274e5f6dd489
Author: venkata anil <email address hidden>
Date: Tue May 17 16:30:13 2016 +0000

    New option for num_threads for state change server

    Currently max number of client connections(i.e greenlets spawned at
    a time) opened at any time by the WSGI server is set to 100 with
    wsgi_default_pool_size[1].

    This configuration may be fine for neutron api server. But with
    wsgi_default_pool_size(=100) requests, state change server
    is creating heavy cpu load on agent.
    So this server(which run on agents) need lesser value i.e
    can be configured to half the number of cpu on agent

    We use "ha_keepalived_state_change_server_threads" config option
    to configure number of threads in state change server instead of
    wsgi_default_pool_size.

    [1] https://review.openstack.org/#/c/278007/

    DocImpact: Add new config option -
    ha_keepalived_state_change_server_threads, to configure number
    of threads in state change server.

    Closes-Bug: #1581580
    Change-Id: I822ea3844792a7731fd24419b7e90e5aef141993

Changed in neutron:
status: In Progress → Fix Released
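
The commit message above suggests sizing the state change server at about half the number of CPUs on the agent. A minimal sketch of such a default follows; the function name default_state_change_threads and the exact rounding rule are assumptions made for illustration, not taken from the merged patch.

```python
import os


def default_state_change_threads(num_cpus=None):
    """Return roughly half the CPU count, but never less than 1.

    The (1 + n) // 2 rounding is an illustrative assumption.
    """
    if num_cpus is None:
        # os.cpu_count() can return None when the count is undetermined.
        num_cpus = os.cpu_count() or 1
    return max(1, (1 + num_cpus) // 2)


if __name__ == "__main__":
    for cpus in (1, 2, 7, 8):
        print(cpus, "CPUs ->", default_state_change_threads(cpus), "threads")
```

An operator who wants a different value would set ha_keepalived_state_change_server_threads explicitly in the agent configuration instead of relying on a computed default.
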
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/379578

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/379580

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/379582

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/379582
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4387d4aedfa2ccf129eb957ee057ad9c9edac82d
Submitter: Jenkins
Branch: stable/liberty

commit 4387d4aedfa2ccf129eb957ee057ad9c9edac82d
Author: venkata anil <email address hidden>
Date: Tue May 17 16:30:13 2016 +0000

    New option for num_threads for state change server

    Closes-Bug: #1581580
    Change-Id: I822ea3844792a7731fd24419b7e90e5aef141993
    (cherry picked from commit 70ea188f5d87c45fb60ace8b8405274e5f6dd489)

tags: added: in-stable-liberty
tags: added: in-stable-mitaka
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/mitaka)

Reviewed: https://review.openstack.org/379580
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=039ab16f4ea9308792d4168adb1d501310145346
Submitter: Jenkins
Branch: stable/mitaka

commit 039ab16f4ea9308792d4168adb1d501310145346
Author: venkata anil <email address hidden>
Date: Tue May 17 16:30:13 2016 +0000

    New option for num_threads for state change server

    Closes-Bug: #1581580
    Change-Id: I822ea3844792a7731fd24419b7e90e5aef141993
    (cherry picked from commit 70ea188f5d87c45fb60ace8b8405274e5f6dd489)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/newton)

Reviewed: https://review.openstack.org/379578
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=5c1516e1fb07bb3e026a3569396567c3907b31c6
Submitter: Jenkins
Branch: stable/newton

commit 5c1516e1fb07bb3e026a3569396567c3907b31c6
Author: venkata anil <email address hidden>
Date: Tue May 17 16:30:13 2016 +0000

    New option for num_threads for state change server

    Closes-Bug: #1581580
    Change-Id: I822ea3844792a7731fd24419b7e90e5aef141993
    (cherry picked from commit 70ea188f5d87c45fb60ace8b8405274e5f6dd489)

tags: added: in-stable-newton
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 7.2.0

This issue was fixed in the openstack/neutron 7.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 8.3.0

This issue was fixed in the openstack/neutron 8.3.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/383713
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=68caeced9268fbf74d0c6bf5e35509f15037cc36
Submitter: Jenkins
Branch: master

commit 68caeced9268fbf74d0c6bf5e35509f15037cc36
Author: venkata anil <email address hidden>
Date: Fri Oct 7 12:40:49 2016 +0000

    Add sample_default for state change server config

    Adding sample_default to ha_keepalived_state_change_server_threads
    will make sample config file contents consistent.
    This patch also adds a missing space between help text sentences.

    Related-Bug: #1581580
    Change-Id: Ieae84c69a397465bed595b2970d1d51ce93fa3b0

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 9.1.0

This issue was fixed in the openstack/neutron 9.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 10.0.0.0b1

This issue was fixed in the openstack/neutron 10.0.0.0b1 development milestone.
