Nova and Cinder HA fails due to Glance API servers configuration

Bug #1643509 reported by Vladislav Belogrudov
Affects        Status        Importance  Assigned to           Milestone
kolla-ansible  Fix Released  Undecided   Vladislav Belogrudov
Ocata          Fix Released  Undecided   Unassigned

Bug Description

Nova and Cinder are configured with a list of Glance API servers. This list does not provide proper high availability, because a server is picked at random (the list is shuffled) with no memory of which server failed last time. For example, with three controllers of which the first has failed, Nova and Cinder can still try to connect to the first controller. A proper solution is to use a VIP, which routes to a live server from the start. As a workaround, one could significantly increase the maximum number of retries in the configuration files, in the hope that the random choice eventually lands on a healthy Glance server.
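
To put rough numbers on the retry workaround, here is a minimal sketch under a simplifying assumption: each connection attempt picks one server uniformly at random, independently of earlier attempts. This is a model for illustration only, not the actual Nova or Cinder client code (which may shuffle the list and cycle through it instead), and the function names are hypothetical.

```python
import random


def all_attempts_fail_probability(total_servers, failed_servers, attempts):
    """Chance that every attempt lands on a failed server.

    Assumes independent, uniform random picks (a simplified model).
    """
    return (failed_servers / total_servers) ** attempts


def simulate(total_servers, failed_servers, attempts, trials=100_000, seed=0):
    """Estimate the same probability by Monte Carlo simulation."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        # Servers with index < failed_servers are treated as down.
        if all(rng.randrange(total_servers) < failed_servers
               for _ in range(attempts)):
            failures += 1
    return failures / trials


if __name__ == "__main__":
    # Three controllers, one of them down, varying number of attempts.
    for attempts in (1, 2, 5):
        exact = all_attempts_fail_probability(3, 1, attempts)
        print(f"attempts={attempts}: request fails with probability {exact:.4f}")
```

Under this model the failure probability shrinks with more attempts but never reaches zero, and every unlucky pick still costs a connection timeout. A VIP avoids the problem by not routing to the dead server in the first place.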

Changed in kolla-ansible:
assignee: nobody → Vladislav Belogrudov (vlad-belogrudov)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (master)

Fix proposed to branch: master
Review: https://review.openstack.org/400187

Changed in kolla-ansible:
status: New → In Progress
Vladislav Belogrudov (vlad-belogrudov) wrote :

The list of API servers in the configs is a misleading feature: everyone thinks it provides HA, but it is pure load balancing with random choice.

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (master)

Reviewed: https://review.openstack.org/400187
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=ac3e4cf9c9b84e43c39cc928ddd1f07687ad2b72
Submitter: Jenkins
Branch: master

commit ac3e4cf9c9b84e43c39cc928ddd1f07687ad2b72
Author: Vladislav Belogrudov <email address hidden>
Date: Mon Nov 21 13:55:23 2016 +0300

    Use kolla_internal_vip_address for glance_api servers

    Nova and Cinder used a list of glance api servers - this list
    does not provide a proper high availability because the servers
    are connected at random without recalling who is failed last
    time. E.g. out of three controllers with a failed first one nova
    and cinder can try connection to the first controller because of
    random / shuffled choice of the glance server. A proper solution
    is to use VIP that connects to alive server from the beginning.
    Also as workaround one could significantly increase max number
    of retries in configuration files in hope that a random function
    will choose a healthy glance server sometime - not a good choice.

    Change-Id: Ifaf8ffe3697ec88a6da4c2b43c83975b63dc2e8c
    Closes-Bug: #1643509
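
The difference between the two configuration styles described in the commit message can be sketched as follows. This is an illustration only, not the kolla-ansible template: the helper names, host names and VIP address are hypothetical, and port 9292 with plain HTTP is assumed for the Glance API endpoint.

```python
def glance_api_servers_list(controller_hosts, port=9292, scheme="http"):
    """Old style: one URL per controller, joined into a comma-separated list.

    The client picks from this list itself, with no health checking.
    """
    return ",".join(f"{scheme}://{host}:{port}" for host in controller_hosts)


def glance_api_servers_vip(vip_address, port=9292, scheme="http"):
    """New style: a single URL pointing at the internal VIP.

    Whatever sits behind the VIP decides which live backend gets the request.
    """
    return f"{scheme}://{vip_address}:{port}"


if __name__ == "__main__":
    # Hypothetical controller names and VIP, for illustration only.
    print(glance_api_servers_list(["ctl1", "ctl2", "ctl3"]))
    print(glance_api_servers_vip("10.0.0.250"))
```

With the VIP form, Nova and Cinder see a single endpoint, so their own random selection no longer affects which Glance server handles a request.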

Changed in kolla-ansible:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 5.0.0.0b2

This issue was fixed in the openstack/kolla-ansible 5.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/ocata)

Reviewed: https://review.openstack.org/494110
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=009b94feae583265473210e8eb2bfddf8e90ba1a
Submitter: Zuul
Branch: stable/ocata

commit 009b94feae583265473210e8eb2bfddf8e90ba1a
Author: Vladislav Belogrudov <email address hidden>
Date: Mon Nov 21 13:55:23 2016 +0300

    Use kolla_internal_vip_address for glance_api servers

    Nova and Cinder used a list of glance api servers - this list
    does not provide a proper high availability because the servers
    are connected at random without recalling who is failed last
    time. E.g. out of three controllers with a failed first one nova
    and cinder can try connection to the first controller because of
    random / shuffled choice of the glance server. A proper solution
    is to use VIP that connects to alive server from the beginning.
    Also as workaround one could significantly increase max number
    of retries in configuration files in hope that a random function
    will choose a healthy glance server sometime - not a good choice.

    Change-Id: Ifaf8ffe3697ec88a6da4c2b43c83975b63dc2e8c
    Closes-Bug: #1643509
    (cherry picked from commit ac3e4cf9c9b84e43c39cc928ddd1f07687ad2b72)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 4.0.3

This issue was fixed in the openstack/kolla-ansible 4.0.3 release.
