keepalived track script fails sometimes

Bug #2025219 reported by Michal Nasiadka
This bug affects 1 person
Affects        Status        Importance  Assigned to      Milestone
kolla-ansible  Fix Released  Medium      Michal Nasiadka
Antelope       Fix Released  Medium      Michal Nasiadka
Bobcat         Fix Released  Medium      Michal Nasiadka
Yoga           New           Medium      Unassigned
Zed            Fix Released  Medium      Michal Nasiadka

Bug Description

Observed mainly in CI on single-node deployments.

On some occasions (sometimes rarely, sometimes more often; it surfaces most often in upgrade jobs) the standard keepalived track script, which checks HAProxy state via a socket, times out.
When that happens, keepalived treats the check as failed and goes into BACKUP state for a few seconds, which breaks API connectivity.

In a multinode deployment there is always another node to fail over to, but in a single-node deployment there is not.
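
For illustration only, here is a minimal Python sketch of the kind of check described above: it asks HAProxy for "show info" over a UNIX admin socket and reports failure if no answer arrives within a timeout. It is not the script shipped by kolla-ansible. The function names, the default timeout, and whatever socket path is passed in are assumptions made for this example.

```python
import socket


def haproxy_responds(socket_path: str, timeout: float = 1.0) -> bool:
    """Return True if something answers "show info" on the socket in time.

    socket_path and timeout are illustrative parameters, not values taken
    from kolla-ansible.
    """
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(socket_path)
            sock.sendall(b"show info\n")
            # Any non-empty reply counts as "alive" in this sketch.
            return len(sock.recv(4096)) > 0
    except OSError:
        # Covers timeouts, a missing socket file, and refused connections.
        return False


def check_exit_code(socket_path: str, timeout: float = 1.0) -> int:
    """Map the check to the 0 (success) / 1 (failure) exit-code convention."""
    return 0 if haproxy_responds(socket_path, timeout) else 1
```

A slow reply and a dead HAProxy are indistinguishable to such a check: both produce a non-zero result. On a single node, that is enough for keepalived to give up the VIP even though HAProxy may only have been briefly busy.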

Changed in kolla-ansible:
status: New → Confirmed
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/887069
Committed: https://opendev.org/openstack/kolla-ansible/commit/8d5356268688645efbd09517059cbc1189e3fea7
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 8d5356268688645efbd09517059cbc1189e3fea7
Author: Michal Nasiadka <email address hidden>
Date: Tue Jun 27 09:42:31 2023 +0200

    loadbalancer: Add option to not define track script

    We've seen issues in CI when keepalived haproxy check script returns
    an error and keepalived is switching to backup and then again to primary
    on a single node environment.

    Closes-Bug: #2025219

    Change-Id: Iba62e76b3cf83f3ade6df81288d2d77129ffc725
    (cherry picked from commit a0e614ee10937eb0930d1c5481293a6d2dc8d915)
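
As a rough sketch of what "not defining a track script" could mean for the generated keepalived configuration, the Python below renders a `vrrp_instance` with or without the `vrrp_script` / `track_script` blocks. The flag name `track_script_enabled`, the instance name, interface, script path, timings, and VIP are all hypothetical placeholders chosen for this example. They are not taken from the commit, and kolla-ansible's real template may differ.

```python
def render_keepalived_conf(track_script_enabled: bool) -> str:
    """Render a toy keepalived.conf; every name and value is a placeholder."""
    lines = []
    if track_script_enabled:
        # Health check definition that keepalived runs periodically.
        lines += [
            "vrrp_script check_alive {",
            '    script "/check_alive.sh"',
            "    interval 2",
            "    fall 2",
            "    rise 10",
            "}",
            "",
        ]
    lines += [
        "vrrp_instance example_vip {",
        "    state BACKUP",
        "    interface eth0",
        "    virtual_router_id 51",
        "    priority 1",
        "    advert_int 1",
        "    virtual_ipaddress {",
        "        192.0.2.10 dev eth0",
        "    }",
    ]
    if track_script_enabled:
        # Ties the instance state to the result of the check above.
        lines += [
            "    track_script {",
            "        check_alive",
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"
```

With the flag off, the instance has no check whose failure could push it into a fault or backup state. That matches the single-node case in this report, where there is no peer to fail over to.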

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/887170
Committed: https://opendev.org/openstack/kolla-ansible/commit/993b854bb5bba7a2a2e50615f2d437d7c81d8001
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 993b854bb5bba7a2a2e50615f2d437d7c81d8001
Author: Michal Nasiadka <email address hidden>
Date: Tue Jun 27 09:42:31 2023 +0200

    loadbalancer: Add option to not define track script

    We've seen issues in CI when keepalived haproxy check script returns
    an error and keepalived is switching to backup and then again to primary
    on a single node environment.

    Closes-Bug: #2025219

    Change-Id: Iba62e76b3cf83f3ade6df81288d2d77129ffc725
    (cherry picked from commit a0e614ee10937eb0930d1c5481293a6d2dc8d915)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 16.2.0

This issue was fixed in the openstack/kolla-ansible 16.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 15.3.0

This issue was fixed in the openstack/kolla-ansible 15.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 17.0.0.0rc1

This issue was fixed in the openstack/kolla-ansible 17.0.0.0rc1 release candidate.
