fix stop_grace_period for octavia worker container

Bug #1855684 reported by Gregory Thiemonge
This bug affects 1 person
Affects   Status         Importance   Assigned to         Milestone
tripleo   Fix Released   Undecided    Gregory Thiemonge

Bug Description

The default stop timeout for TripleO containers is 10 seconds. This can break the octavia-worker service, which can run long "taskflow" flows when building a load balancer.
If an admin restarts the octavia-worker container while a load balancer is being created, octavia-worker is shut down non-gracefully, and Octavia leaks resources (VM instances and ports) that cannot be removed without manually editing the database.
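
The failure mode can be reproduced in miniature. The sketch below is not Octavia or TripleO code: it uses a hypothetical stand-in child process that needs some time to finish its work after SIGTERM, and a stop helper that mimics the usual container stop sequence (SIGTERM, wait for the stop timeout, then SIGKILL). The names `CHILD` and `stop_with_grace` are made up for illustration, the timings are scaled down to fractions of a second, and it assumes a POSIX system.

```python
import subprocess
import sys

# Hypothetical stand-in for a worker: after SIGTERM it needs `cleanup`
# seconds to finish its in-flight work before exiting cleanly.
CHILD = r"""
import signal, sys, time
cleanup = float(sys.argv[1])
def on_term(signum, frame):
    time.sleep(cleanup)   # finish the in-flight "flow"
    sys.exit(0)
signal.signal(signal.SIGTERM, on_term)
print("ready", flush=True)
while True:
    time.sleep(0.05)
"""

def stop_with_grace(cleanup_seconds, grace_seconds):
    """Send SIGTERM, wait up to grace_seconds, then SIGKILL if still running."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD, str(cleanup_seconds)],
        stdout=subprocess.PIPE,
        text=True,
    )
    proc.stdout.readline()        # block until the child has installed its handler
    proc.terminate()              # SIGTERM: ask for a graceful shutdown
    try:
        proc.wait(timeout=grace_seconds)
        outcome = "graceful"
    except subprocess.TimeoutExpired:
        proc.kill()               # SIGKILL: the in-flight work is lost
        proc.wait()
        outcome = "killed"
    proc.stdout.close()
    return outcome
```

With a cleanup time longer than the grace period, for example `stop_with_grace(0.5, 0.1)`, the child is killed before it finishes, which corresponds to a flow outliving the 10 second default.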

To fix this issue, the container stop timeout should be set to the same value as octavia-worker's graceful_shutdown_timeout (used by cotyledon to manage the service), which was set to 300 seconds in https://review.opendev.org/#/c/684201/.
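
As a rough illustration of the rule stated above, the following sketch checks that a container stop timeout is at least as long as the service's graceful_shutdown_timeout. It is only a sketch under assumptions: the helper name `grace_period_is_sufficient` is hypothetical, and reading the option from the `[DEFAULT]` section of an INI-style file is an assumption about where the option lives, not something this report states.

```python
import configparser

def grace_period_is_sufficient(conf_text, stop_grace_period, section="DEFAULT"):
    """Return True if the container stop timeout covers the service's
    graceful_shutdown_timeout, both expressed in seconds.

    The section name is an assumption; adjust it to the real config layout.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    graceful = parser.getint(section, "graceful_shutdown_timeout")
    return stop_grace_period >= graceful

# Example input mirroring the values discussed in this report.
EXAMPLE_CONF = "[DEFAULT]\ngraceful_shutdown_timeout = 300\n"
```

Checked against `EXAMPLE_CONF`, a stop timeout of 10 seconds fails the check and 300 seconds passes it.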

Changed in tripleo:
assignee: nobody → Gregory Thiemonge (gthiemonge)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.opendev.org/698014

Changed in tripleo:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/698014
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=c595835776eccbc2f59b69574f0aa5c3e87c9bd5
Submitter: Zuul
Branch: master

commit c595835776eccbc2f59b69574f0aa5c3e87c9bd5
Author: Gregory Thiemonge <email address hidden>
Date: Mon Dec 9 14:43:02 2019 +0100

    Set octavia services' stop grace period to 300sec

    Octavia worker, house-keeping and health-monitor serivices may use some
    long taskflow's flows to handle load balancers and amphorae (launch VMs,
    etc...). Those flows should not be interrupted when restarting those
    services (i.e when updating an overcloud, or restarting services because
    of certificates rotation), it might cause resource leaks that cannot be
    fixed by an admin.

    As default container stop timeout is defined to 10 seconds, this timeout
    value needs to be increased for octavia services (except octavia api) to
    ensure a graceful shutdown.
    This new value has been set to 300 seconds according to the octavia
    worker default configuration introduced in
    https://review.opendev.org/#/c/684201/

    Closes-Bug: #1855684
    Change-Id: I8911a79328769c910d03168cfa5a421d0dd0f9b6

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/703937

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/703938

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/703942

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/stein)

Reviewed: https://review.opendev.org/703938
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=011935828040360940130d1402704f3bb68485e9
Submitter: Zuul
Branch: stable/stein

commit 011935828040360940130d1402704f3bb68485e9
Author: Gregory Thiemonge <email address hidden>
Date: Mon Dec 9 14:43:02 2019 +0100

    Set octavia services' stop grace period to 300sec

    Octavia worker, house-keeping and health-monitor serivices may use some
    long taskflow's flows to handle load balancers and amphorae (launch VMs,
    etc...). Those flows should not be interrupted when restarting those
    services (i.e when updating an overcloud, or restarting services because
    of certificates rotation), it might cause resource leaks that cannot be
    fixed by an admin.

    As default container stop timeout is defined to 10 seconds, this timeout
    value needs to be increased for octavia services (except octavia api) to
    ensure a graceful shutdown.
    This new value has been set to 300 seconds according to the octavia
    worker default configuration introduced in
    https://review.opendev.org/#/c/684201/

    Closes-Bug: #1855684
    Change-Id: I8911a79328769c910d03168cfa5a421d0dd0f9b6
    (cherry picked from commit c595835776eccbc2f59b69574f0aa5c3e87c9bd5)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/train)

Reviewed: https://review.opendev.org/703937
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=36f9cc78c88b377092cebca5b11f451af35f4f10
Submitter: Zuul
Branch: stable/train

commit 36f9cc78c88b377092cebca5b11f451af35f4f10
Author: Gregory Thiemonge <email address hidden>
Date: Mon Dec 9 14:43:02 2019 +0100

    Set octavia services' stop grace period to 300sec

    Octavia worker, house-keeping and health-monitor serivices may use some
    long taskflow's flows to handle load balancers and amphorae (launch VMs,
    etc...). Those flows should not be interrupted when restarting those
    services (i.e when updating an overcloud, or restarting services because
    of certificates rotation), it might cause resource leaks that cannot be
    fixed by an admin.

    As default container stop timeout is defined to 10 seconds, this timeout
    value needs to be increased for octavia services (except octavia api) to
    ensure a graceful shutdown.
    This new value has been set to 300 seconds according to the octavia
    worker default configuration introduced in
    https://review.opendev.org/#/c/684201/

    Closes-Bug: #1855684
    Change-Id: I8911a79328769c910d03168cfa5a421d0dd0f9b6
    (cherry picked from commit c595835776eccbc2f59b69574f0aa5c3e87c9bd5)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/rocky)

Reviewed: https://review.opendev.org/703942
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=07d02af94548ff35a91c62ab9d206f761a1d7b37
Submitter: Zuul
Branch: stable/rocky

commit 07d02af94548ff35a91c62ab9d206f761a1d7b37
Author: Gregory Thiemonge <email address hidden>
Date: Mon Dec 9 14:43:02 2019 +0100

    Set octavia services' stop grace period to 300sec

    Octavia worker, house-keeping and health-monitor serivices may use some
    long taskflow's flows to handle load balancers and amphorae (launch VMs,
    etc...). Those flows should not be interrupted when restarting those
    services (i.e when updating an overcloud, or restarting services because
    of certificates rotation), it might cause resource leaks that cannot be
    fixed by an admin.

    As default container stop timeout is defined to 10 seconds, this timeout
    value needs to be increased for octavia services (except octavia api) to
    ensure a graceful shutdown.
    This new value has been set to 300 seconds according to the octavia
    worker default configuration introduced in
    https://review.opendev.org/#/c/684201/

    Closes-Bug: #1855684
    Change-Id: I8911a79328769c910d03168cfa5a421d0dd0f9b6
    (cherry picked from commit c595835776eccbc2f59b69574f0aa5c3e87c9bd5)
    (cherry picked from commit 011935828040360940130d1402704f3bb68485e9)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 12.1.0

This issue was fixed in the openstack/tripleo-heat-templates 12.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 11.4.0

This issue was fixed in the openstack/tripleo-heat-templates 11.4.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates rocky-eol

This issue was fixed in the openstack/tripleo-heat-templates rocky-eol release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates stein-eol

This issue was fixed in the openstack/tripleo-heat-templates stein-eol release.
