paunch doesn't error when image pull fails on detached containers

Bug #1733941 reported by Steve Baker
This bug affects 2 people
Affects   Status         Importance   Assigned to   Milestone
paunch    Fix Released   High         Steve Baker
tripleo   Fix Released   High         Steve Baker

Bug Description

(also discussed at https://bugzilla.redhat.com/show_bug.cgi?id=1516275 )

Currently, when the container image references are misconfigured or the images can't be pulled, the first evidence of this in an overcloud deploy is db_sync tasks timing out because mariadb isn't running at all.

There is an enhancement to paunch which would make this failure a lot less obscure. Currently detached containers are launched by doing a "docker run" then continuing with the next tasks. If the image can't be pulled (wrong image ref, network issue) then the container will eventually fail to start.

If paunch checked whether the image exists locally and, when it doesn't, did an explicit docker pull before the docker run, it could fail early with a clear message.
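
To illustrate the proposed check, here is a minimal sketch in Python. It is not the code from the paunch change; the names ensure_image, ImagePullError and the injectable run parameter are hypothetical and exist only for this example. It assumes the docker CLI is on the PATH and relies on "docker inspect --type image" exiting non-zero when the image is not present locally.

```python
import subprocess


class ImagePullError(Exception):
    """Raised when a required image is missing and cannot be pulled."""


def _run(cmd):
    """Run a command and return (returncode, stdout, stderr) as text."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, universal_newlines=True)
    return proc.returncode, proc.stdout, proc.stderr


def ensure_image(image, run=_run):
    """Make sure `image` is available locally before any `docker run`.

    Returns False if the image was already present, True if it had to
    be pulled. Raises ImagePullError if the pull fails, so the caller
    can stop early instead of launching a container that never starts.
    `run` is a hypothetical hook so the logic can be tried without docker.
    """
    # Step 1: is the image already in the local store?
    rc, _, _ = run(['docker', 'inspect', '--type', 'image', image])
    if rc == 0:
        return False

    # Step 2: image is missing, so pull it explicitly and check the result.
    rc, _, err = run(['docker', 'pull', image])
    if rc != 0:
        raise ImagePullError(
            'Pulling missing image %s failed: %s' % (image, err.strip()))
    return True
```

Calling ensure_image() for every image referenced in the container config before starting any detached container would turn a bad image reference or registry outage into an immediate, named error rather than a later db_sync timeout.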

This won't catch the cases where the container isn't starting for some other reason, because paunch is not a service manager. For this we would need specific validator resources in tripleo-heat-templates which (for example) assert that mariadb is running and responding just before the first db_sync task runs.
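
As a rough idea of what such a validator might check, the sketch below waits until a TCP port accepts connections. This is an assumption for illustration only: wait_for_port is a hypothetical helper, not an existing tripleo-heat-templates resource, and a successful TCP connect only shows that something is listening, which is weaker than confirming that mariadb answers queries.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=2.0):
    """Return True once host:port accepts a TCP connection.

    Retries every `interval` seconds and returns False if no connection
    succeeds before `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A completed connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A validator step placed before the first db_sync task could call something like wait_for_port('127.0.0.1', 3306) (3306 being the usual mariadb port, assumed here) and fail the deploy with an explicit "mariadb is not reachable" message when it returns False.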

Changed in paunch:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Steve Baker (steve-stevebaker)
Changed in tripleo:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Steve Baker (steve-stevebaker)
milestone: none → queens-2
description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to paunch (master)

Fix proposed to branch: master
Review: https://review.openstack.org/522665

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to paunch (master)

Reviewed: https://review.openstack.org/522665
Committed: https://git.openstack.org/cgit/openstack/paunch/commit/?id=9b2ae8bd9665fce38e38aa99dbc4716f20dc2146
Submitter: Zuul
Branch: master

commit 9b2ae8bd9665fce38e38aa99dbc4716f20dc2146
Author: Steve Baker <email address hidden>
Date: Fri Nov 24 10:38:22 2017 +1300

    Explicitly pull images before docker run

    By pulling any missing required images, a pull failure will result in
    an early failure with a clear cause. For detached containers, this is
    improved error handling than having a container which fails
    to start.

    Change-Id: Ifa0257cbeadd3da9be756edf6a729c90141c238f
    Closes-Bug: #1733941

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to paunch (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/525322

OpenStack Infra (hudson-openstack) wrote : Fix merged to paunch (stable/pike)

Reviewed: https://review.openstack.org/525322
Committed: https://git.openstack.org/cgit/openstack/paunch/commit/?id=79ec2f2a8b9f6d17c0a2af5dbdee96d3f0eaeae4
Submitter: Zuul
Branch: stable/pike

commit 79ec2f2a8b9f6d17c0a2af5dbdee96d3f0eaeae4
Author: Steve Baker <email address hidden>
Date: Fri Nov 24 10:38:22 2017 +1300

    Explicitly pull images before docker run

    By pulling any missing required images, a pull failure will result in
    an early failure with a clear cause. For detached containers, this is
    improved error handling than having a container which fails
    to start.

    Change-Id: Ifa0257cbeadd3da9be756edf6a729c90141c238f
    Closes-Bug: #1733941
    (cherry picked from commit 9b2ae8bd9665fce38e38aa99dbc4716f20dc2146)

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/paunch 1.5.3

This issue was fixed in the openstack/paunch 1.5.3 release.

Changed in paunch:
status: Triaged → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/paunch 2.2.0

This issue was fixed in the openstack/paunch 2.2.0 release.
