paunch doesn't error when image pull fails on detached containers
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
paunch | Fix Released | High | Steve Baker | |
tripleo | Fix Released | High | Steve Baker | |
Bug Description
(also discussed at https:/
Currently, when the container image references are misconfigured or the images can't be pulled, the first evidence of this in an overcloud deploy is that the db_sync tasks time out because mariadb isn't running at all.
There is an enhancement to paunch which would make this failure a lot less obscure. Currently, detached containers are launched by doing a "docker run" and then continuing with the next tasks. If the image can't be pulled (wrong image ref, network issue), the container never starts, but paunch does not report an error.
If paunch checked whether the image exists locally, and did a docker pull when it does not, it could fail early with a clear message.
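As a rough illustration of the proposed check (a sketch only, not the change that was merged into paunch), something like the following could run before each detached "docker run". The names `ensure_image`, `run_cmd` and `ImagePullError` are hypothetical and do not come from paunch; the only docker commands assumed are `docker inspect --type image` and `docker pull`.

```python
import subprocess


class ImagePullError(Exception):
    """Hypothetical error: image is neither present locally nor pullable."""


def run_cmd(cmd):
    """Run a command and return (returncode, combined stdout/stderr)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT,
                          universal_newlines=True)
    return proc.returncode, proc.stdout


def ensure_image(image, run=run_cmd):
    """Return 'present' or 'pulled'; raise ImagePullError if the pull fails.

    `run` is injectable so the logic can be exercised without docker.
    """
    # `docker inspect --type image` exits non-zero when the image is absent.
    rc, _ = run(['docker', 'inspect', '--type', 'image', image])
    if rc == 0:
        return 'present'
    # Not available locally: pull explicitly so a failure surfaces here,
    # rather than later inside a detached `docker run`.
    rc, out = run(['docker', 'pull', image])
    if rc != 0:
        raise ImagePullError(
            'Pulling image %s failed: %s' % (image, out.strip()))
    return 'pulled'
```

Calling `ensure_image(image)` ahead of the detached launch would turn a bad image reference or a network problem into an immediate, named failure instead of a later db_sync timeout.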
This won't catch the cases where the container isn't starting for some other reason, because paunch is not a service manager. For this we would need specific validator resources in tripleo-
Changed in paunch: | |
status: | New → Triaged |
importance: | Undecided → High |
assignee: | nobody → Steve Baker (steve-stevebaker) |
Changed in tripleo: | |
status: | New → Triaged |
importance: | Undecided → High |
assignee: | nobody → Steve Baker (steve-stevebaker) |
milestone: | none → queens-2 |
description: | updated |
Changed in paunch: | |
status: | Triaged → Fix Released |
Fix proposed to branch: master
Review: https://review.openstack.org/522665