tripleo

Bug #1809833
Comment #1


Jill Rouleau (jillrouleau) wrote on 2019-01-04:

We technically had this same situation with docker; it just didn't show in the logs. On container start, anything that takes more than 30s to pass its healthcheck will show as unhealthy in docker ps before turning healthy. The functionality here is the same; it's just more visible in the logs.

We could leave healthcheck units disabled on the host and modify the container images to emit notify signals to enable checks when the container is ready, but that's a pretty involved approach.

I've been experimenting with unit file options to find a combination that reliably delays activating the healthcheck by X seconds/minutes on both install and boot, but still properly brings everything up after the delay. I can keep chasing this path, but it's worth asking how much we want to invest in what amounts to a cosmetic error. It should be expected that a container isn't considered healthy until it has completed its startup processes. I don't think we want to suppress these messages, because we would want to know if a container never came up OK, and I'm not sure the current behavior is really a problem. What about a debug message in the task, along the lines of "hey fyi, some containers might error a bit before being healthy"? I'll see what I can do with that idea in common_deploy_steps_tasks.yaml.
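
For illustration only, the debug message idea could look roughly like the task below. This is a minimal sketch assuming Ansible's standard debug module; the task name and the message wording are hypothetical and are not taken from common_deploy_steps_tasks.yaml.

```yaml
# Hypothetical task: the name and message text are illustrative only and
# do not come from the existing common_deploy_steps_tasks.yaml.
- name: Note that healthchecks may fail briefly while containers start
  debug:
    msg: >-
      Some containers may report failed healthchecks in the logs until
      they finish starting up. This is expected; investigate only if a
      container is still unhealthy after startup has completed.
```

A message like this leaves the healthcheck output itself untouched, so a container that never becomes healthy would still be visible in the logs.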