SoftwareDeployment does not error if the required hook is not installed on the client host

Bug #1651785 reported by Alex Schultz
Affects: heat-agents
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

Some of the details can be found in https://bugs.launchpad.net/tripleo/+bug/1651616 but the summary is:

If you create a stack and the end host does not have the required heat agent installed, the deployment times out instead of failing with an error saying that it cannot be run. In my case, python-heat-agent-hiera was not installed on the node, but the stack contained a deployment with group: hiera. The stack create just hung until the timeout hit.

Thomas Herve (therve) wrote :

I guess https://github.com/openstack/heat-templates/blob/master/hot/software-config/elements/heat-config/os-refresh-config/configure.d/55-heat-config#L132 is the reason nothing is happening? Can you confirm you see that in the logs?

I suspect we can't change that behavior right away, but maybe we can introduce a config option to do so.

Also we need to have a dedicated component for that code, because it's not in heat.

Alex Schultz (alex-schultz) wrote :

No, I did not see that in the logs. As far as the end host was concerned, there were no new os-* logs when I hit this point. All I was able to see was the heat engine looping when I turned up debug logging on the heat engine.

Alex Schultz (alex-schultz) wrote :

Attached is a sosreport of the undercloud logs from when this happened.

Alex Schultz (alex-schultz) wrote :

Attached is one of the overcloud logs from when this happened. Basically nothing continued to process; the log just stops at around 16:35.

Steven Hardy (shardy)
summary: - heat does not error if the required agent is not installed on the client
+ heat does not error if the required hook is not installed on the client
host
summary: - heat does not error if the required hook is not installed on the client
- host
+ SoftwareDeployment does not error if the required hook is not installed
+ on the client host
Steven Hardy (shardy) wrote :

Thomas - I think 55-heat-config#L132 is working, but it's not really a sane default since we switched things over to use heat-config-notify (I assume that's when this behavior changed but not 100% sure).

IIRC previously we'd skip configs for which there was no hook script and silently do nothing. What we're seeing now is that the SoftwareDeployment never completes (or fails) in this case; it just hangs IN_PROGRESS until the stack times out. You can see why in 55-heat-config: we just return without sending any signal.

I suspect that's pretty much never the desired outcome for any user, as it requires manual investigation on the nodes.

I think it would be valid to change that behavior to either fail, or skip as in 55-heat-config#L132 and send a signal with a payload saying it was skipped.

I guess the only question is whether we should fail, or silently do nothing - IMHO failing under these circumstances makes far more sense, but I'm happy to have that made a config option if folks feel strongly silently skipping configs is a better idea.
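To make the two options concrete, here is a minimal Python sketch of the proposed behavior. It is not the actual 55-heat-config code: `find_hook`, `missing_hook_signal`, the `fail_on_missing` option and the `HOOKS_DIR` default are all hypothetical names chosen for illustration. The payload keys follow the usual deployment outputs (`deploy_stdout`, `deploy_stderr`, `deploy_status_code`), on the assumption that a non-zero status code is what marks the deployment as failed.

```python
import logging
import os

log = logging.getLogger("heat-config")

# Assumed hook location; the real path depends on how the image was built.
HOOKS_DIR = "/var/lib/heat-config/hooks"


def find_hook(group, hooks_dir=HOOKS_DIR):
    """Return the path of the hook script for `group`, or None if absent."""
    path = os.path.join(hooks_dir, group)
    return path if os.path.isfile(path) else None


def missing_hook_signal(config, fail_on_missing=True):
    """Build a payload to signal back instead of returning silently.

    `fail_on_missing` is a hypothetical config option: True reports a
    failure, False reports a successful "skipped" result.
    """
    group = config.get("group")
    message = "No hook script found for group %r" % group
    if fail_on_missing:
        log.error(message)
        status_code = 1
    else:
        log.warning("Skipping: %s", message)
        status_code = 0
    return {
        "deploy_stdout": "",
        "deploy_stderr": message,
        "deploy_status_code": status_code,
    }
```

Either way the caller would send the returned payload as the deployment signal, so the resource leaves IN_PROGRESS immediately rather than waiting for the stack timeout.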

Steven Hardy (shardy) wrote :

Alex - this is from /var/log/messages in your sosreport. It shows that 55-heat-config#L132 and the missing hiera hook are the reason for this problem, so the code is working as expected; it's just bad default behavior which we might want to fix.

Dec 21 16:35:40 host-192-168-24-7 os-collect-config: [2016-12-21 16:35:40,115] (heat-config) [WARNING] Skipping group hiera with no hook script None
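
Since this warning is easy to miss, a small Python sketch like the following could be used to scan a log for it when a deployment appears hung. The regular expression is derived only from the single log line quoted above, and `skipped_groups` is a hypothetical helper name; other heat-config versions may word the message differently.

```python
import re

# Pattern based on the one warning line quoted in this report.
SKIP_RE = re.compile(
    r"\(heat-config\) \[WARNING\] Skipping group (?P<group>\S+) "
    r"with no hook script"
)


def skipped_groups(lines):
    """Return the sorted config groups that were skipped for lack of a hook."""
    groups = set()
    for line in lines:
        match = SKIP_RE.search(line)
        if match:
            groups.add(match.group("group"))
    return sorted(groups)


sample = (
    "Dec 21 16:35:40 host-192-168-24-7 os-collect-config: "
    "[2016-12-21 16:35:40,115] (heat-config) [WARNING] "
    "Skipping group hiera with no hook script None"
)
print(skipped_groups([sample]))  # ['hiera']
```

On a node, the same function can be given an open file object for /var/log/messages, since it only iterates over lines.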

Changed in heat:
status: New → Confirmed
importance: Undecided → Medium
milestone: none → ocata-3
Alex Schultz (alex-schultz) wrote :

Ah OK, I missed it in the logs. I would say that condition should be an error and not a warning as well. I would also expect some sort of status indication on the heat side when this condition occurs.

Marius Cornea (mcornea) wrote :

I've also hit this issue today, and I completely ignored the 'Skipping group hiera with no hook script None' log since it showed up as a warning, but in the end it was the cause of the hung stack create. +1 for making this condition an error.

Zane Bitter (zaneb)
affects: heat → heat-agents
Changed in heat-agents:
milestone: ocata-3 → none