Comment 1 for bug 1820902

Ovidiu Poncea (ovidiuponcea) wrote :

Initial conclusions point to a tricky helm-toolkit bug. I got some guidance on the openstack-helm stash channel, but nothing that pointed me to a solution.

Short story: On a multi-host deployment with nova-compute functionality, we add or delete a host and reapply the manifests. To our surprise, the nova-compute services get restarted on ALL nodes. We expected them to just start on the newly added nova-compute node (or to do nothing in the host removal case). The interesting part is that the config maps passed don't change at all (i.e. there is no change in the openstack config files!), so there shouldn't be any reason to restart or recreate the pods.

Long story:
When comparing the chart output from before and after removing a host, there are two changes that seem to trigger the pod recreation: the name of the daemonsets and one hash (configmap-etc-hash). Looking further, both are different for the same reason: the hostnames of all the configured nodes (i.e. a map with the hostnames) are included in the hash computation that is used to generate the pod names and configmap-etc-hash.
There seems to be no reason to use the hostname list for this, as there is no actual config file change. The problem is that I don't yet know how to fix it: the logic lives in helm-toolkit, so any change there will impact ALL the services in the system.
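
To make the effect concrete, here is a minimal Python sketch of the behaviour described above. It is not the helm-toolkit code: the function names, the sample config and the host names are all hypothetical, and it only assumes that the hashed input contains the host list next to the config content.

```python
import hashlib
import json


def fingerprint(payload):
    """Return a SHA-256 hex digest of a JSON-serialisable payload."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def hash_with_hosts(config, hosts):
    """Assumed current behaviour: the host list is part of the hashed input."""
    return fingerprint({"config": config, "hosts": sorted(hosts)})


def hash_config_only(config):
    """Alternative: hash only what actually ends up in the config files."""
    return fingerprint({"config": config})


# Hypothetical data: the config content is identical before and after.
config = {"nova.conf": "[DEFAULT]\ndebug = false\n"}
hosts_before = ["compute-0", "compute-1"]
hosts_after = ["compute-0", "compute-1", "compute-2"]

changed = hash_with_hosts(config, hosts_before) != hash_with_hosts(config, hosts_after)
stable = hash_config_only(dict(config)) == hash_config_only(config)

print("hash changes when hosts are part of the input:", changed)
print("hash is stable for config-only input:", stable)
```

Any annotation or name derived from the first kind of hash changes whenever a host is added or removed, which is enough for Kubernetes to treat the pod template as modified.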

I was able to reconcile the difference in the config map hash by not taking the hostnames into account in helm-toolkit/templates/utils/_daemonset_overrides.tpl, replacing:

   {{- if not $context.Values.__daemonset_yaml.spec.template.metadata.annotations }}{{- $_ := set $context.Values.__daemonset_yaml.spec.template.metadata "annotations" dict }}{{- end }}
   {{- $cmap := list $current_dict.dns_1123_name $current_dict.nodeData | include $configmap_include }}
   {{- $values_hash := $cmap | quote | sha256sum }}
   {{- $_ := set $context.Values.__daemonset_yaml.spec.template.metadata.annotations "configmap-etc-hash" $values_hash }}

with:

   {{- if not $context.Values.__daemonset_yaml.spec.template.metadata.annotations }}{{- $_ := set $context.Values.__daemonset_yaml.spec.template.metadata "annotations" dict }}{{- end }}
   {{- $cmap := list $current_dict.dns_1123_name $current_dict.nodeData | include $configmap_include }}
   {{- $hashcmap := list "default" $current_dict.nodeData | include $configmap_include }}
   {{- $values_hash := $hashcmap | quote | sha256sum }}
   {{- $_ := set $context.Values.__daemonset_yaml.spec.template.metadata.annotations "configmap-etc-hash" $values_hash }}
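
The idea behind this change can be sketched in Python as follows. This is an illustration under assumptions, not a port of the template: render_configmap is a hypothetical stand-in for `include $configmap_include`, name_for is an assumed naming scheme in which the name carries a suffix derived from the host list, and the `quote` step of the template is left out.

```python
import hashlib


def render_configmap(name, node_data):
    """Hypothetical stand-in for `include $configmap_include`."""
    lines = ["kind: ConfigMap", "metadata:", f"  name: {name}", "data:"]
    for key in sorted(node_data):
        lines.append(f"  {key}: {node_data[key]!r}")
    return "\n".join(lines)


def etc_hash(name, node_data, pin_name=False):
    """Hash the rendered config map, optionally rendering it under a fixed name."""
    hashed_name = "default" if pin_name else name
    rendered = render_configmap(hashed_name, node_data)
    return hashlib.sha256(rendered.encode("utf-8")).hexdigest()


def name_for(hosts):
    """Assumed naming scheme: a suffix derived from the full host list."""
    joined = ",".join(sorted(hosts)).encode("utf-8")
    return "nova-compute-" + hashlib.sha256(joined).hexdigest()[:8]


node_data = {"nova.conf": "debug = false"}  # unchanged config content
name_before = name_for(["compute-0", "compute-1"])
name_after = name_for(["compute-0", "compute-1", "compute-2"])

original_differs = etc_hash(name_before, node_data) != etc_hash(name_after, node_data)
pinned_matches = etc_hash(name_before, node_data, pin_name=True) == etc_hash(
    name_after, node_data, pin_name=True
)

print("original hash differs:", original_differs)
print("pinned hash matches:", pinned_matches)
```

Rendering under the fixed name "default" only for the hash keeps the real config map untouched while making the annotation depend on nodeData alone.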

The problem is that I still get the pod name change, and this is much harder to fix.