older os-collect-config can't be updated or upgraded via heat

Bug #1603144 reported by Marios Andreou on 2016-07-14
This bug affects 1 person
Affects: tripleo
Importance: High
Assigned to: Marios Andreou

Bug Description

Older versions of os-collect-config (in particular prior to os-collect-config-0.1.37-5 for rhel/centos based systems) can't be updated via heat.

As discussed at https://bugzilla.redhat.com/show_bug.cgi?id=1350489 and https://bugzilla.redhat.com/show_bug.cgi?id=1349890#c15, overcloud rhel/centos nodes running os-collect-config (occ) os-collect-config-0.1.37-4 or earlier will hang on a major upgrade or minor update: occ itself is restarted as part of the update, and the heat deployment ultimately hangs.

Regardless of the 'real' fix in newer occ packages, we are still left with the problem of updating existing setups. sbaker suggests adding:

    KillMode=process
    SendSIGKILL=no

to the occ unit file, and explains:

"This means that when os-collect-config gets restarted, the current running os-refresh-config will continue to completion. It's not ideal because os-refresh-config output stops getting logged to the journal, and you only find out what happened in the rest of the run if/when the various heat deployment resources get signaled."

That got me thinking: perhaps we could deliver this by running crudini against the unit file as part of the upgrade init command. It would have to go to stable/mitaka for the bugzillas that need this fix, but I don't see why we can't have it on master too. I'll post a review shortly for reference and discussion.
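
For illustration only, the crudini idea amounts to setting the two keys in the [Service] section of the unit file while leaving everything else alone. The Python sketch below shows that edit on a string. It is not crudini and not the change in the review. The helper name set_unit_options and the SAMPLE_UNIT contents are made up for this example and do not reflect the unit file that is actually packaged.

```python
# Hypothetical unit file contents, used only to demonstrate the edit.
SAMPLE_UNIT = """[Unit]
Description=Collect metadata and run hook commands.

[Service]
ExecStart=/usr/bin/os-collect-config
Restart=on-failure

[Install]
WantedBy=multi-user.target
"""


def set_unit_options(unit_text, section, options):
    """Return unit_text with every key in options set inside [section].

    Existing keys are overwritten in place, missing keys are appended at
    the end of the section, and the section is created if it is absent.
    """
    pending = dict(options)  # keys that still have to be written
    out = []
    in_section = False
    found_section = False

    def append_pending():
        # Put new keys before any blank lines that trail the section.
        insert_at = len(out)
        while insert_at > 0 and out[insert_at - 1].strip() == "":
            insert_at -= 1
        out[insert_at:insert_at] = [f"{k}={v}" for k, v in pending.items()]
        pending.clear()

    for line in unit_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            if in_section:
                append_pending()  # leaving the target section
            in_section = stripped[1:-1] == section
            found_section = found_section or in_section
        elif in_section and "=" in stripped and not stripped.startswith(("#", ";")):
            key = stripped.split("=", 1)[0].strip()
            if key in pending:
                line = f"{key}={pending.pop(key)}"  # overwrite existing key
        out.append(line)

    if not found_section:
        if out and out[-1].strip():
            out.append("")
        out.append(f"[{section}]")
        in_section = True
    if in_section:
        append_pending()  # target section was the last one in the file
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    fix = {"KillMode": "process", "SendSIGKILL": "no"}
    print(set_unit_options(SAMPLE_UNIT, "Service", fix))
```

Running the helper a second time on its own output changes nothing, which matters if the upgrade init command can be run more than once on a node. Writing the result back to disk is deliberately left out of the sketch.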

One outstanding question for me, which I'll find out via testing (probably tomorrow at this rate): is it OK to just edit the unit file in place and expect that to work for the fixup, or do we need to restart the service too?

thanks, marios

Fix proposed to branch: master
Review: https://review.openstack.org/342278

Changed in tripleo:
status: Triaged → In Progress
Marios Andreou (marios-b) wrote :

Posted to master for now at https://review.openstack.org/342278: "Fixup os-collect-config unit file so it won't fail on update"

Marios Andreou (marios-b) wrote :

As discussed on the review, updating to a version of occ that includes the sigkill change (i.e. a new package) is expected to upgrade occ successfully with the usual workflow. The fix is therefore packaging only, and no special case is needed for upgrading from a 'broken' occ. This is not a bug in tripleo, so setting to Invalid.

Changed in tripleo:
status: In Progress → Won't Fix
status: Won't Fix → Invalid

Change abandoned by Marios Andreou (<email address hidden>) on branch: master
Review: https://review.openstack.org/342278
Reason: thanks for looking sbaker

Steve Baker (steve-stevebaker) wrote :

I've submitted a packaging fix here http://review.rdoproject.org/r/1678
