missing swift rings causes swift containers stuck in docker restart loop

Bug #1710952 reported by James Slagle on 2017-08-15
This bug affects 1 person
Affects: tripleo
Importance: Critical
Assigned to: Emilien Macchi

Bug Description

I did a deployment with containers and all my swift containers were stuck in restart loops due to missing ring files.

I confirmed there were no ring files under /var/lib/config-data/puppet-generated/swift/etc/swift on the host.

The issue seems to be that docker/services/swift-ringbuilder.yaml and docker/services/swift-storage.yaml both have puppet_config tasks that use a config_volume called "swift".

When docker-puppet.py runs, it rsyncs the generated files into /var/lib/config-data/puppet-generated/{config_volume name} with --delete-after.

So each of these containers' puppet_config tasks can end up deleting what the other has generated; whichever task runs second removes the files written by the first.
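
The collision can be reproduced outside of a deployment with a small simulation. This is only a sketch: mirror_with_delete is a hypothetical stand-in for the end result of rsync with --delete-after, not the code in docker-puppet.py, and the task names and file names (object.ring.gz, object-server.conf) are assumptions chosen for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def mirror_with_delete(src: Path, dst: Path) -> None:
    """Copy every file under src into dst, then delete any file in dst
    that src does not contain (rough stand-in for rsync --delete-after)."""
    dst.mkdir(parents=True, exist_ok=True)
    wanted = {p.relative_to(src) for p in src.rglob("*") if p.is_file()}
    for rel in wanted:
        target = dst / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src / rel, target)
    # Delete pass: anything the source did not produce is removed.
    for existing in [p for p in dst.rglob("*") if p.is_file()]:
        if existing.relative_to(dst) not in wanted:
            existing.unlink()


def run_tasks(order):
    """Run two hypothetical puppet_config tasks in the given order, both
    syncing into the same "swift" volume; return the surviving file names."""
    outputs = {
        "swift_ringbuilder": "object.ring.gz",    # assumed ring file name
        "swift_storage": "object-server.conf",    # assumed config file name
    }
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        shared = root / "puppet-generated" / "swift"
        for task in order:
            src = root / task / "etc" / "swift"
            src.mkdir(parents=True)
            (src / outputs[task]).write_text(task)
            mirror_with_delete(root / task, shared)
        return sorted(p.name for p in shared.rglob("*") if p.is_file())


if __name__ == "__main__":
    print(run_tasks(["swift_ringbuilder", "swift_storage"]))
    print(run_tasks(["swift_storage", "swift_ringbuilder"]))
```

When the ringbuilder task runs first, only object-server.conf survives, which matches the missing ring files seen on the host. In the opposite order only object.ring.gz survives.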

Changed in tripleo:
importance: Undecided → Critical
milestone: none → pike-rc1

Fix proposed to branch: master
Review: https://review.openstack.org/494008

Changed in tripleo:
assignee: nobody → James Slagle (james-slagle)
status: New → In Progress
James Slagle (james-slagle) wrote :

I worked on a patch for this:
https://review.openstack.org/#/c/494008/

So I went ahead and assigned this one to myself. I won't be able to push on this after Thursday 8/18 though due to personal leave.

Changed in tripleo:
assignee: James Slagle (james-slagle) → Carlos Camacho (ccamacho)
Changed in tripleo:
assignee: Carlos Camacho (ccamacho) → Dan Prince (dan-prince)
Changed in tripleo:
assignee: Dan Prince (dan-prince) → Jose Luis Franco (jfrancoa)
Changed in tripleo:
milestone: pike-rc1 → pike-rc2
Changed in tripleo:
assignee: Jose Luis Franco (jfrancoa) → Emilien Macchi (emilienm)

Change abandoned by Emilien Macchi (<email address hidden>) on branch: master
Review: https://review.openstack.org/494008
Reason: I need to purge the gate because TripleO CI gate has critical issues right now, I'll make sure this patch goes to the gate.

Reviewed: https://review.openstack.org/494008
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=cba00abb7517efa6a8d9b8fb954563204323ffed
Submitter: Jenkins
Branch: master

commit cba00abb7517efa6a8d9b8fb954563204323ffed
Author: James Slagle <email address hidden>
Date: Tue Aug 15 15:59:08 2017 -0400

    Separate config_volume for ringbuilder

    Use a separate config_volume for swift_ringbuilder puppet_config tasks.
    This is necessary so that the swift_ringbuilder and swift-storage
    services don't both rsync files to the same bind mounted directory.

    The rsync command from docker-puppet.py uses --delete-after, so when
    they both use the same config_volume, they can end up deleting the files
    generated by the other (depending on the order of execution).

    Even though a separate config_volume is used, the rings must still end up
    in /etc/swift for the swift services containers. An additional
    container init task is used to copy the ring files into
    /var/lib/config-data/puppet-generated/swift/etc/swift so that they will
    be present when the actual swift services containers are started.

    Change-Id: I05821e76191f64212704ca8e3b7428cda6b3a4b7
    Closes-Bug: #1710952

Changed in tripleo:
status: In Progress → Fix Released
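
The approach described in the commit message can be sketched the same way: each task syncs into its own volume, and a separate copy step places the rings next to the swift configuration. This is an illustration under assumptions, not the actual template change. The volume name swift_ringbuilder, the file names, and the sync_volume helper are hypothetical, and the copy loop only stands in for the additional container init task.

```python
import shutil
import tempfile
from pathlib import Path


def sync_volume(src: Path, dst: Path) -> None:
    """Make dst an exact copy of src, dropping anything else that was there.
    Same end state as a mirror with deletion, used here for brevity."""
    if dst.exists():
        shutil.rmtree(dst)
    shutil.copytree(src, dst)


def deploy():
    """Return the files present in the swift volume's etc/swift directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical outputs of the two puppet runs.
        ring_out = root / "ringbuilder-out" / "etc" / "swift"
        storage_out = root / "storage-out" / "etc" / "swift"
        ring_out.mkdir(parents=True)
        storage_out.mkdir(parents=True)
        (ring_out / "object.ring.gz").write_text("ring")
        (storage_out / "object-server.conf").write_text("conf")

        generated = root / "puppet-generated"
        # Each task syncs into its own volume, so neither deletes the other's files.
        sync_volume(root / "ringbuilder-out", generated / "swift_ringbuilder")
        sync_volume(root / "storage-out", generated / "swift")

        # Stand-in for the extra init task: copy the rings into the swift volume.
        swift_etc = generated / "swift" / "etc" / "swift"
        rings = generated / "swift_ringbuilder" / "etc" / "swift"
        for ring in rings.glob("*.ring.gz"):
            shutil.copy2(ring, swift_etc / ring.name)

        return sorted(p.name for p in swift_etc.iterdir())


if __name__ == "__main__":
    print(deploy())
```

With separate volumes the result no longer depends on which task runs second, and both object-server.conf and object.ring.gz are present in the directory the swift services containers read.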

Reviewed: https://review.openstack.org/499457
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=e371ec664638fc104990c5e4cdfacac932468090
Submitter: Jenkins
Branch: stable/pike

commit e371ec664638fc104990c5e4cdfacac932468090
Author: James Slagle <email address hidden>
Date: Tue Aug 15 15:59:08 2017 -0400

    Separate config_volume for ringbuilder

    Use a separate config_volume for swift_ringbuilder puppet_config tasks.
    This is necessary so that the swift_ringbuilder and swift-storage
    services don't both rsync files to the same bind mounted directory.

    The rsync command from docker-puppet.py uses --delete-after, so when
    they both use the same config_volume, they can end up deleting the files
    generated by the other (depending on the order of execution).

    Even though a separate config_volume is used, the rings must still end up
    in /etc/swift for the swift services containers. An additional
    container init task is used to copy the ring files into
    /var/lib/config-data/puppet-generated/swift/etc/swift so that they will
    be present when the actual swift services containers are started.

    Change-Id: I05821e76191f64212704ca8e3b7428cda6b3a4b7
    Closes-Bug: #1710952
    (cherry picked from commit cba00abb7517efa6a8d9b8fb954563204323ffed)

tags: added: in-stable-pike

This issue was fixed in the openstack/tripleo-heat-templates 7.0.0.0rc2 release candidate.

This issue was fixed in the openstack/tripleo-heat-templates 8.0.0.0b1 development milestone.
