Stack updates fail during docker-puppet rsync for haproxy

Bug #1698323 reported by Steven Hardy
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Tuan

Bug Description

Testing a multi-node HA setup (3 controllers, 1 compute), I'm hitting this when trying to re-run the deploy with no changes:

        " (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')\u001b[0m",
        "+ '[' -z '' ']'",
        "+ archivedirs=(\"/etc\" \"/root\" \"/var/lib/ironic/tftpboot\" \"/var/lib/ironic/httpboot\" \"/var/www\")",
        "+ rsync_srcs=",
        "+ for d in '\"${archivedirs[@]}\"'",
        "+ '[' -d /etc ']'",
        "+ rsync_srcs+=' /etc'",
        "+ for d in '\"${archivedirs[@]}\"'",
        "+ '[' -d /root ']'",
        "+ rsync_srcs+=' /root'",
        "+ for d in '\"${archivedirs[@]}\"'",
        "+ '[' -d /var/lib/ironic/tftpboot ']'",
        "+ for d in '\"${archivedirs[@]}\"'",
        "+ '[' -d /var/lib/ironic/httpboot ']'",
        "+ for d in '\"${archivedirs[@]}\"'",
        "+ '[' -d /var/www ']'",
        "+ rsync -a -R --delay-updates --delete-after /etc /root /var/lib/config-data/haproxy",
        "rsync: rename failed for \"/var/lib/config-data/haproxy/etc/hostname\" (from etc/.~tmp~/hostname): Device or resource busy (16)",
        "rsync: rename failed for \"/var/lib/config-data/haproxy/etc/resolv.conf\" (from etc/.~tmp~/resolv.conf): Device or resource busy (16)",
        "rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]",
        "",

It seems we've got some files held open on the bind mount into the haproxy container, so we either need to solve that or rework how we do the rsync (e.g. create a unique directory per docker-puppet run and switch over to the latest when restarting the container).
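
For readers following the trace above, the loop that builds the rsync source list appears to behave roughly like the sketch below. It is reconstructed from the xtrace output only and is not copied from docker-puppet.py: the helper name `collect_rsync_srcs` is hypothetical, and it takes the candidate directories as arguments instead of using the bash array seen in the trace, so that it also runs under plain sh.

```shell
# Sketch only: reconstructed from the xtrace output above, not copied from
# docker-puppet.py. collect_rsync_srcs is a hypothetical helper name.
collect_rsync_srcs() {
    rsync_srcs=""
    for d in "$@"; do
        # Directories missing from the container are skipped, which is why
        # only /etc and /root reach rsync in the trace.
        if [ -d "$d" ]; then
            rsync_srcs="$rsync_srcs $d"
        fi
    done
    printf '%s\n' "$rsync_srcs"
}

# Usage mirroring the trace (NAME would be "haproxy" in this report):
#   srcs=$(collect_rsync_srcs /etc /root /var/lib/ironic/tftpboot \
#          /var/lib/ironic/httpboot /var/www)
#   rsync -a -R --delay-updates --delete-after $srcs "/var/lib/config-data/${NAME}"
```

With `--delay-updates`, rsync stages each file under `.~tmp~` and renames it into place at the end, which matches the "rename failed ... (from etc/.~tmp~/hostname)" errors in the log.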

Tags: containers
Steven Hardy (shardy)
Changed in tripleo:
status: New → Triaged
importance: Undecided → High
milestone: none → pike-3
tags: added: containers
Damien Ciabrini (dciabrin) wrote :

One potential difference between the haproxy service and other HA services is that we bind-mount /var/lib/config-data/haproxy/etc directly onto the container's /etc, rather than bind-mounting it at /var/lib/kolla/config_files/src and letting the Kolla bootstrap copy the files into /etc.

I'm going to propose a change to see if it fixes the issue.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.openstack.org/474943

Changed in tripleo:
assignee: nobody → Damien Ciabrini (dciabrin)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (master)

Fix proposed to branch: master
Review: https://review.openstack.org/474947

Changed in tripleo:
assignee: Damien Ciabrini (dciabrin) → nobody
Dan Prince (dan-prince) wrote :

The rsync functionality is new as of f600d459f051288042ce531bab029953563a11b3. I'm wondering whether reverting that would fix this?

Dan Prince (dan-prince) wrote :

Okay, so the answer to this is no (after talking to shardy on IRC). The old code would fail on 'rm -Rf /var/lib/config-data/${NAME}'.

Dan Prince (dan-prince) wrote :

Sounds like reverting to using 'kolla_config' might be the easiest solution here.

Steven Hardy (shardy) wrote :

Yeah, I think the rsync error is just another manifestation of the previous rm -fr failure, where the container has a file held open in the bind-mounted directory. I incorrectly assumed the rsync would solve this, but evidently it doesn't, as it can't replace the files while they're open.

I'm going to try moving the services where this is a problem to kolla_config, starting with the haproxy patch from Damien. Logically this seems simpler than the alternative, which is managing the config-data directories so they're versioned (that could be done, but it's probably a bunch of additional complexity going into docker-puppet.py, and kolla_config already exists).
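
To make the rejected "versioned config-data" alternative concrete, here is a minimal sketch of what it might involve. Nothing like this exists in docker-puppet.py as far as this thread shows: the function name `publish_config`, the per-run directory naming, and the symlink layout are all assumptions made for illustration. It relies on `mv -T` from GNU coreutils.

```shell
# Hypothetical sketch of the versioned alternative discussed above; the
# names and layout are illustrative assumptions, not TripleO code.
# publish_config SRC BASE NAME: copy SRC into a fresh per-run directory
# under BASE, then repoint the symlink BASE/NAME at that directory.
publish_config() {
    src=$1
    base=$2
    name=$3
    # One new directory per run, so files from earlier runs are never
    # replaced or deleted while a container may still be using them.
    new=$(mktemp -d "$base/$name.XXXXXX") || return 1
    cp -a "$src/." "$new/" || return 1
    # Build the new symlink under a temporary name, then rename it over the
    # old one so the switch happens in a single step (mv -T is GNU-specific).
    ln -sfn "$(basename "$new")" "$base/$name.tmp" || return 1
    mv -T "$base/$name.tmp" "$base/$name"
}
```

A running container would presumably keep using the directory it was started with and only pick up the newest one on restart, and old directories would still need to be cleaned up at some point. That bookkeeping is the extra complexity the comment above refers to.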

Changed in tripleo:
assignee: nobody → Damien Ciabrini (dciabrin)
Changed in tripleo:
assignee: Damien Ciabrini (dciabrin) → nobody
Changed in tripleo:
assignee: nobody → Damien Ciabrini (dciabrin)
Changed in tripleo:
assignee: Damien Ciabrini (dciabrin) → Martin André (mandre)
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by Damien Ciabrini (<email address hidden>) on branch: master
Review: https://review.openstack.org/474943
Reason: Tests have shown that this review can be abandoned in favour of the more general https://review.openstack.org/#/c/476153/

OpenStack Infra (hudson-openstack) wrote : Change abandoned on puppet-tripleo (master)

Change abandoned by Damien Ciabrini (<email address hidden>) on branch: master
Review: https://review.openstack.org/474947
Reason: Agreed, I857c94ba5f7f064d7c58df621ec5d477654b9166 is the way to go

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/476153
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=cf18e865d14adc319b6c2dfafd650f32dad4d853
Submitter: Jenkins
Branch: master

commit cf18e865d14adc319b6c2dfafd650f32dad4d853
Author: Martin André <email address hidden>
Date: Wed Jun 21 16:02:55 2017 +0200

    Copy only generated puppet files into the container

    This solves a problem with bind-mounts when the containers are holding
    files descriptors open.

    At the same time this makes the template more robust to puppet changes
    since new config files will be available in the containers without
    needing to update the templates.

    Partial-Bug: #1698323
    Change-Id: Ia4ad6d77387e3dc354cd131c2f9756939fb8f736

Changed in tripleo:
assignee: Martin André (mandre) → Tuan (tuanla)
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (master)

Reviewed: https://review.openstack.org/477535
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=1e90178298d77c23e654c374f17f999f4d9274e1
Submitter: Jenkins
Branch: master

commit 1e90178298d77c23e654c374f17f999f4d9274e1
Author: Martin André <email address hidden>
Date: Mon Jun 26 15:33:09 2017 +0200

    Leverage kolla config_files to copy config into containers

    This solves a problem with bind-mounts when the containers are holding
    files descriptors open.

    At the same time this makes the template more robust to puppet changes
    since new config files will be available in the containers without
    needing to update the templates.

    Closes-Bug: #1698323
    Change-Id: I857c94ba5f7f064d7c58df621ec5d477654b9166
    Depends-On: I78dcec741a941dc21adba33ba33a6dc6ff1d217c

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 7.2.0

This issue was fixed in the openstack/puppet-tripleo 7.2.0 release.
