lxcbr0 disappeared run-playbooks.sh

Bug #1518485 reported by eil397
This bug affects 1 person
Affects: OpenStack-Ansible
Status: Fix Released
Importance: Low
Assigned to: Kevin Carter
Milestone: (none)

Bug Description

AIO installation failed because lxcbr0 disappeared.

It looks like this logic sometimes does not work correctly (https://github.com/openstack/openstack-ansible/blob/master/scripts/run-playbooks.sh#L58):
    ansible hosts -m shell \
                  -a '(ifdown lxcbr0 || true); ifup lxcbr0' \
                  -t "${COMMAND_LOGS}/host_net_bounce" \
                  &> ${COMMAND_LOGS}/host_net_bounce.log

In the log file:
# cat /openstack/log/ansible_cmd_logs/host_net_bounce.log
aio1 | success | rc=0 >>
Stopping LXC dnsmasq.
Removing LXC IPtables rules.
IPtables rules removed.
Failed to bring up lxcbr0.Cannot find device "lxcbr0"
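
Note that the log above shows `success | rc=0` even though the bridge never came back, so the exit status of the bounce alone is not a reliable signal. A minimal sketch of an explicit post-check, assuming interfaces are visible under /sys/class/net; `iface_exists` and `SYSFS_NET_DIR` are illustrative names and not part of run-playbooks.sh:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the named interface is known to the kernel.
# SYSFS_NET_DIR is overridable so the check can be exercised without real devices.
SYSFS_NET_DIR="${SYSFS_NET_DIR:-/sys/class/net}"

iface_exists() {
    local name="$1"
    # An empty name would match the directory itself, so reject it explicitly.
    [ -n "$name" ] && [ -e "${SYSFS_NET_DIR}/${name}" ]
}
```

Appending something like `iface_exists lxcbr0 || exit 1` to the bounce command would make a missing bridge fail the step instead of passing silently.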

Bjoern (bjoern-t) wrote :

Which release are you installing?
This looks similar to an issue I had in https://bugs.launchpad.net/openstack-ansible/+bug/1507795, but that was only present in kilo.

eil397 (anton-haldin)
description: updated
eil397 (anton-haldin) wrote :

I'm using trusty Ubuntu 14.04.3 LTS

eil397 (anton-haldin) wrote :

I've hit this a few times, with the liberty branch and with master.

Changed in openstack-ansible:
assignee: nobody → Anton Haldin (anton-haldin)
Jesse Pretorius (jesse-pretorius) wrote :

Can you confirm whether bootstrap-aio had been run (if this was an AIO), or whether lxc was still installed on the host concerned? If so, what version of LXC was installed?

Note that run-playbooks is a development/testing tool and not meant for production.

Changed in openstack-ansible:
status: New → Incomplete
Jean-Philippe Evrard (jean-philippe-evrard) wrote :

In addition to that, could you post the contents of /etc/network/interfaces and /etc/network/interfaces.d/* somewhere?

eil397 (anton-haldin) wrote :

Yes, that is right. I have a CI job. It launches the script run-aio-build.sh and it runs bootstrap-aio.sh.

The content of all the files below was generated:
/etc/network/interfaces.d/lxc-net-bridge.cfg https://gist.github.com/anton-haldin/fc0cb0d593632c62b69d
/etc/network/interfaces.d/aio_interfaces.cfg https://gist.github.com/anton-haldin/5276f6b1e94d39a74bba
/etc/network/interfaces https://gist.github.com/anton-haldin/9cd4568a006de8ae308d

dpkg -l | grep lxc
ii liblxc1 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools (library)
ii lxc 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools
ii lxc-dev 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools (development)
ii lxc-templates 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools (templates)
ii python3-lxc 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools (Python 3.x bindings)

Kevin Carter (kevin-carter) wrote :

The error "Failed to bring up lxcbr0.Cannot find device "lxcbr0"" normally happens when the line "source /etc/network/interfaces.d/*" is missing from the interfaces file, so the config for the additional interface is never found. I've not been able to reproduce this without removing that one line. Is there anything you know of that we can do to recreate the problem?
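
A quick way to rule this out is to check the interfaces file for the include line. A minimal sketch; `has_interfaces_d_source` is an illustrative name, and treating a `source-directory` stanza as equivalent is an assumption:

```shell
#!/usr/bin/env bash
# Hypothetical check: does the interfaces file pull in the interfaces.d snippets?
# Matches "source" and "source-directory" stanzas; commented-out lines do not match.
has_interfaces_d_source() {
    local file="${1:-/etc/network/interfaces}"
    grep -Eq '^[[:space:]]*source(-directory)?[[:space:]]+/etc/network/interfaces\.d' "$file"
}
```

Running `has_interfaces_d_source || echo "include line missing"` on the affected host would confirm or eliminate this cause before looking further.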

eil397 (anton-haldin) wrote :

The line "source /etc/network/interfaces.d/*" is present in /etc/network/interfaces.

This issue happens very often in my lab environment (5 servers: 1 aodh + 3 controllers + 1 compute).

To me it looks like the issue could be caused by a race condition.
When we take lxcbr0 down, the teardown sometimes takes a while to finish in the background; maybe udev is involved.
As a result, it is not possible to bring lxcbr0 back up.
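
If the cause really is a race, one workaround would be to retry the bounce until the bridge is actually present. This is only a sketch under that assumption, not the fix that was eventually merged; `retry` and `bounce_lxcbr0` are hypothetical helpers:

```shell
#!/usr/bin/env bash
# Run "$@" up to $1 times, sleeping $2 seconds between failed tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
    local attempts="$1" delay="$2"
    shift 2
    while [ "$attempts" -gt 0 ]; do
        "$@" && return 0
        attempts=$((attempts - 1))
        [ "$attempts" -gt 0 ] && sleep "$delay"
    done
    return 1
}

# Bounce the bridge and only report success if the device really exists,
# since the bounce was logged with rc=0 even when lxcbr0 was missing.
bounce_lxcbr0() {
    ifdown lxcbr0 || true
    ifup lxcbr0 && [ -e /sys/class/net/lxcbr0 ]
}
```

For example, `retry 5 2 bounce_lxcbr0` would give a slow teardown up to five attempts, two seconds apart, before the step fails.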

Changed in openstack-ansible:
importance: Undecided → Low
Changed in openstack-ansible:
assignee: eil397 (anton-haldin) → Kevin Carter (kevin-carter)
status: Incomplete → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-lxc_hosts (master)

Reviewed: https://review.openstack.org/516822
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-lxc_hosts/commit/?id=53a6cce9ed08dc5005a42f064a3b8811d65f9d70
Submitter: Zuul
Branch: master

commit 53a6cce9ed08dc5005a42f064a3b8811d65f9d70
Author: Kevin Carter <email address hidden>
Date: Tue Oct 31 21:42:16 2017 -0500

    Use handlers to restart services and move dnsmasq to a unit file

    These changes further optimise the lxc_host role so that it's using more
    of the built in modules and making better use of handlers.

    Moving the dnsmasq process to a unit file gives operators the ability to
    restart the dnsmasq process if there's an issue with the service. It
    also ensures the service stays running as systemd will take better care
    of the service by isolating it within a specific cgroup, ensuring good
    reporting and memory management, and providing the ability to recover
    from failures in an automated way.

    Closes-Bug: #1518485
    Change-Id: I42d0caa3b12e70a3601c30051eefc067e81a71bb
    Signed-off-by: Kevin Carter <email address hidden>
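
To illustrate what running dnsmasq from a unit file can look like, here is a rough sketch that writes such a unit. It is not the template shipped by the lxc_hosts role: the unit name, the 10.0.3.x address range, and the chosen dnsmasq options are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: write a systemd unit that keeps dnsmasq for lxcbr0
# in the foreground so systemd can supervise it and restart it on failure.
# Unit name, address range and options are assumptions, not the role's template.
write_lxc_dnsmasq_unit() {
    local dest="${1:-/etc/systemd/system/lxc-dnsmasq.service}"
    cat > "$dest" <<'EOF'
[Unit]
Description=dnsmasq for the lxcbr0 bridge (illustrative)
After=network.target

[Service]
ExecStart=/usr/sbin/dnsmasq --keep-in-foreground --interface=lxcbr0 --bind-interfaces --dhcp-range=10.0.3.2,10.0.3.254
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}
```

After writing the file, `systemctl daemon-reload` followed by `systemctl restart lxc-dnsmasq` would pick it up, and `Restart=on-failure` provides the automated recovery the commit message describes.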

Changed in openstack-ansible:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible-lxc_hosts (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/517341

OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible-lxc_hosts (stable/pike)

Reviewed: https://review.openstack.org/517341
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-lxc_hosts/commit/?id=be93ac8d3f1ba7f1e786c835436b4839cfbc0932
Submitter: Zuul
Branch: stable/pike

commit be93ac8d3f1ba7f1e786c835436b4839cfbc0932
Author: Kevin Carter <email address hidden>
Date: Mon Oct 30 20:54:12 2017 -0500

    Combined backport to fix issues and enhance efficiency

    The LXC host role can be tuned up for better overall efficiency.

    Highlights:
    * Move async wait to a later position for role performance. The
      async wait we're doing can be moved elsewhere in the role so
      that we're able to do more in parallel. This change simply moves
      the async wait to a position just before it's required.
    * Move container creation tasks into their own sub-files which are
      accessed using dynamic routing.
    * Several syntactic items were cleaned up.
    * All of the basic cache cleanup has been moved to handlers.

    These changes further optimise the lxc_host role so that it's using more
    of the built in modules and making better use of handlers.

    Moving the dnsmasq process to a unit file gives operators the ability to
    restart the dnsmasq process if there's an issue with the service. It
    also ensures the service stays running as systemd will take better care
    of the service by isolating it within a specific cgroup, ensuring good
    reporting and memory management, and providing the ability to recover
    from failures in an automated way.

    Closes-Bug: #1718979
    Closes-Bug: #1518485
    (cherry picked from commit 076493d01485822b1efbc962478150278ecbf566)
    (cherry picked from commit 53a6cce9ed08dc5005a42f064a3b8811d65f9d70)

    Change-Id: If7dfbae19429cb033d7fd7e33f1423627f091534

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-lxc_hosts 16.0.5

This issue was fixed in the openstack/openstack-ansible-lxc_hosts 16.0.5 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstack-ansible-lxc_hosts 17.0.0.0b2

This issue was fixed in the openstack/openstack-ansible-lxc_hosts 17.0.0.0b2 development milestone.
