Restarting containers leads to 'dangling' veth interfaces

Bug #1491440 reported by Major Hayden
This bug affects 2 people
Affects            Status        Importance  Assigned to   Milestone
openstack-ansible  Fix Released  Medium      Major Hayden
Kilo               Fix Released  Medium      Major Hayden
Trunk              Fix Released  Medium      Major Hayden

Bug Description

We've seen an issue where veth interfaces are left up on the host after stopping LXC containers. The veth will still be connected to the bridge and have the proper MAC address assigned. In some cases, the veths still respond to traffic sent to the IP of the network interface that was configured in the container.

Some were cleaned up on their own, but others occasionally required an `ip link del <veth>` to reap them. There are some LXC mailing list threads that suggest this might be related to half-open TCP connections and that a script must be run as the container goes down to forcefully remove the veths.
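
As a rough illustration of the manual cleanup described above, the sketch below parses the text produced by `ip -o link show type veth`, reports which host-side veths are not in a list of expected names, and prints the matching `ip link del` commands without running them. The function names and the idea of passing the expected veth names on the command line are assumptions made for this example; they are not part of the proposed patch.

```python
import subprocess
import sys


def parse_veths(ip_output):
    """Parse `ip -o link show type veth` text into (name, bridge) pairs."""
    veths = []
    for line in ip_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # The second field looks like "vethA1B2C3@if4:"; keep the host-side name.
        name = fields[1].rstrip(":").split("@")[0]
        # The bridge, if the veth is attached to one, follows the "master" keyword.
        bridge = None
        if "master" in fields:
            idx = fields.index("master")
            if idx + 1 < len(fields):
                bridge = fields[idx + 1]
        veths.append((name, bridge))
    return veths


def find_dangling(veths, expected_names):
    """Return the veths whose names are not in the expected set."""
    return [(name, bridge) for name, bridge in veths if name not in expected_names]


def cleanup_commands(dangling):
    """Build (but do not run) the commands that would reap each veth."""
    return ["ip link del {}".format(name) for name, _ in dangling]


if __name__ == "__main__":
    # Hypothetical usage: veth names that belong to running containers
    # are given as command-line arguments.
    expected = set(sys.argv[1:])
    output = subprocess.check_output(
        ["ip", "-o", "link", "show", "type", "veth"], universal_newlines=True
    )
    for command in cleanup_commands(find_dangling(parse_veths(output), expected)):
        print(command)
```

Printing the commands instead of executing them leaves the final decision to the operator, which seems prudent while the default veth names make it hard to tell which container a veth belonged to.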

I'm currently testing this scenario and working on a patch that makes it easier to identify these dangling veth interfaces:

  https://blueprints.launchpad.net/openstack/?searchtext=named-veths
  https://review.openstack.org/#/c/219457/

Tags: in-kilo
Changed in openstack-ansible:
assignee: nobody → Major Hayden (rackerhacker)
Changed in openstack-ansible:
status: New → Confirmed
status: Confirmed → In Progress
Changed in openstack-ansible:
milestone: none → 11.2.0
Jesse Pretorius (jesse-pretorius) wrote :

Shifting milestone to 12.0.0 as it's too late for inclusion to 11.2.0 and I don't think this should be in a hotfix version either. If a decision is made to backport a fix then that can be targeted at a later date.

Changed in openstack-ansible:
milestone: 11.2.0 → 12.0.0
Changed in openstack-ansible:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-ansible-deployment (master)

Reviewed: https://review.openstack.org/219457
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=5e6dea8ab6f408e6610c34220519b628e8f8c45c
Submitter: Jenkins
Branch: master

commit 5e6dea8ab6f408e6610c34220519b628e8f8c45c
Author: Major Hayden <email address hidden>
Date: Mon Aug 31 20:17:32 2015 -0500

    Used named veth pairs that match container

    The default veth names from LXC make it difficult to tell which veth
    corresponds to each container. This patch sets a unique veth name
    that matches the container as well as the network device inside the
    container. It should make troubleshooting a little easier.

    Oddly enough, this patch seems to fix or greatly reduce the occurrence
    of the issues seen in ticket 1491440.

    Partial-Bug: 1491440
    Implements: blueprint named-veths
    Change-Id: I28553fd1b4f36991e11d55d56c3f0f46af9e52be
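
To illustrate the idea in the commit message above, here is a minimal sketch of deriving a veth pair name from the container name and the interface inside the container. The naming scheme (the last 8 alphanumeric characters of the container name, an underscore, then the interface name) and the helper names are hypothetical and are not taken from the patch. The sketch assumes the LXC 1.x style `lxc.network.veth.pair` configuration key and respects the 15-character limit on Linux interface names.

```python
# Linux interface names hold at most 15 characters (IFNAMSIZ is 16 with the NUL).
MAX_IFNAME_LEN = 15


def veth_pair_name(container_name, interface):
    """Build a host-side veth name that points back to the container.

    Hypothetical scheme: tail of the container name + "_" + interface.
    """
    suffix = "_" + interface
    room = MAX_IFNAME_LEN - len(suffix)
    if room < 1:
        raise ValueError("interface name leaves no room for a container prefix")
    # Drop separators such as "-" and "_" so the prefix stays compact.
    cleaned = "".join(ch for ch in container_name if ch.isalnum())
    if not cleaned:
        raise ValueError("container name has no usable characters")
    prefix = cleaned[-min(8, room):]
    return prefix + suffix


def veth_config_line(container_name, interface):
    """Return the LXC config line that pins the veth pair name."""
    return "lxc.network.veth.pair = {}".format(
        veth_pair_name(container_name, interface)
    )


if __name__ == "__main__":
    # Hypothetical container name, used only for illustration.
    print(veth_config_line("infra1_galera_container-4ed0c4b1", "eth1"))
```

With a predictable name on the host side, a leftover veth can be traced back to its container directly from `ip link` output.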

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-ansible-deployment (kilo)

Fix proposed to branch: kilo
Review: https://review.openstack.org/222295

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-ansible-deployment (kilo)

Reviewed: https://review.openstack.org/222295
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=19970092e3801f8aa354cc41b3fe1cf3e3486203
Submitter: Jenkins
Branch: kilo

commit 19970092e3801f8aa354cc41b3fe1cf3e3486203
Author: Major Hayden <email address hidden>
Date: Mon Aug 31 20:17:32 2015 -0500

    Used named veth pairs that match container

    The default veth names from LXC make it difficult to tell which veth
    corresponds to each container. This patch sets a unique veth name
    that matches the container as well as the network device inside the
    container. It should make troubleshooting a little easier.

    Oddly enough, this patch seems to fix or greatly reduce the occurrence
    of the issues seen in ticket 1491440.

    Partial-Bug: 1491440
    Implements: blueprint named-veths
    Change-Id: I28553fd1b4f36991e11d55d56c3f0f46af9e52be
    (cherry picked from commit 5e6dea8ab6f408e6610c34220519b628e8f8c45c)

tags: added: in-kilo
Major Hayden (rackerhacker) wrote :

This bug could also use some docs.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (master)

Fix proposed to branch: master
Review: https://review.openstack.org/223792

OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (master)

Reviewed: https://review.openstack.org/223792
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=eb311fd0860009f537b554406ae04428f0971a8a
Submitter: Jenkins
Branch: master

commit eb311fd0860009f537b554406ae04428f0971a8a
Author: Major Hayden <email address hidden>
Date: Tue Sep 15 15:27:00 2015 -0500

    Docs for named veths + troubleshooting

    Partial-bug: 1491440

    Change-Id: I43d501f76ef4acff954e564f5fd33d1779bfcbd2

OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (kilo)

Fix proposed to branch: kilo
Review: https://review.openstack.org/224750

OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (kilo)

Reviewed: https://review.openstack.org/224750
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=c110786e1f13c1f1758d280f6e69d4bcec5d7cc6
Submitter: Jenkins
Branch: kilo

commit c110786e1f13c1f1758d280f6e69d4bcec5d7cc6
Author: Major Hayden <email address hidden>
Date: Tue Sep 15 15:27:00 2015 -0500

    Docs for named veths + troubleshooting

    Partial-bug: 1491440

    Change-Id: I43d501f76ef4acff954e564f5fd33d1779bfcbd2
    (cherry picked from commit eb311fd0860009f537b554406ae04428f0971a8a)

Major Hayden (rackerhacker) wrote :

At this point, I believe we've done all we can to get this fixed. There's still an issue upstream (in LXC and/or the Linux kernel), but we have efficient workarounds now.
