Neutron agents previous envs not cleaned up on migration

Bug #1405293 reported by Aleksandr Shaposhnikov
This bug affects 2 people
Affects              Status      Importance   Assigned to                 Milestone
Fuel for OpenStack   Invalid     Medium       Fuel Library (Deprecated)
5.0.x                Won't Fix   Medium       Fuel Library (Deprecated)
5.1.x                Won't Fix   Medium       Fuel Library (Deprecated)
6.0.x                Won't Fix   Medium       Fuel Library (Deprecated)
6.1.x                Invalid     Medium       Fuel Library (Deprecated)

Bug Description

I found that when Neutron agents (such as the L3 or DHCP agent) are migrated to a different node, their child processes, namespaces, and so on are not cleaned up on the previous node. This is very dangerous behaviour: for L3 it leaves namespaces with routes behind, and for DHCP it leaves many namespaces and dnsmasq processes on the previous node.

The OCF scripts should clean up on all nodes and only then start the migrated agent on the new node.
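
As a rough illustration of the check such a cleanup step would need, the sketch below picks out leftover namespaces from the text printed by `ip netns list`. It is not the OCF script's actual logic: the function name `stale_namespaces`, the `active_ids` set (the router and network UUIDs the local agents are still expected to serve), and the sample qdhcp UUID are all hypothetical. It relies only on the Neutron naming convention `qrouter-<router id>` and `qdhcp-<network id>`.

```python
def stale_namespaces(netns_list_output, active_ids):
    """Return qrouter-/qdhcp- namespaces whose UUID is not in active_ids.

    netns_list_output: text as printed by `ip netns list`.
    active_ids: hypothetical set of router/network UUIDs still hosted here.
    """
    stale = []
    for line in netns_list_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Newer iproute2 versions append "(id: N)"; keep only the name.
        name = fields[0]
        for prefix in ("qrouter-", "qdhcp-"):
            if name.startswith(prefix) and name[len(prefix):] not in active_ids:
                stale.append(name)
    return stale


# Sample input: the qrouter namespace is the one from this report,
# the qdhcp UUID and the unrelated "haproxy" entry are made up.
sample = """qrouter-33debcd2-8f98-4893-80e0-9fc1669f9bdd
qdhcp-11111111-2222-3333-4444-555555555555 (id: 0)
haproxy
"""
active = {"11111111-2222-3333-4444-555555555555"}
print(stale_namespaces(sample, active))
```

The function only reports candidates. Actually removing a namespace, and stopping any dnsmasq still running inside it, is left to the operator or to the agent's resource script.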

Tags: scale
Changed in fuel:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Fuel Library Team (fuel-library)
milestone: none → 6.1
Miroslav Anashkin (manashkin) wrote :

Please confirm MOS version.
We recently had a third patch that targets exactly this issue.

Aleksandr Shaposhnikov (alashai8) wrote :

[root@fuel ~]# fuel --fuel-version
api: '1.0'
astute_sha: 16b252d93be6aaa73030b8100cf8c5ca6a970a91
auth_required: true
build_id: 2014-12-18_01-32-01
build_number: '56'
feature_groups:
- mirantis
fuellib_sha: 73332192a257ea02c40a39885c502ad1ebdf3eda
fuelmain_sha: 45caacadb878abfbd9d60e134d72229698b469c9
nailgun_sha: 5f91157daa6798ff522ca9f6d34e7e135f150a90
ostf_sha: a9afb68710d809570460c29d6c3293219d3624d4
production: docker
release: '6.0'
release_versions:
  2014.2-6.0:
    VERSION:
      api: '1.0'
      astute_sha: 16b252d93be6aaa73030b8100cf8c5ca6a970a91
      build_id: 2014-12-18_01-32-01
      build_number: '56'
      feature_groups:
      - mirantis
      fuellib_sha: 73332192a257ea02c40a39885c502ad1ebdf3eda
      fuelmain_sha: 45caacadb878abfbd9d60e134d72229698b469c9
      nailgun_sha: 5f91157daa6798ff522ca9f6d34e7e135f150a90
      ostf_sha: a9afb68710d809570460c29d6c3293219d3624d4
      production: docker
      release: '6.0'

Bogdan Dobrelya (bogdando) wrote :

This bug is invalid for 6.1 because we deploy the Neutron agents as clones. The fix for 6.0, when committed, should be backported to earlier releases as well.

Changed in fuel:
status: Confirmed → Invalid
Daniel Shafer (dshafer) wrote :

This bug is invalid for 6.0.1.

Namespace on node-1 before killing network:

[root@node-1 ~]# ip netns exec qrouter-33debcd2-8f98-4893-80e0-9fc1669f9bdd ifconfig
lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:65536 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

qg-2fc9d393-8d Link encap:Ethernet HWaddr FA:16:3E:75:E9:65
          inet addr:10.108.21.130 Bcast:10.108.21.255 Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fe75:e965/64 Scope:Link
          UP BROADCAST RUNNING MTU:1500 Metric:1
          RX packets:64 errors:0 dropped:0 overruns:0 frame:0
          TX packets:22 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:4024 (3.9 KiB) TX bytes:1216 (1.1 KiB)

qr-43142bdd-7c Link encap:Ethernet HWaddr FA:16:3E:11:EA:70
          inet addr:192.168.111.1 Bcast:192.168.111.255 Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fe11:ea70/64 Scope:Link
          UP BROADCAST RUNNING MTU:1500 Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:70 (70.0 b) TX bytes:794 (794.0 b)

After network was brought back up on node-1:

[root@node-1 init.d]# ip netns exec qrouter-33debcd2-8f98-4893-80e0-9fc1669f9bdd ifconfig
lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:65536 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
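
To make the difference between the two listings explicit, here is a minimal sketch that extracts interface names from `ifconfig` output and compares them. The helper name `interface_names` is hypothetical, and the parsing assumes the older net-tools layout shown in this report, where each interface header starts in column 0 and detail lines are indented. The sample strings are abbreviated copies of the output above.

```python
def interface_names(ifconfig_output):
    """Return interface names from old-style (net-tools) ifconfig output."""
    names = []
    for line in ifconfig_output.splitlines():
        # Header lines start in column 0; detail lines are indented.
        if line and not line[0].isspace():
            names.append(line.split()[0])
    return names


before = """lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0

qg-2fc9d393-8d Link encap:Ethernet HWaddr FA:16:3E:75:E9:65
          inet addr:10.108.21.130 Bcast:10.108.21.255 Mask:255.255.255.0

qr-43142bdd-7c Link encap:Ethernet HWaddr FA:16:3E:11:EA:70
          inet addr:192.168.111.1 Bcast:192.168.111.255 Mask:255.255.255.0
"""

after = """lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
"""

# Interfaces present before the network was killed but gone afterwards.
removed = sorted(set(interface_names(before)) - set(interface_names(after)))
print(removed)
```

The comparison covers only the interfaces inside the namespace. The fact that `ip netns exec` still succeeds in the second listing shows that the qrouter namespace itself remains on node-1.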

Aleksandr Shaposhnikov (alashai8) wrote : Re: [Bug 1405293] Re: Neutron agents previous envs not cleaned up on migration

So the fact that the namespace itself is still there doesn't bother you? ;)
On Feb 20, 2015 11:35 AM, "Daniel Shafer" <email address hidden> wrote:

> This bug is invalid for 6.0.1.
> [...]

Andrew Woodward (xarses) wrote :

Setting to Medium for 6.x: stale namespaces should be removed.

Changed in fuel:
status: Invalid → Confirmed
importance: High → Medium
Andrew Woodward (xarses) wrote :

After speaking with Alex: the leftover namespaces consume resources unnecessarily, and Neutron must iterate through them to see whether they can be reused. We need to keep as few of them around as possible.

Dmitry Borodaenko (angdraug) wrote :

Marking as Incomplete; we need a better understanding of the performance impact of leaving namespaces behind.
