Neutron qrouter does not migrate after the network is killed on the primary controller.

Bug #1371550 reported by Egor Kotko
This bug affects 2 people
Affects             Status        Importance  Assigned to                Milestone
Fuel for OpenStack  Fix Released  High        Fuel Library (Deprecated)
5.1.x               Fix Released  High        Fuel Library (Deprecated)
6.0.x               Fix Released  High        Fuel Library (Deprecated)

Bug Description

{"build_id": "2014-09-17_21-40-34", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "11", "auth_required": true, "api": "1.0", "nailgun_sha": "eb8f2b358ea4bb7eb0b2a0075e7ad3d3a905db0d", "production": "docker", "fuelmain_sha": "8ef433e939425eabd1034c0b70e90bdf888b69fd", "astute_sha": "f5fbd89d1e0e1f22ef9ab2af26da5ffbfbf24b13", "feature_groups": ["mirantis"], "release": "5.1", "release_versions": {"2014.1.1-5.1": {"VERSION": {"build_id": "2014-09-17_21-40-34", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "11", "api": "1.0", "nailgun_sha": "eb8f2b358ea4bb7eb0b2a0075e7ad3d3a905db0d", "production": "docker", "fuelmain_sha": "8ef433e939425eabd1034c0b70e90bdf888b69fd", "astute_sha": "f5fbd89d1e0e1f22ef9ab2af26da5ffbfbf24b13", "feature_groups": ["mirantis"], "release": "5.1", "fuellib_sha": "d9b16846e54f76c8ebe7764d2b5b8231d6b25079"}}}, "fuellib_sha": "d9b16846e54f76c8ebe7764d2b5b8231d6b25079"}

Steps to reproduce:
1) Deploy an environment: Ubuntu HA, GRE, 3 Controllers, 1 Compute, 1 Storage, 3 Mongo
2) On the host machine, delete all interfaces from the bridges for the primary controller: "sudo brctl delif dobrXXX donetXXXX"

Expected result: the qrouter migrates to another controller

Actual result: the qrouter did not migrate to another controller

Tags: neutron
Irina Povolotskaya (ipovolotskaya) wrote :

Should this be put into Release Notes? Is there any workaround?

Egor Kotko (ykotko) wrote :

It would be better for somebody from the fuel-library or fuel-deployer team to look at this issue and confirm it.

Egor Kotko (ykotko)
Changed in fuel:
assignee: nobody → Fuel Library Team (fuel-library)
Changed in fuel:
milestone: 6.0 → 5.1
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/122884

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (stable/5.1)

Related fix proposed to branch: stable/5.1
Review: https://review.openstack.org/122897

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/122884
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=c0af7efa8bf992261241c53af2b6b4c534935eeb
Submitter: Jenkins
Branch: master

commit c0af7efa8bf992261241c53af2b6b4c534935eeb
Author: Vladimir Kuklin <email address hidden>
Date: Fri Sep 19 15:38:55 2014 -0700

    Make rescheduling of l3 agent more resilient

    In order to make l3 agent restart more resilient
    and idempotent do:

    1) delete agent after resource is stopped and
    all the stuff is cleaned up

    2) reschedule not only dead routers/networks but also
    orphaned routers/networks

    3) enclose get_pid_list_for_ns_list function argument
    in cleanup function into double quotes
    or it will use only the first argument
    and will not work with several networks

    Change-Id: I06a89f0015c8f188d212995ed0652c5ee2cc1c47
    Related-Bug: #1371550
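
Point 3 of the commit message above comes down to shell word splitting. The sketch below is illustrative only: `first_arg_only` and the namespace names are hypothetical stand-ins, not the actual fuel-library code, and it assumes the real helper reads only its first positional parameter, as the commit message describes.

```shell
# Hypothetical stand-in for a helper such as get_pid_list_for_ns_list:
# it reads only its first positional parameter and treats it as a
# space-separated list of namespace names.
first_arg_only() {
    local ns_list="$1"
    local ns
    for ns in $ns_list; do
        printf '%s\n' "$ns"
    done
}

# Hypothetical namespace names, for illustration only.
NS_LIST="qrouter-aaa qdhcp-bbb qdhcp-ccc"

# Unquoted: the shell splits the value into three words before the call,
# so "$1" inside the function holds only the first namespace.
first_arg_only $NS_LIST

# Quoted: the whole list arrives as a single argument and every
# namespace is processed.
first_arg_only "$NS_LIST"
```

Under these assumptions the unquoted call handles one namespace and the quoted call handles all three, which matches the "will not work with several networks" symptom described in the commit message.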

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (stable/5.1)

Reviewed: https://review.openstack.org/122897
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=9f8bb531c59c5bcdb8cf7492e4496ae131f392dd
Submitter: Jenkins
Branch: stable/5.1

commit 9f8bb531c59c5bcdb8cf7492e4496ae131f392dd
Author: Vladimir Kuklin <email address hidden>
Date: Fri Sep 19 15:38:55 2014 -0700

    Make rescheduling of l3 agent more resilient

    In order to make l3 agent restart more resilient
    and idempotent do:

    1) delete agent after resource is stopped and
    all the stuff is cleaned up

    2) reschedule not only dead routers/networks but also
    orphaned routers/networks

    3) enclose get_pid_list_for_ns_list function argument
    in cleanup function into double quotes
    or it will use only the first argument
    and will not work with several networks

    Change-Id: I06a89f0015c8f188d212995ed0652c5ee2cc1c47
    Related-Bug: #1371550

Changed in fuel:
status: New → Triaged
Egor Kotko (ykotko)
tags: added: in progress
Egor Kotko (ykotko)
tags: removed: in progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/125098

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (stable/5.1)

Related fix proposed to branch: stable/5.1
Review: https://review.openstack.org/125099

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/125110

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (stable/5.1)

Related fix proposed to branch: stable/5.1
Review: https://review.openstack.org/125116

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/125098
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=5358bd13d74e5a73ef6737b8f77a778bc6a87fb7
Submitter: Jenkins
Branch: master

commit 5358bd13d74e5a73ef6737b8f77a778bc6a87fb7
Author: Sergey Vasilenko <email address hidden>
Date: Tue Sep 30 19:30:41 2014 +0400

    Revert "Make rescheduling of l3 agent more resilient"

    This reverts commit c0af7efa8bf992261241c53af2b6b4c534935eeb.
    because this commit broke deployment

    Change-Id: I2a3764d104e893222cfeb767351ae52a0ebc87f6
    Related-Bug: #1371550
    Related-Bug: #1375817

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (stable/5.1)

Reviewed: https://review.openstack.org/125099
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=0aff6ffe62a750fe292669eeaf28c9486fd391aa
Submitter: Jenkins
Branch: stable/5.1

commit 0aff6ffe62a750fe292669eeaf28c9486fd391aa
Author: Sergey Vasilenko <email address hidden>
Date: Tue Sep 30 19:38:16 2014 +0400

    Revert "Make rescheduling of l3 agent more resilient"

    This reverts commit 9f8bb531c59c5bcdb8cf7492e4496ae131f392dd.
    because this commit broke deployment

    Change-Id: Ie54920954c9b50375224b54b397e31df4edbc8cd
    Related-Bug: #1371550
    Related-Bug: #1375817

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/125110
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=9f45b83f8e9ae7715058bc42e5f7ca20a10d55aa
Submitter: Jenkins
Branch: master

commit 9f45b83f8e9ae7715058bc42e5f7ca20a10d55aa
Author: Vladimir Kuklin <email address hidden>
Date: Fri Sep 19 15:38:55 2014 -0700

    Make rescheduling of l3 agent more resilient

    In order to make l3 agent restart more resilient
    and idempotent do:

    1) delete agent after resource is stopped and
    all the stuff is cleaned up

    2) reschedule not only dead routers/networks but also
    orphaned routers/networks

    3) enclose get_pid_list_for_ns_list function argument
    in cleanup function into double quotes
    or it will use only the first argument
    and will not work with several networks

    Change-Id: I90338ff2df2703e121a6075ad1e60a442c88591f
    Related-Bug: #1371550
    Related-Bug: #1375817

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (stable/5.1)

Reviewed: https://review.openstack.org/125116
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=c77744508db1b5b17e853cfa1af7c949d7c4b44b
Submitter: Jenkins
Branch: stable/5.1

commit c77744508db1b5b17e853cfa1af7c949d7c4b44b
Author: Vladimir Kuklin <email address hidden>
Date: Fri Sep 19 15:38:55 2014 -0700

    Make rescheduling of l3 agent more resilient

    In order to make l3 agent restart more resilient
    and idempotent do:

    1) delete agent after resource is stopped and
    all the stuff is cleaned up

    2) reschedule not only dead routers/networks but also
    orphaned routers/networks

    3) enclose get_pid_list_for_ns_list function argument
    in cleanup function into double quotes
    or it will use only the first argument
    and will not work with several networks

    Change-Id: I90338ff2df2703e121a6075ad1e60a442c88591f
    Related-Bug: #1371550
    Related-Bug: #1375817

Anastasia Palkina (apalkina) wrote :

Verified on ISO #53

"build_id": "2014-10-28_00-01-12", "ostf_sha": "f47fd1d66a7255213ee075d5c11b8f111f922000", "build_number": "53", "auth_required": true, "api": "1.0", "nailgun_sha": "fb18068382d522b735ecf446c0f4166c129269fb", "production": "docker", "fuelmain_sha": "f3ad22d12c26794a05e62d46317fa1e47f7f1138", "astute_sha": "97eea90efe0a1f17b4934919d6e459d270c10372", "feature_groups": ["mirantis", "techpreview"], "release": "6.0", "release_versions": {"2014.2-6.0": {"VERSION": {"build_id": "2014-10-28_00-01-12", "ostf_sha": "f47fd1d66a7255213ee075d5c11b8f111f922000", "build_number": "53", "api": "1.0", "nailgun_sha": "fb18068382d522b735ecf446c0f4166c129269fb", "production": "docker", "fuelmain_sha": "f3ad22d12c26794a05e62d46317fa1e47f7f1138", "astute_sha": "97eea90efe0a1f17b4934919d6e459d270c10372", "feature_groups": ["mirantis", "techpreview"], "release": "6.0", "fuellib_sha": "b8d244a900b25bed8f597e99b309f9ee4ad8ae56"}}}, "fuellib_sha": "b8d244a900b25bed8f597e99b309f9ee4ad8ae56"

Executed on primary controller to delete interfaces from bridges:
for port in $(ovs-vsctl list Interface | grep 'name\ ' | grep -v br | awk -F\" '{print $2}') ; do ovs-vsctl del-port $port ; done

[root@fuel ~]# ssh node-1
ssh: connect to host node-1 port 22: No route to host
[root@fuel ~]# ssh node-10
Warning: Permanently added 'node-10' (RSA) to the list of known hosts.
Last login: Tue Oct 28 14:07:22 2014 from 10.20.0.2
[root@node-10 ~]# ip netns
qrouter-d8923128-bc66-4ff5-956d-91431d85290e
haproxy

Anastasia Palkina (apalkina) wrote :

Verified on ISO #19

"build_id": "2014-11-17_21-00-23", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "19", "auth_required": true, "api": "1.0", "nailgun_sha": "2fcab95dc43a248ba867065e96ab764ee73882d1", "production": "docker", "fuelmain_sha": "ff22ca819e6eb7c63b6d7978fdd80ef9b84457d9", "astute_sha": "fce051a6d013b1c30aa07320d225f9af734545de", "feature_groups": ["mirantis"], "release": "5.1.1", "release_versions": {"2014.1.3-5.1.1": {"VERSION": {"build_id": "2014-11-17_21-00-23", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "19", "api": "1.0", "nailgun_sha": "2fcab95dc43a248ba867065e96ab764ee73882d1", "production": "docker", "fuelmain_sha": "ff22ca819e6eb7c63b6d7978fdd80ef9b84457d9", "astute_sha": "fce051a6d013b1c30aa07320d225f9af734545de", "feature_groups": ["mirantis"], "release": "5.1.1", "fuellib_sha": "add3fdd3e2af57b20dbb73a6bc53a9ccc4701c9a"}}}, "fuellib_sha": "add3fdd3e2af57b20dbb73a6bc53a9ccc4701c9a"

[root@node-7 ~]# ip netns
haproxy
qrouter-9107fbd9-3da6-4703-9f2d-ead8546ba548

Executed on primary controller to delete interfaces from bridges:
for port in $(ovs-vsctl list Interface | grep 'name\ ' | grep -v br | awk -F\" '{print $2}') ; do ovs-vsctl del-port $port ; done

[root@fuel ~]# ssh node-7
ssh: connect to host node-7 port 22: No route to host
[root@fuel ~]# ssh node-9
Warning: Permanently added 'node-9' (RSA) to the list of known hosts.
Last login: Wed Nov 19 12:35:43 2014 from 10.20.0.2
[root@node-9 ~]# ip netns
qrouter-9107fbd9-3da6-4703-9f2d-ead8546ba548
haproxy
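
The manual check above (looking for the qrouter namespace in `ip netns` output on a surviving controller) could be scripted roughly as follows. This is a minimal sketch: `list_qrouters` and `has_qrouter` are hypothetical helper names, and the sample text is copied from the node-9 output above rather than read from a live node.

```shell
# Read `ip netns` output on stdin and print only the qrouter namespaces.
# Only the first field is used, so extra columns on a line are ignored.
list_qrouters() {
    awk '$1 ~ /^qrouter-/ { print $1 }'
}

# Succeed if the router ID given as $1 has a namespace in the
# `ip netns` output supplied on stdin.
has_qrouter() {
    list_qrouters | grep -qxF "qrouter-$1"
}

# Sample output copied from the node-9 session above.
sample='qrouter-9107fbd9-3da6-4703-9f2d-ead8546ba548
haproxy'

printf '%s\n' "$sample" | list_qrouters
```

On a live controller the same helper would be fed from the real command, for example `ip netns | has_qrouter 9107fbd9-3da6-4703-9f2d-ead8546ba548`, with the exit status indicating whether the router namespace is present on that node.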

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (stable/4.1)

Related fix proposed to branch: stable/4.1
Review: https://review.openstack.org/146994

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (stable/5.0)

Related fix proposed to branch: stable/5.0
Review: https://review.openstack.org/147095

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (stable/4.1)

Change abandoned by Ryan Moe (<email address hidden>) on branch: stable/4.1
Review: https://review.openstack.org/146994

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (stable/5.0)

Change abandoned by Sergey Kolekonov (<email address hidden>) on branch: stable/5.0
Review: https://review.openstack.org/147095
