neutron_tempest_plugin.scenario.test_migration.NetworkMigrationFromHA failing 100% times

Bug #1789434 reported by Slawek Kaplonski on 2018-08-28
Affects: neutron
Importance: High
Assigned to: Miguel Lavalle

Bug Description

For the past few days, all migration tests from DVR router have been failing.
Example of failure: http://logs.openstack.org/37/382037/71/check/neutron-tempest-plugin-dvr-multinode-scenario/605ed17/logs/testr_results.html.gz
It may be related to https://review.openstack.org/#/c/589410/ but I'm not sure yet.

tags: added: gate-failure l3-dvr-backlog
removed: gate
Miguel Lavalle (minsel) on 2018-08-30
Changed in neutron:
assignee: nobody → Miguel Lavalle (minsel)
Brian Haley (brian-haley) wrote :

So I did a little debugging here.

DVR router (router1)
$ openstack router set --disable router1
All ports (3) showed in DOWN state

$ openstack router set --ha router1
$ openstack router set --enable router1
All ports showed in ACTIVE state

$ openstack router set --disable router1
All ports still in ACTIVE state

Any future attempt to disable the router always showed ports ACTIVE, even if I changed it back to non-ha, so it was permanently broken.

Created a new router, just DVR (router2) - it was fine - could disable/enable and see ports change state.

The fact that the ports got "stuck" would point to an issue on the server-side.
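
The manual steps above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes the `openstack` client is installed and authenticated, and that `openstack port list --router <router> -f value -c Status` prints one port status per line. The helper names (`run_cli`, `port_statuses`, `check_disable`) are made up for illustration. The command runner is injectable so the logic can be exercised without a cloud.

```python
import subprocess


def run_cli(args):
    """Run an openstack CLI command and return its stdout.

    Assumes the client is installed and credentials are already loaded.
    """
    return subprocess.run(
        ["openstack"] + args, check=True, capture_output=True, text=True
    ).stdout


def port_statuses(router, run=run_cli):
    """Return the status of every port attached to the router."""
    out = run(["port", "list", "--router", router,
               "-f", "value", "-c", "Status"])
    return [line.strip() for line in out.splitlines() if line.strip()]


def check_disable(router, run=run_cli):
    """Disable the router and report whether all of its ports are DOWN."""
    run(["router", "set", "--disable", router])
    statuses = port_statuses(router, run)
    return all(status == "DOWN" for status in statuses)
```

Port status changes are asynchronous, so a real check would likely need to poll for a while before concluding that the ports are stuck in ACTIVE.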

I tried this on a 3-node DVR setup (1. all-in-one, 2. compute, 3. network) using Miguel's vagrant scripts (dvrvagrant), and my observations, listed below, are exactly the same as Brian's.

router1 disabled -> all ports down

router1 set to ha and enabled -> all ports active

router1 disabled -> ports still active

comment2 --continued

The only difference in the deployment I tried is that router1 was a legacy router initially created by the devstack setup, whereas in Brian's case it was DVR.

def _update_ports(self, context, router_id):
    # Mark every port attached to the router as DOWN.
    objs = l3_obj.RouterPort.get_objects(
        context, router_id=[router_id])
    for obj in objs:
        port_id = obj.port_id
        self._core_plugin.update_port(
            context, port_id, {'port': {'status': 'DOWN'}})

I'm not sure whether this would be a good solution, but I think it may fix the issue if we add this helper and call it in the method at https://github.com/openstack/neutron/blob/master/neutron/db/l3_db.py#L277

if not r['admin_state_up']:
    self._update_ports(context, id)
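
To make the proposal easier to follow in isolation, here is a minimal self-contained sketch of the same idea. `FakeCorePlugin` and `FakeL3Plugin` are hypothetical stand-ins invented for this example, not neutron classes, and the port lookup is a plain dict rather than `l3_obj.RouterPort.get_objects`. It only illustrates the intended behaviour: when an update sets admin_state_up to False, every router port is marked DOWN.

```python
class FakeCorePlugin:
    """Hypothetical stand-in for the core plugin; it only stores port data."""

    def __init__(self, ports):
        self.ports = ports  # port_id -> {'status': ...}

    def update_port(self, context, port_id, body):
        self.ports[port_id].update(body['port'])
        return self.ports[port_id]


class FakeL3Plugin:
    """Hypothetical stand-in for the L3 DB code, reduced to the proposal."""

    def __init__(self, core_plugin, router_ports):
        self._core_plugin = core_plugin
        self._router_ports = router_ports  # router_id -> [port_id, ...]

    def _update_ports(self, context, router_id):
        # Mirrors the proposed helper: mark each router port DOWN.
        for port_id in self._router_ports.get(router_id, []):
            self._core_plugin.update_port(
                context, port_id, {'port': {'status': 'DOWN'}})

    def update_router(self, context, router_id, router):
        r = router['router']
        # Only act when the update explicitly disables the router.
        if r.get('admin_state_up') is False:
            self._update_ports(context, router_id)
        return r


core = FakeCorePlugin({'p1': {'status': 'ACTIVE'}, 'p2': {'status': 'ACTIVE'}})
l3 = FakeL3Plugin(core, {'r1': ['p1', 'p2']})
l3.update_router(None, 'r1', {'router': {'admin_state_up': False}})
print(core.ports)
```

The sketch uses `r.get('admin_state_up') is False` so that an update body that omits the field does not touch the ports; whether the real method needs that guard is an assumption.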

Miguel Lavalle (minsel) on 2018-09-18
Changed in neutron:
assignee: Miguel Lavalle (minsel) → Manjeet Singh Bhatia (manjeet-s-bhatia)

Reviewed: https://review.openstack.org/605057
Committed: https://git.openstack.org/cgit/openstack/neutron-tempest-plugin/commit/?id=e137cd003b93d641189ba4d2c0dd1effe4795ba4
Submitter: Zuul
Branch: master

commit e137cd003b93d641189ba4d2c0dd1effe4795ba4
Author: Slawek Kaplonski <email address hidden>
Date: Tue Sep 25 14:28:22 2018 +0200

    Mark NetworkMigrationFromHA scenario tests as unstable

    We know that those tests are failing 100% times because
    router ports are not going DOWN when router's admin_state_up is
    set to FALSE.
    Let's make it unstable until this issue will be resolved to make
    scenario jobs passing at least sometimes ;)

    Change-Id: Ia9e4af5d798a769c5ff7056e686632bac6f79aec
    Related-Bug: #1789434
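
For readers unfamiliar with the idea of marking a test as unstable, the following is a rough sketch of how such a decorator can work: a failure is converted into a skip so the job is not blocked. This is an illustration written for this report, not the decorator used by neutron-tempest-plugin; the `unstable_test` function below and the test class are assumptions.

```python
import functools
import unittest


def unstable_test(reason):
    """Turn failures of the decorated test into skips (illustrative only)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            try:
                return func(self, *args, **kwargs)
            except Exception as exc:
                # Report the failure as a skip instead of failing the job.
                self.skipTest("%s marked unstable (%s), failed with: %s"
                              % (func.__name__, reason, exc))
        return wrapper
    return decorator


class NetworkMigrationFromHASketch(unittest.TestCase):
    # Hypothetical test body standing in for the real scenario test.
    @unstable_test("bug 1789434")
    def test_from_ha_to_legacy(self):
        self.fail("router ports did not go DOWN")
```

With this pattern a passing run is still reported as a pass, so the decorator can be removed once the underlying bug is fixed.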

Change abandoned by Manjeet Singh Bhatia (<email address hidden>) on branch: master
Review: https://review.openstack.org/602161

Miguel Lavalle (minsel) wrote :
Slawek Kaplonski (slaweq) wrote :

This issue should be resolved by patch https://review.openstack.org/#/c/636710/2, as it appears to be caused by bugs in the rootwrap filters.

Miguel Lavalle (minsel) on 2019-03-13
Changed in neutron:
assignee: Manjeet Singh Bhatia (manjeet-s-bhatia) → Miguel Lavalle (minsel)
Changed in neutron:
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/636710
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=25c432a05a57f794dcbb4f17ce224d914c65e071
Submitter: Zuul
Branch: master

commit 25c432a05a57f794dcbb4f17ce224d914c65e071
Author: Miguel Lavalle <email address hidden>
Date: Wed Feb 13 12:29:36 2019 -0600

    Add rootwrap filters to kill state change monitor

    When deleting HA routers, the keepalived state change monitor has to be
    deleted. This patch adds rootwrap filters to allow deleting the state
    change monitor.

    Change-Id: Icfb208d9b51eaa41cf01af81f1ede7420a19cc93
    Partial-Bug: #1795870
    Partial-Bug: #1789434

tags: added: neutron-proactive-backport-potential
tags: added: neutron-easy-proactive-backport-potential

Change abandoned by Miguel Lavalle (<email address hidden>) on branch: master
Review: https://review.openstack.org/611461

Reviewed: https://review.openstack.org/644931
Committed: https://git.openstack.org/cgit/openstack/neutron-tempest-plugin/commit/?id=0e8b686d84427532ffd8da842d4e4d2017995cbe
Submitter: Zuul
Branch: master

commit 0e8b686d84427532ffd8da842d4e4d2017995cbe
Author: Miguel Lavalle <email address hidden>
Date: Wed Mar 20 11:46:38 2019 -0500

    Reenable tests cases in NetworkMigrationFromHA

    The unstable tag for test cases in NetworkMigrationFromHA is removed
    after de merge of https://review.openstack.org/636710

    Change-Id: Icdc4f4c84add3731237cfa64ab57716037372f39
    Partial-Bug: #1789434

Reviewed: https://review.openstack.org/645283
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=6f3620aa88feef9527f8c9599dec049a831b49fa
Submitter: Zuul
Branch: stable/queens

commit 6f3620aa88feef9527f8c9599dec049a831b49fa
Author: Miguel Lavalle <email address hidden>
Date: Wed Feb 13 12:29:36 2019 -0600

    Add rootwrap filters to kill state change monitor

    When deleting HA routers, the keepalived state change monitor has to be
    deleted. This patch adds rootwrap filters to allow deleting the state
    change monitor.

    Change-Id: Icfb208d9b51eaa41cf01af81f1ede7420a19cc93
    Partial-Bug: #1795870
    Partial-Bug: #1789434
    (cherry picked from commit 25c432a05a57f794dcbb4f17ce224d914c65e071)

tags: added: in-stable-queens
Miguel Lavalle (minsel) on 2019-03-27
Changed in neutron:
status: In Progress → Fix Committed

Reviewed: https://review.openstack.org/645282
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=8b7955dade3e388ad030ce2291651cef72d55108
Submitter: Zuul
Branch: stable/rocky

commit 8b7955dade3e388ad030ce2291651cef72d55108
Author: Miguel Lavalle <email address hidden>
Date: Wed Feb 13 12:29:36 2019 -0600

    Add rootwrap filters to kill state change monitor

    When deleting HA routers, the keepalived state change monitor has to be
    deleted. This patch adds rootwrap filters to allow deleting the state
    change monitor.

    Change-Id: Icfb208d9b51eaa41cf01af81f1ede7420a19cc93
    Partial-Bug: #1795870
    Partial-Bug: #1789434
    (cherry picked from commit 25c432a05a57f794dcbb4f17ce224d914c65e071)

tags: added: in-stable-rocky

Reviewed: https://review.openstack.org/650255
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=672a4328a97d6dae98cec208851198040530bd35
Submitter: Zuul
Branch: stable/pike

commit 672a4328a97d6dae98cec208851198040530bd35
Author: Miguel Lavalle <email address hidden>
Date: Wed Feb 13 12:29:36 2019 -0600

    Add rootwrap filters to kill state change monitor

    When deleting HA routers, the keepalived state change monitor has to be
    deleted. This patch adds rootwrap filters to allow deleting the state
    change monitor.

    Change-Id: Icfb208d9b51eaa41cf01af81f1ede7420a19cc93
    Partial-Bug: #1795870
    Partial-Bug: #1789434
    (cherry picked from commit 25c432a05a57f794dcbb4f17ce224d914c65e071)
    (cherry picked from commit 6f3620aa88feef9527f8c9599dec049a831b49fa)

tags: added: in-stable-pike