DVR and HA migration tests failing intermittently for gate-tempest-dsvm-neutron-dvr-multinode-scenario-ubuntu-xenial-nv job
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
neutron | Fix Released | Undecided | venkata anil | | |
Bug Description
For the migration test failures Jakub has already created this etherpad https:/
My analysis is this -
DVR and HA migration tempest scenario tests are failing (or passing) intermittently. In the existing tests, we try SSH connectivity immediately after the port update API returns, without checking that the dependent resources (listed below) have been created or updated properly.
1) new interfaces are created
2) existing interfaces updated
3) interfaces bound to agents
4) interfaces status updated
5) agents create namespaces, etc.
For example, during DVR to HA migration, as soon as the router update API returns, the SSH test might try to use the old data plane created with the DVR router, because the agents might not yet have synced with the server (removed namespaces, OVS flows and IP routes). If the SSH reply packets arrive back before the old data plane is removed, then SSH can be successful. If this data path is reconstructed(
When I updated the tests to check for the dependent resources before trying SSH, the tests passed reliably. So we can add these checks before we try SSH connectivity.
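The checks described above could look roughly like the sketch below. It is not the code from the proposed fix: `wait_until`, `router_ports_ready` and the fake `list_ports` callable are hypothetical names for illustration, and the sketch assumes each port is a dict with `device_owner`, `status` and `binding:host_id` keys. A real test would query the Neutron API instead of the fake.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() until it returns True or `timeout` seconds elapse."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def router_ports_ready(list_ports, router_id, expected_owners):
    """True when every expected device_owner has ports and all are ACTIVE and bound.

    `list_ports` is a hypothetical callable returning the router's ports as dicts.
    """
    ports = list_ports(device_id=router_id)
    for owner in expected_owners:
        matching = [p for p in ports if p["device_owner"] == owner]
        if not matching:
            return False  # new interfaces not created yet
        for port in matching:
            if port["status"] != "ACTIVE" or not port.get("binding:host_id"):
                return False  # not yet bound to an agent or not yet active
    return True


# Owners assumed for a router that has been migrated to HA.
HA_OWNERS = ("network:ha_router_replicated_interface",
             "network:router_ha_interface")


def make_fake_list_ports():
    """Fake API: returns the stale DVR port twice, then the new HA ports."""
    calls = {"n": 0}

    def list_ports(device_id):
        calls["n"] += 1
        if calls["n"] < 3:
            return [{"device_id": device_id, "status": "ACTIVE",
                     "binding:host_id": "node-1",
                     "device_owner": "network:router_interface_distributed"}]
        return [{"device_id": device_id, "status": "ACTIVE",
                 "binding:host_id": "node-1", "device_owner": owner}
                for owner in HA_OWNERS]

    return list_ports
```

In a test, the SSH check would run only after `wait_until(lambda: router_ports_ready(list_ports, router_id, HA_OWNERS))` returns True, so the test does not race against the agents still tearing down the old data plane.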
Changed in neutron: | |
assignee: | nobody → venkata anil (anil-venkata) |
tags: | added: l3-dvr-backlog l3-ha tempest |
Changed in neutron: | |
assignee: | venkata anil (anil-venkata) → Brian Haley (brian-haley) |
Changed in neutron: | |
assignee: | Brian Haley (brian-haley) → venkata anil (anil-venkata) |
tags: | added: neutron-proactive-backport-potential |
tags: | removed: neutron-proactive-backport-potential |
Fix proposed to branch: master
Review: https://review.openstack.org/500384