[neutron][dvr] Distributed router fails to remove from one l3-agent and add to another

Bug #1593238 reported by Rodion Promyshlennikov
Affects             Status        Importance  Assigned to    Milestone
Mirantis OpenStack  Fix Released  High        Oleg Bondarev

Bug Description

A distributed router cannot be moved from one l3-agent to another during manual rescheduling.
After running "neutron l3-agent-router-remove l3_agent_id router", the router stays on l3_agent_id (for a window of a few milliseconds after the removal, "neutron l3-agent-list-hosting-router" can show that there are no hosting agents).

 Expected result:

    Router is removed from l3-agent

Actual result:

    Router stays on old l3-agent

Steps to reproduce (history from real environment):

 neutron net-create net_NS
 neutron subnet-create --name net_NS__subnet net_NS 10.2.0.0/24
 neutron router-create router_NS
 neutron router-gateway-set router_NS admin_floating_net
 neutron router-interface-add router_NS net_NS__subnet
 NET_ID_3=$(neutron net-list | grep net_NS | awk '{print$2}')
 nova boot vm_NS --flavor 1 --image TestVM --nic net-id=$NET_ID_3
 neutron router-list
 neutron l3-agent-list-hosting-router router_NS # agent from node-1
 neutron agent-list | grep l3
| 36ff9cbf-1cc1-40d4-89d1-c66c38a0a8b4 | L3 agent | node-2.test.domain.local | nova | :-) | True | neutron-l3-agent |
| 575709e0-043f-44af-a461-259c90ffa86f | L3 agent | node-5.test.domain.local | nova | :-) | True | neutron-l3-agent |
| 576094fe-fa31-462b-9979-ec2e06bc7489 | L3 agent | node-3.test.domain.local | nova | :-) | True | neutron-l3-agent |
| 9fc9f755-a5c8-4f5c-841b-d5f5eb6bd388 | L3 agent | node-4.test.domain.local | nova | :-) | True | neutron-l3-agent |
| d378ee31-8d24-4cf2-83ef-fef82a2ead04 | L3 agent | node-1.test.domain.local | nova | :-) | True | neutron-l3-agent |

 neutron l3-agent-router-remove d378ee31-8d24-4cf2-83ef-fef82a2ead04 router_NS
 neutron l3-agent-router-add 9fc9f755-a5c8-4f5c-841b-d5f5eb6bd388 router_NS
 neutron l3-agent-list-hosting-router router_NS # on node-4
 neutron l3-agent-router-remove 9fc9f755-a5c8-4f5c-841b-d5f5eb6bd388 router_NS
 neutron l3-agent-list-hosting-router router_NS # still node-4

 neutron l3-agent-router-add d378ee31-8d24-4cf2-83ef-fef82a2ead04 router_NS

  error message:
The router dc713065-a0b2-4dd1-8dc2-309a494bacc6 has been already hosted by the L3 Agent 9fc9f755-a5c8-4f5c-841b-d5f5eb6bd388

VERSION:
openstack_version: mitaka-9.0
release: '9.0'
build_number: 481

Snapshot:
https://drive.google.com/open?id=0B-QiiEr4w70UX0FVZjI1OVB1NWs

Kristina Berezovskaia (kkuznetsova) wrote :

Also reproduced on
cat /etc/fuel_build_id:
 492
cat /etc/fuel_build_number:
 492
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-release-9.0.0-1.mos6349.noarch
 fuel-misc-9.0.0-1.mos8460.noarch
 python-packetary-9.0.0-1.mos140.noarch
 fuel-bootstrap-cli-9.0.0-1.mos285.noarch
 fuel-migrate-9.0.0-1.mos8460.noarch
 rubygem-astute-9.0.0-1.mos750.noarch
 fuel-mirror-9.0.0-1.mos140.noarch
 shotgun-9.0.0-1.mos90.noarch
 fuel-openstack-metadata-9.0.0-1.mos8743.noarch
 fuel-notify-9.0.0-1.mos8460.noarch
 nailgun-mcagents-9.0.0-1.mos750.noarch
 python-fuelclient-9.0.0-1.mos325.noarch
 fuel-9.0.0-1.mos6349.noarch
 fuel-utils-9.0.0-1.mos8460.noarch
 fuel-setup-9.0.0-1.mos6349.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8743.noarch
 fuel-library9.0-9.0.0-1.mos8460.noarch
 network-checker-9.0.0-1.mos74.x86_64
 fuel-agent-9.0.0-1.mos285.noarch
 fuel-ui-9.0.0-1.mos2717.noarch
 fuel-ostf-9.0.0-1.mos936.noarch
 fuelmenu-9.0.0-1.mos274.noarch
 fuel-nailgun-9.0.0-1.mos8743.noarch

Changed in mos:
milestone: none → 9.0
assignee: nobody → MOS Neutron (mos-neutron)
importance: Undecided → High
tags: added: area-neutron
Bug Checker Bot (bug-checker) wrote : Autochecker

(This check was performed automatically.)
Please make sure that the bug description contains the following sections, filled in with the appropriate data related to the bug you are describing:

actual result

expected result

For more detailed information on the contents of each of the listed sections see https://wiki.openstack.org/wiki/Fuel/How_to_contribute#Here_is_how_you_file_a_bug

tags: added: need-info
description: updated
tags: removed: need-info
Changed in mos:
assignee: MOS Neutron (mos-neutron) → Oleg Bondarev (obondarev)
Oleg Bondarev (obondarev) wrote :

Regression from https://review.openstack.org/#/c/319397 :(
I will see whether it can be fixed easily or whether it is better to revert.

As a workaround, set router_auto_schedule=False in neutron.conf.
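
When applying the neutron.conf workaround, it can help to confirm which value the server will actually read. The following is a minimal sketch, assuming the option sits in the [DEFAULT] section of /etc/neutron/neutron.conf (check this against your deployment). get_router_auto_schedule is a hypothetical helper name and is not part of any Neutron tooling.

```shell
# Hypothetical helper: print the value of router_auto_schedule found in the
# [DEFAULT] section of a neutron.conf-style file, or "unset" if it is absent.
# The default path is an assumption; pass another path as the first argument.
get_router_auto_schedule() {
    conf="${1:-/etc/neutron/neutron.conf}"
    awk -F= '
        # Track whether the current section is [DEFAULT].
        /^\[/ { in_default = ($0 == "[DEFAULT]") }
        # Match only an uncommented "router_auto_schedule = value" line.
        in_default && $1 ~ /^[ \t]*router_auto_schedule[ \t]*$/ {
            gsub(/[ \t]/, "", $2)
            value = $2
        }
        END { if (value == "") print "unset"; else print value }
    ' "$conf"
}

# Example (not executed here):
#   get_router_auto_schedule /etc/neutron/neutron.conf
```

The helper only reads the file. A change to neutron.conf presumably takes effect only after neutron-server is restarted, which is why the workaround in the following comment is described as not needing a service restart.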

Changed in mos:
status: New → Confirmed
Oleg Bondarev (obondarev) wrote :

A cleaner workaround (without restarting services) is to set admin_state_up to False for the agent when you need to reschedule resources from it.

Not Critical, since workarounds exist.
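
The admin_state_up workaround can be sketched as the sequence below. This is an illustration under assumptions rather than a verified procedure: it relies on the "--admin-state-down" flag of "neutron agent-update" from python-neutronclient, and drain_and_move_router is a hypothetical helper name.

```shell
# Hypothetical helper: disable the old L3 agent, then move the router.
# Arguments: <old_agent_id> <new_agent_id> <router>
drain_and_move_router() {
    old_agent="$1"
    new_agent="$2"
    router="$3"

    # Set admin_state_up=False on the old agent so that, per the workaround,
    # the router is not scheduled back to it after removal.
    neutron agent-update --admin-state-down "$old_agent" || return 1

    # Same manual rescheduling commands as in the steps to reproduce.
    neutron l3-agent-router-remove "$old_agent" "$router" || return 1
    neutron l3-agent-router-add "$new_agent" "$router" || return 1

    # Show where the router ended up.
    neutron l3-agent-list-hosting-router "$router"
}

# Example with the agent IDs from the bug description (not executed here):
#   drain_and_move_router 9fc9f755-a5c8-4f5c-841b-d5f5eb6bd388 \
#       d378ee31-8d24-4cf2-83ef-fef82a2ead04 router_NS
```

The sketch leaves the old agent disabled; whether and when to re-enable it is an operational decision this example does not cover.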

Dina Belova (dbelova) wrote :

Moving to 9.0-updates due to previous comment

Changed in mos:
milestone: 9.0 → 9.0-updates
tags: added: release-notes
Dina Belova (dbelova) wrote :

Added release-notes tag to include workaround information in the release notes ^^

Oleg Bondarev (obondarev) wrote :

Fixed by sync with upstream https://review.fuel-infra.org/#/c/23625/

Changed in mos:
status: Confirmed → Fix Committed
tags: added: on-verification
Kristina Berezovskaia (kkuznetsova) wrote :

Verified on
CUSTOM_VERSION=snapshot #142
MAGNET_LINK=magnet:?xt=urn:btih:bfec808dd71ff42c5613a3527733d9012bb1fabc&dn=MirantisOpenStack-9.0.iso&tr=http%3A%2F%2Ftracker01-bud.infra.mirantis.net%3A8080%2Fannounce&tr=http%3A%2F%2Ftracker01-scc.infra.mirantis.net%3A8080%2Fannounce&tr=http%3A%2F%2Ftracker01-msk.infra.mirantis.net%3A8080%2Fannounce&ws=http%3A%2F%2Fvault.infra.mirantis.net%2FMirantisOpenStack-9.0.iso
FUEL_QA_COMMIT=16d1700e2307bc06cacf969060bd454826a7d4db
UBUNTU_MIRROR_ID=ubuntu-2016-08-03-174238
CENTOS_MIRROR_ID=centos-7.2.1511-2016-05-31-083834
MOS_UBUNTU_MIRROR_ID=9.0-2016-08-15-132321
MOS_CENTOS_OS_MIRROR_ID=os-2016-06-23-135731
MOS_CENTOS_PROPOSED_MIRROR_ID=proposed-2016-08-15-172323
MOS_CENTOS_UPDATES_MIRROR_ID=updates-2016-06-23-135916
MOS_CENTOS_HOLDBACK_MIRROR_ID=holdback-2016-06-23-140047
MOS_CENTOS_HOTFIX_MIRROR_ID=hotfix-2016-07-18-162958
MOS_CENTOS_SECURITY_MIRROR_ID=security-2016-06-23-140002
(vxlan+dvr)
Create net, subnet, router, boot vm
Manually reschedule router from one l3 agent to another
Check that router is on the new agent
Check that connectivity from vm to 8.8.8.8 is still available
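
The "check that router is on the new agent" step could be scripted roughly as follows. This is a minimal sketch, assuming the agent ID appears verbatim in the table printed by "neutron l3-agent-list-hosting-router"; router_hosted_on is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if <router> is currently hosted by <agent_id>.
# It searches the hosting-agent table for the agent ID as a fixed string.
router_hosted_on() {
    router="$1"
    agent_id="$2"
    neutron l3-agent-list-hosting-router "$router" | grep -q -F "$agent_id"
}

# Example (not executed here):
#   if router_hosted_on router_NS 9fc9f755-a5c8-4f5c-841b-d5f5eb6bd388; then
#       echo "router is on the new agent"
#   fi
```

Because the bug description mentions a short window right after removal in which no hosting agent is listed, it is worth running the check again a moment after rescheduling rather than relying on a single immediate result.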

Changed in mos:
status: Fix Committed → Fix Released
tags: removed: on-verification
tags: added: release-notes-done
removed: release-notes