Neutron L3 agents are not correctly started after re-scheduling agent for L3 router

Bug #1458633 reported by Dennis Dmitriev
This bug affects 2 people
Affects              Status         Importance   Assigned to        Milestone
Mirantis OpenStack   Fix Released   Low          Sergey Kolekonov
7.0.x                Fix Released   Low          Sergey Kolekonov

Bug Description

Neutron L3 agents do not match the IP namespaces created on the controllers.

        Scenario:
            1. Deploy cluster: HA, Neutron with GRE segmentation, 3 controllers, 2 computes, 1 cinder.
            2. Create an instance with a key pair
            3. Manually reschedule router from primary controller to another one:
                https://github.com/stackforge/fuel-qa/blob/master/fuelweb_test/tests/tests_strength/test_neutron.py#L54
            4. Check network connectivity from instance via dhcp namespace

Expected result:
    - The instance is reachable by ping from the dhcp namespace, just as before the router was migrated: http://paste.openstack.org/show/236086/

Actual result: There is no dhcp namespace on the primary controller after the router has been migrated: http://paste.openstack.org/show/236005/ , and there is an excess qrouter namespace.

Details:
    - Pacemaker status: http://paste.openstack.org/show/236007/
    - Neutron router-list: http://paste.openstack.org/show/236021/ (only node-3 there, but namespaces are on node-1 and node-3)
    - Neutron dhcp list http://paste.openstack.org/show/236022/ (no running agent for node-1)
    - Neutron agents list: http://paste.openstack.org/show/236107/

Revision history for this message
Dennis Dmitriev (ddmitriev) wrote :
Changed in fuel:
assignee: nobody → MOS Neutron (mos-neutron)
Revision history for this message
Sergey Kolekonov (skolekonov) wrote :

Could you please provide a more specific description of the problem you observed?

I've checked neutron-dhcp-agent.log from node-1, and it looks like the agent on node-1 never served net04, so there is no dhcp namespace for it on node-1.

Speaking about the excess qrouter namespace on node-1, it's expected behavior if a router has been removed from the agent on that node. Please take a look here: https://github.com/openstack/neutron/blob/stable/juno/etc/l3_agent.ini#L75
The L3 agent doesn't remove the namespace after router removal, but it does remove the network interfaces from it, so connectivity is not affected.
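
For reference, the config knob behind this behavior appears to be router_delete_namespaces (the line the link above points at); a sketch of the relevant fragment as shipped in stable/juno, with the exact comment wording paraphrased:

```ini
# Controls whether the L3 agent deletes the qrouter namespace when a router
# is removed from the agent. Defaults to False; set to True only if
# namespaces can be deleted cleanly on the host running the L3 agent.
# router_delete_namespaces = False
```

With the default of False, a rescheduled router leaves an empty qrouter namespace behind on the old node, which matches what was observed here.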

Revision history for this message
Ilya Shakhat (shakhat) wrote :

Reassigning to the reporter per Sergey's comment

Changed in fuel:
assignee: MOS Neutron (mos-neutron) → Dennis Dmitriev (ddmitriev)
status: New → Incomplete
tags: added: ha neutron
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Reproduced on my environment, but at the same time it looks like this issue doesn't affect end users.

Steps To Reproduce:
neutron router-list
neutron l3-agent-list-hosting-router 648b86f6-284d-4137-ab83-e0448ee78d8b
neutron l3-agent-router-remove f9c652e2-5b8d-481e-a1ce-e17cd18b94e6 648b86f6-284d-4137-ab83-e0448ee78d8b
neutron agent-list
neutron l3-agent-router-add 5e175bcb-4203-40d2-a0c4-780393b8f8c3 648b86f6-284d-4137-ab83-e0448ee78d8b

Observed Result:
We can see that the namespace qrouter-648b86f6-284d-4137-ab83-e0448ee78d8b exists on two controllers, which does not look like correct behaviour. Still, I can log in to the VM from a controller by its floating IP and can ping an external network (mail.ru) from this VM even with two qrouter namespaces for the router.
The only real problem is that we can't ping the VM by its internal IP through the DHCP namespace. I'm not sure this is critical (it only affects cloud operators/admins who access VMs via the DHCP namespace), but if it is possible to fix this in MOS 6.1, let's do it just to have no bugs in Neutron in MOS 6.1 ;)

[root@node-1 ~]# ip netns
haproxy
qdhcp-6005f8d0-8fa7-4c81-9824-7f9b01812d1e
vrouter
qrouter-648b86f6-284d-4137-ab83-e0448ee78d8b

[root@node-2 ~]# ip netns
qrouter-0c96373f-8e2d-4a2d-8daa-16f906722f3e
haproxy
qdhcp-6005f8d0-8fa7-4c81-9824-7f9b01812d1e
qdhcp-54dd1b12-e143-4dc6-acd5-9ff1c0ddf7e9
vrouter

[root@node-3 ~]# ip netns
haproxy
qdhcp-54dd1b12-e143-4dc6-acd5-9ff1c0ddf7e9
vrouter
qrouter-648b86f6-284d-4137-ab83-e0448ee78d8b
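
The duplicated-namespace observation above can be checked mechanically. A minimal sketch that counts on how many nodes a given qrouter namespace appears, using the `ip netns` listings from this comment as canned sample data (on a live cluster you would collect the listings from each controller instead):

```shell
#!/bin/sh
# Namespace listings captured from the three controllers (sample data
# copied from this comment).
node1='haproxy
qdhcp-6005f8d0-8fa7-4c81-9824-7f9b01812d1e
vrouter
qrouter-648b86f6-284d-4137-ab83-e0448ee78d8b'
node2='qrouter-0c96373f-8e2d-4a2d-8daa-16f906722f3e
haproxy
qdhcp-6005f8d0-8fa7-4c81-9824-7f9b01812d1e
qdhcp-54dd1b12-e143-4dc6-acd5-9ff1c0ddf7e9
vrouter'
node3='haproxy
qdhcp-54dd1b12-e143-4dc6-acd5-9ff1c0ddf7e9
vrouter
qrouter-648b86f6-284d-4137-ab83-e0448ee78d8b'

ns='qrouter-648b86f6-284d-4137-ab83-e0448ee78d8b'
count=0
for listing in "$node1" "$node2" "$node3"; do
    # -x matches the whole line, so a prefix of another namespace won't count
    if printf '%s\n' "$listing" | grep -qx "$ns"; then
        count=$((count + 1))
    fi
done
echo "$ns found on $count node(s)"
# A router namespace present on more than one node is a leftover from
# before the reschedule.
```

Run against the listings above it reports the namespace on 2 nodes (node-1 and node-3), confirming the leftover.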

Changed in fuel:
status: Incomplete → Confirmed
importance: Undecided → High
assignee: Dennis Dmitriev (ddmitriev) → MOS Neutron (mos-neutron)
affects: fuel → mos
Changed in mos:
milestone: 6.1 → none
milestone: none → 6.1
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

So, this issue doesn't affect any functionality; we can just see a qrouter namespace which has no interfaces:

[root@node-1 ~]# ip netns exec qrouter-648b86f6-284d-4137-ab83-e0448ee78d8b ip a
36: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever

Looks like this is an issue with Nice To Have priority, not a blocker for MOS 6.1.

Changed in mos:
importance: High → Low
status: Confirmed → Won't Fix
tags: added: release-note
tags: added: release-notes
removed: release-note
Revision history for this message
Sergey Kolekonov (skolekonov) wrote :
Anna Babich (ababich)
tags: added: on-verification
Revision history for this message
Anna Babich (ababich) wrote :

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "7.0"
  openstack_version: "2015.1.0-7.0"
  api: "1.0"
  build_number: "187"
  build_id: "2015-08-18_03-05-20"
  nailgun_sha: "4710801a2f4a6d61d652f8f1e64215d9dde37d2e"
  python-fuelclient_sha: "4c74a60aa60c06c136d9197c7d09fa4f8c8e2863"
  fuel-agent_sha: "57145b1d8804389304cd04322ba0fb3dc9d30327"
  fuel-nailgun-agent_sha: "e01693992d7a0304d926b922b43f3b747c35964c"
  astute_sha: "e24ca066bf6160bc1e419aaa5d486cad1aaa937d"
  fuel-library_sha: "0062e69db17f8a63f85996039bdefa87aea498e1"
  fuel-ostf_sha: "17786b86b78e5b66d2b1c15500186648df10c63d"
  fuelmain_sha: "c9dad194e82a60bf33060eae635fff867116a9ce"

Verified on cluster: Neutron with VxLAN+L2pop, 3 controllers, 2 computes

Verification scenario

1. Create network net01, create router01, attach net01 to the router01

2. Launch an instance in the net01

3. Find a controller, where router01 is hosted:
root@node-1:~# router_id=$(neutron router-show router01 | grep ' id ' | awk '{print $4}')
root@node-1:~# neutron l3-agent-list-hosting-router $router_id
+--------------------------------------+-------------------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+--------------------------------------+-------------------+----------------+-------+----------+
| bc3a3068-f7dd-41a3-88c4-821f9cedce7c | node-2.domain.tld | True | :-) | |
+--------------------------------------+-------------------+----------------+-------+----------+

4. Reschedule router01 to another controller:
root@node-1:~# neutron l3-agent-router-remove bc3a3068-f7dd-41a3-88c4-821f9cedce7c $router_id
Removed router 15f54a37-0ae6-4851-b158-8d71380a6265 from L3 agent
root@node-1:~# neutron agent-list | grep l3- | grep node-3
| 42b9e870-a983-40ed-b520-226a81a0b17b | L3 agent | node-3.domain.tld | :-) | True | neutron-l3-agent |
root@node-1:~# neutron l3-agent-router-add 42b9e870-a983-40ed-b520-226a81a0b17b $router_id
Added router 15f54a37-0ae6-4851-b158-8d71380a6265 to L3 agent

5. Check that namespace for router01 appeared on a controller where it was rescheduled to:
root@node-3:~# . openrc
root@node-3:~# router_id=$(neutron router-show router01 | grep ' id ' | awk '{print $4}')
root@node-3:~# ip netns list | grep qrouter-$router_id
qrouter-15f54a37-0ae6-4851-b158-8d71380a6265

6. Check that namespace for router01 disappeared on a controller where it was rescheduled from:
root@node-2:~# ip netns list
qdhcp-556a5bc6-de58-4c61-b4cd-23fb71bccc55
haproxy
vrouter
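
The id extraction used in steps 3 and 5 above depends on the table layout of `neutron router-show`. A self-contained sketch of the same grep/awk pipeline against canned output (the table below is a trimmed, hypothetical sample reusing the router UUID from this scenario):

```shell
#!/bin/sh
# Canned `neutron router-show router01` output (trimmed, hypothetical rows).
output='+-----------------------+--------------------------------------+
| Field                 | Value                                |
+-----------------------+--------------------------------------+
| admin_state_up        | True                                 |
| id                    | 15f54a37-0ae6-4851-b158-8d71380a6265 |
| name                  | router01                             |
+-----------------------+--------------------------------------+'

# Same pipeline as in the verification scenario: " id " (with surrounding
# spaces) matches only the id row, not fields like tenant_id; the UUID is
# the 4th whitespace-separated field ("|", "id", "|", "<uuid>").
router_id=$(printf '%s\n' "$output" | grep ' id ' | awk '{print $4}')
echo "$router_id"
```

This prints 15f54a37-0ae6-4851-b158-8d71380a6265. Note the pipeline is layout-sensitive; on newer clients, `neutron router-show -f value -c id router01` would be a more robust way to get the bare id.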

Anna Babich (ababich)
tags: removed: on-verification
tags: added: release-notes-done-7.0
removed: release-notes
tags: added: release-notes-done rn7.0
removed: release-notes-done-7.0