HA router changes state Master/Backup from time to time.

Bug #1572469 reported by sklgromek
This bug affects 1 person
Affects: neutron
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

When a default route is set through the router's routes attribute (not via: neutron router-gateway-set),
the router changes state from master to backup from time to time.
In the keepalived config file, the route entry in the virtual_routes section is duplicated
every time the router state changes.

My environment:
Ubuntu 14.04 LTS
Openstack Kilo from ubuntu-cloud-archive
Keepalived v1.2.7 (08/14,2013)
l3-agent: neutron-vpn-agent 2015.1.3

Steps to reproduce:
Set a default route:
neutron router-update my-router --routes type=dict list=true destination=0.0.0.0/0,nexthop=192.168.0.10
Wait for a while...
The router will periodically change state, and the route entry in the keepalived config file will be duplicated.
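
To check whether the route entry is really being duplicated, the keepalived config can be scanned for repeated lines inside each virtual_routes block. The following is a minimal sketch, not part of neutron or keepalived: the function name duplicated_virtual_routes is hypothetical, and it assumes virtual_routes blocks contain no nested braces.

```python
import re
from collections import Counter


def duplicated_virtual_routes(conf_text):
    """Return {route_entry: count} for entries repeated within one virtual_routes block.

    Hypothetical helper for illustration; assumes no nested braces inside
    a virtual_routes block.
    """
    duplicates = {}
    # Each match is the body of one "virtual_routes { ... }" block.
    for block in re.findall(r"virtual_routes\s*\{([^}]*)\}", conf_text):
        counts = Counter()
        for line in block.splitlines():
            entry = " ".join(line.split())  # normalise whitespace
            # Skip blank lines and keepalived comments ("#" or "!").
            if entry and not entry.startswith(("#", "!")):
                counts[entry] += 1
        for entry, n in counts.items():
            if n > 1:
                # Report the highest repeat count seen in any single block.
                duplicates[entry] = max(n, duplicates.get(entry, 0))
    return duplicates


if __name__ == "__main__":
    sample = """
    vrrp_instance VR_1 {
        virtual_routes {
            0.0.0.0/0 via 192.168.0.10
            0.0.0.0/0 via 192.168.0.10
        }
    }
    """
    print(duplicated_virtual_routes(sample))
```

Reading the router's keepalived.conf into a string and passing it to this function after each state change would show whether the count grows over time.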

Tags: l3-ha
Adriano (dritec)
Changed in neutron:
assignee: nobody → Adriano (dritec)
Doug Wiegley (dougwig) wrote :

Assaf, can I get a triage assist on this one?

Changed in neutron:
assignee: Adriano (dritec) → Assaf Muller (amuller)
Doug Wiegley (dougwig) wrote :

Taking a while to find a triager (summit week), so I'm marking this confirmed in the meantime.

Changed in neutron:
status: New → Confirmed
Armando Migliaccio (armando-migliaccio) wrote :

I am not sure I understand what "periodically" means: on a schedule, or intermittently and at random? Besides, the L3 HA code has changed substantially since Kilo, and Kilo now accepts security fixes only, so I wonder if we can get this reproduced on newer platforms.

tags: added: l3-ha
Darragh O'Reilly (darragh-oreilly) wrote :

I have a Kilo installation and I tried that. After running the command below, I see 2 default routes:

neutron router-update ds-cvr-ha-rtr \
    --routes type=dict list=true \
    destination=0.0.0.0/0,nexthop=172.16.35.100

Now the keepalived confs have:

    virtual_routes {
        0.0.0.0/0 via 172.16.35.1 dev qg-dc1fc66a-9d
        0.0.0.0/0 via 172.16.35.100
    }

# ip netns exec qrouter-5aa06109-990f-415d-91f8-90ea7d90de7f ip r show default
default via 172.16.35.100 dev qg-dc1fc66a-9d
default via 172.16.35.1 dev qg-dc1fc66a-9d
10.3.0.0/24 dev qr-6cbb9afd-bb proto kernel scope link src 10.3.0.1
169.254.0.0/24 dev ha-399711bf-bd proto kernel scope link src 169.254.0.1
169.254.192.0/18 dev ha-399711bf-bd proto kernel scope link src 169.254.192.1
172.16.35.0/24 dev qg-dc1fc66a-9d proto kernel scope link src 172.16.35.32

I left it for a while and the master stayed on the same node, so I did not see it causing periodic moves. I induced a move by stopping the agent on the master node, but did not see any further duplication of routes in the keepalived config.

The code is using 0.0.0.0/0 to determine the current default route:
https://github.com/openstack/neutron/blob/2015.1.3/neutron/agent/l3/ha_router.py#L184-L186

There is a definite problem, because now I can't get rid of this route without restarting the agents.
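
To illustrate why identifying the gateway route purely by its 0.0.0.0/0 destination could be a problem, here is a minimal sketch. It is not the neutron code at the link above: the function rebuild_virtual_routes and the (destination, nexthop) tuples are hypothetical, and it assumes an update step that keeps every route with a default destination and then re-adds the router's extra routes.

```python
# Destinations that the hypothetical update step treats as "the gateway route".
DEFAULT_DESTINATIONS = ("0.0.0.0/0", "::/0")


def rebuild_virtual_routes(current_routes, extra_routes):
    """Hypothetical update step, for illustration only.

    Routes are (destination, nexthop) tuples. Anything with a default
    destination is kept as if it were the gateway route, then the
    user-supplied extra routes are appended again.
    """
    kept = [r for r in current_routes if r[0] in DEFAULT_DESTINATIONS]
    return kept + list(extra_routes)


gateway = ("0.0.0.0/0", "172.16.35.1")    # from the external gateway
extra = ("0.0.0.0/0", "172.16.35.100")    # added via router-update --routes

routes = [gateway]
for _ in range(3):
    # Each update keeps the earlier copy of the extra route and adds another.
    routes = rebuild_virtual_routes(routes, [extra])
print(routes.count(extra))

# Clearing the extra routes does not remove it: the filter still keeps it.
routes_after_clear = rebuild_virtual_routes(routes, [])
print(extra in routes_after_clear)
```

Under these assumptions, a user-supplied default route is indistinguishable from the gateway route, so it is never filtered out and accumulates whenever the update step runs again. Whether that is what happens in the Kilo code would need to be confirmed against the source.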

What is the use case for adding the default route this way? What do you expect to happen?

Ann Taraday (akamyshnikova) wrote :

I wonder if this has the same cause as https://bugs.launchpad.net/neutron/+bug/1497272. Did you try a higher version of keepalived?

Changed in neutron:
status: Confirmed → Won't Fix
status: Won't Fix → Confirmed
Ihar Hrachyshka (ihar-hrachyshka) wrote :

We would need a fresh reproducer with a newer keepalived to make progress here.

Changed in neutron:
status: Confirmed → Incomplete
assignee: Assaf Muller (amuller) → nobody
Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired