Comment 3 for bug 1523479

Sergey Arkhipov (sarkhipov) wrote : Re: [Backport 1522436] No need to autoreschedule routers if l3 agent is back online

The problem was reproduced on MOS 8.0, build #496.

Steps to reproduce (on an environment with 3 controllers):

1. Create roughly 200-300 routers and set an external gateway on each of them:
   for i in {1..300}; do
       neutron router-create --distributed False --ha False rbug-$i && \
       neutron router-gateway-set rbug-$i my_ext_network
   done

2. Stop all L3 agents except one (say, the one on node-1):
   $ pcs resource ban p_neutron-l3-agent node-2.domain.tld
   $ pcs resource ban p_neutron-l3-agent node-3.domain.tld

3. Wait until all routers have been migrated to node-1. You can check this with `neutron router-list-on-l3-agent`.

4. Re-enable the other L3 agents:
   $ pcs resource clear p_neutron-l3-agent node-2.domain.tld
   $ pcs resource clear p_neutron-l3-agent node-3.domain.tld

5. Check that nothing has happened to the routers: they are all still hosted by the L3 agent on `node-1`.

6. Stop the L3 agent on node-1:
   $ pcs resource ban p_neutron-l3-agent node-1.domain.tld

7. Wait until routers start migrating to the "live" L3 agents, then IMMEDIATELY re-enable the L3 agent on node-1:
   $ pcs resource clear p_neutron-l3-agent node-1.domain.tld

8. Ensure that the remaining routers stay on node-1 and that the "drain" eventually stops.

In reality, all routers leave node-1.
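
To make the checks in steps 3, 5 and 8 easier to follow, the router count on an agent can be polled while the steps run. This is only a rough sketch: the helper names `count_routers_on_agent` and `watch_agent` are made up for illustration, the agent ID has to be taken from `neutron agent-list`, and the count relies on the `rbug-` name prefix used in step 1.

```shell
#!/bin/sh
# Hypothetical helpers, not part of neutron or MOS.

# Count the rbug-* routers (created in step 1) currently hosted by one L3 agent.
# $1: L3 agent ID, as shown by `neutron agent-list`.
count_routers_on_agent() {
    # Each router is one row in the table, so counting matching lines is enough.
    neutron router-list-on-l3-agent "$1" | grep -c 'rbug-'
}

# Print a UTC-timestamped router count for one agent at a fixed interval.
# $1: L3 agent ID; $2: interval in seconds (default 5); $3: number of samples (default 60).
# The body runs in a subshell so its variables do not leak into the caller.
watch_agent() (
    agent_id=$1
    interval=${2:-5}
    samples=${3:-60}
    i=0
    while [ "$i" -lt "$samples" ]; do
        printf '%s %s\n' "$(date -u +%H:%M:%S)" "$(count_routers_on_agent "$agent_id")"
        sleep "$interval"
        i=$((i + 1))
    done
)
```

For example, running `watch_agent <node-1-agent-id> 5 120` right before step 6 should show the count dropping once the agent is banned. If the drain stopped after step 7 as expected, the count would level off above zero; in the failing case it keeps falling to 0.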

Please check the attached logs. This scenario (from step 6, around 08 Feb 2016 14:15:28 UTC) can be observed here:
https://drive.google.com/a/mirantis.com/file/d/0B9tzODpFABxkN2R1bHhyZVloelk/view?usp=sharing