The problem was reproduced on MOS 8.0, build #496.
Steps to reproduce (with 3 controllers):
1. Create ~200-300 routers and set an external gateway on each of them:
for i in {1..300}; do
neutron router-create --distributed False --ha False rbug-$i && \
neutron router-gateway-set rbug-$i my_ext_network
done
2. Stop all L3 agents except one (say, the one on node-1):
$ pcs resource ban p_neutron-l3-agent node-2.domain.tld
$ pcs resource ban p_neutron-l3-agent node-3.domain.tld
3. Wait until all routers have migrated to node-1. You can check this with `neutron router-list-on-l3-agent`.
4. Re-enable all the other L3 agents:
$ pcs resource clear p_neutron-l3-agent node-2.domain.tld
$ pcs resource clear p_neutron-l3-agent node-3.domain.tld
5. Check that nothing has happened to the routers: they should still be hosted by the L3 agent on `node-1`.
6. Stop L3 agent on node-1:
$ pcs resource ban p_neutron-l3-agent node-1.domain.tld
7. Wait until the routers start migrating to the "live" L3 agents. After that, IMMEDIATELY re-enable the L3 agent on node-1:
$ pcs resource clear p_neutron-l3-agent node-1.domain.tld
8. Expected result: the remaining routers stay on node-1 and the "drain" eventually stops.
In reality, all routers leave node-1.
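To make the checks in steps 3, 5 and 8 easier to repeat, the per-agent router counts can be collected with a small helper. This is only a sketch, not part of the original reproduction: the function names are hypothetical, and it assumes the `neutron` client on this build accepts the generic `-f value` and `-c <column>` output options and that admin credentials are already loaded in the shell.

```shell
#!/usr/bin/env bash
# Sketch: report how many routers each L3 agent currently hosts.
# All function names here are illustrative, not part of any OpenStack tool.

# Keep only L3 agent rows from "<id> <agent_type> <host>" lines on stdin
# and print "<id> <host>". The agent type itself contains a space, so the
# id is taken as the first field and the host as the last one.
filter_l3_agents() {
    awk '/ L3 agent / { print $1, $NF }'
}

# Count non-empty lines on stdin (one router id per line).
count_routers() {
    grep -c . || true   # grep exits 1 on zero matches but still prints 0
}

# Print "<host> <router count>" for every L3 agent.
# Assumption: `-f value -c ...` is supported by the installed neutron client.
routers_per_agent() {
    neutron agent-list -f value -c id -c agent_type -c host | filter_l3_agents |
    while read -r agent_id host; do
        # </dev/null keeps the inner command from consuming the loop's input
        count=$(neutron router-list-on-l3-agent "$agent_id" -f value -c id </dev/null | count_routers)
        echo "$host $count"
    done
}
```

Calling `routers_per_agent` (for example under `watch`, after sourcing this file) during steps 6-8 should show whether the count for node-1 stops decreasing once its agent is re-enabled.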
Please check the attached logs. You can observe this use case (from step 6, ~08 Feb 2016 14:15:28 UTC) here: https://drive.google.com/a/mirantis.com/file/d/0B9tzODpFABxkN2R1bHhyZVloelk/view?usp=sharing