Using one example, I tried to understand why this failover sometimes happens.
My analysis is based on the test result http://logs.openstack.org/09/608909/20/check/neutron-fullstack/c7b6401/logs/testr_results.html.gz
Two "hosts": host-3f3dad1b and host-6d630618
Router id: 3d3c2c83-234a-4b63-bd6b-c450da34a7d2
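The state transitions below were pulled from the per-agent l3-agent log files linked from the results page. As a minimal sketch, assuming the log files have been downloaded locally and contain lines with "transitioned to <state>" (the file glob and regex are assumptions, not the exact fullstack tooling), they can be extracted like this:

    import re
    from pathlib import Path

    # Scan local l3-agent logs for HA state transitions of one router.
    # The glob and the log wording are assumptions; adjust to the real files.
    ROUTER_ID = "3d3c2c83-234a-4b63-bd6b-c450da34a7d2"
    PATTERN = re.compile(
        r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+).*transitioned to (?P<state>\w+)"
    )

    for log in sorted(Path(".").glob("neutron-l3-agent-*.txt")):
        for line in log.read_text(errors="ignore").splitlines():
            match = PATTERN.search(line)
            if match and ROUTER_ID in line:
                print(f"{log.name}  {match.group('ts')}  -> {match.group('state')}")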
When the router was first created:
* host-3f3dad1b was backup; the router transitioned to backup at 03:37:35.482 http://logs.openstack.org/09/608909/20/check/neutron-fullstack/c7b6401/logs/dsvm-fullstack-logs/TestHAL3Agent.test_ha_router_restart_agents_no_packet_lost/neutron-l3-agent--2018-11-30--03-37-05-357461.txt.gz#_2018-11-30_03_37_35_482
* host-6d630618 was active; the router transitioned first to backup at 03:37:32.245 http://logs.openstack.org/09/608909/20/check/neutron-fullstack/c7b6401/logs/dsvm-fullstack-logs/TestHAL3Agent.test_ha_router_restart_agents_no_packet_lost/neutron-l3-agent--2018-11-30--03-37-05-693500.txt.gz#_2018-11-30_03_37_32_245 and later to master at 03:37:47.489 http://logs.openstack.org/09/608909/20/check/neutron-fullstack/c7b6401/logs/dsvm-fullstack-logs/TestHAL3Agent.test_ha_router_restart_agents_no_packet_lost/neutron-l3-agent--2018-11-30--03-37-05-693500.txt.gz#_2018-11-30_03_37_47_489
Restarts of agents:
* The first restart, of the backup agent (host-3f3dad1b), happens at 03:37:50.546 http://logs.openstack.org/09/608909/20/check/neutron-fullstack/c7b6401/logs/dsvm-fullstack-logs/TestHAL3Agent.test_ha_router_restart_agents_no_packet_lost.txt.gz#_2018-11-30_03_37_50_546 Pinging the gateway IP address from the external VM for 1 minute works fine.
* A new agent process is started on this host and the router transitions to backup at 03:37:59.021: http://logs.openstack.org/09/608909/20/check/neutron-fullstack/c7b6401/logs/dsvm-fullstack-logs/TestHAL3Agent.test_ha_router_restart_agents_no_packet_lost/neutron-l3-agent--2018-11-30--03-37-50-590709.txt.gz#_2018-11-30_03_37_59_388
* Then the restart of the master agent happens at 03:38:50.909 http://logs.openstack.org/09/608909/20/check/neutron-fullstack/c7b6401/logs/dsvm-fullstack-logs/TestHAL3Agent.test_ha_router_restart_agents_no_packet_lost.txt.gz#_2018-11-30_03_38_50_909 The router is then transitioned to active on host-3f3dad1b at 03:39:02.322 http://logs.openstack.org/09/608909/20/check/neutron-fullstack/c7b6401/logs/dsvm-fullstack-logs/TestHAL3Agent.test_ha_router_restart_agents_no_packet_lost/neutron-l3-agent--2018-11-30--03-37-50-590709.txt.gz#_2018-11-30_03_39_02_322 and transitioned to backup at 03:39:03.522: http://logs.openstack.org/09/608909/20/check/neutron-fullstack/c7b6401/logs/dsvm-fullstack-logs/TestHAL3Agent.test_ha_router_restart_agents_no_packet_lost/neutron-l3-agent--2018-11-30--03-38-50-946978.txt.gz#_2018-11-30_03_39_03_522 On this host it is also transitioned to backup once again at 03:39:19.314 http://logs.openstack.org/09/608909/20/check/neutron-fullstack/c7b6401/logs/dsvm-fullstack-logs/TestHAL3Agent.test_ha_router_restart_agents_no_packet_lost/neutron-l3-agent--2018-11-30--03-38-50-946978.txt.gz#_2018-11-30_03_39_19_314 Finally, it is transitioned back to active on host-6d630618 at 03:39:36.339 http://logs.openstack.org/09/608909/20/check/neutron-fullstack/c7b6401/logs/dsvm-fullstack-logs/TestHAL3Agent.test_ha_router_restart_agents_no_packet_lost/neutron-l3-agent--2018-11-30--03-38-50-946978.txt.gz#_2018-11-30_03_39_36_339
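For context on the timing above: in VRRPv2 a backup only claims master after the master-down interval (three advertisement intervals plus a priority-based skew) passes without receiving any advertisement, so the gaps between these transitions roughly correspond to a few missed adverts. A small sketch of that arithmetic; the priority value here is an assumption, and Neutron's default ha_vrrp_advert_int is 2 seconds:

    # Rough VRRPv2 master-down timer arithmetic (RFC 3768), to relate the
    # gaps in the timeline above to missed advertisements.
    advert_int = 2.0   # seconds; Neutron's default ha_vrrp_advert_int
    priority = 50      # backup priority; an assumed value, not from the logs

    skew_time = (256 - priority) / 256.0
    master_down_interval = 3 * advert_int + skew_time

    print(f"a backup waits ~{master_down_interval:.2f}s of VRRP silence "
          "before transitioning to master")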
I still don't know why some VRRP packets are (probably) lost and keepalived switches the VIP. The related keepalived logs can be found in http://logs.openstack.org/09/608909/20/check/neutron-fullstack/c7b6401/logs/journal.log around 03:39:00.
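One way to confirm whether the advertisements really stop would be to capture VRRP traffic inside the router's namespace while the agents are restarted; a hedged sketch, assuming the usual qrouter-<router_id> namespace name (fullstack namespaces may differ) and that tcpdump is available on the host:

    import subprocess

    # Capture VRRP advertisements (IP protocol 112) inside the router
    # namespace to see whether they stop around the failover time.
    # The namespace name is an assumption; fullstack tests may use another.
    router_id = "3d3c2c83-234a-4b63-bd6b-c450da34a7d2"
    namespace = f"qrouter-{router_id}"

    subprocess.run(
        ["ip", "netns", "exec", namespace,
         "tcpdump", "-lni", "any", "vrrp"],
        check=True,
    )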