Keepalived's VRRP child process is constantly dying and respawning on the controller

Bug #1558490 reported by Attila Darazs on 2016-03-17

This bug affects 2 people

Affects    Status     Importance    Assigned to    Milestone
tripleo    Invalid    Undecided     Unassigned

Bug Description

The new IPv6 gate jobs time out during the overcloud deployment.

In /var/log/messages on the controller, the following sequence is repeated multiple times every second:

Mar 17 08:52:53 localhost Keepalived[6947]: VRRP child process(8100) died: Respawning
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Netlink reflector reports IP fe80::278:f2ff:fe20:52be added
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Netlink reflector reports IP fe80::278:f2ff:fe20:52c0 added
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Netlink reflector reports IP fe80::278:f2ff:fe20:52ba added
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Netlink reflector reports IP 2001:db8:fd00:1000::11 added
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Netlink reflector reports IP fe80::278:f2ff:fe20:52bc added
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Registering Kernel netlink reflector
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Registering Kernel netlink command channel
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Registering gratuitous ARP shared channel
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Opening file '/etc/keepalived/keepalived.conf'.
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Cant find interface for vrrp_instance 53 !!!
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Configuration error: VRRP definition must belong to an interface
Mar 17 08:52:53 localhost Keepalived[6947]: VRRP child process(8101) died: Respawning

[the same sequence repeats over and over with different PIDs]
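
To gauge how fast the child is cycling and which message precedes each death, the log can be summarized with a short script. The following is only a rough sketch, not part of the gate tooling: the function name `summarize`, the `SAMPLE` constant, and the regular expressions are my own assumptions, written against the syslog line format seen in the excerpt above.

```python
import re
from collections import Counter

# Classic syslog layout as seen above: "Mon DD HH:MM:SS host process[pid]: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<proc>[\w-]+)\[(?P<pid>\d+)\]:\s(?P<msg>.*)$"
)
RESPAWN_RE = re.compile(r"VRRP child process\((\d+)\) died: Respawning")

# A few lines copied from the excerpt in this report, used as sample input.
SAMPLE = """\
Mar 17 08:52:53 localhost Keepalived[6947]: VRRP child process(8100) died: Respawning
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Cant find interface for vrrp_instance 53 !!!
Mar 17 08:52:53 localhost Keepalived_vrrp[8101]: Configuration error: VRRP definition must belong to an interface
Mar 17 08:52:53 localhost Keepalived[6947]: VRRP child process(8101) died: Respawning
"""


def summarize(lines):
    """Return (respawns per second, PIDs that died, error message counts)."""
    respawns = Counter()  # timestamp (1 s resolution) -> number of respawns
    dead_pids = []        # PIDs of VRRP children reported dead by the parent
    errors = Counter()    # error text logged by the VRRP child -> count
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that are not in the expected syslog layout
        msg = m.group("msg")
        died = RESPAWN_RE.search(msg)
        if m.group("proc") == "Keepalived" and died:
            respawns[m.group("ts")] += 1
            dead_pids.append(int(died.group(1)))
        elif m.group("proc") == "Keepalived_vrrp" and (
            msg.startswith("Configuration error") or "find interface" in msg
        ):
            errors[msg] += 1
    return respawns, dead_pids, errors


if __name__ == "__main__":
    respawns, dead_pids, errors = summarize(SAMPLE.splitlines())
    for ts, count in respawns.items():
        print(f"{ts}: {count} respawn(s)")
    for msg, count in errors.items():
        print(f"{count}x {msg}")
```

To run it against the real file, pass an open handle on /var/log/messages to `summarize` in place of `SAMPLE.splitlines()`. In the excerpt, every respawn is preceded by the same two messages about vrrp_instance 53 not having an interface, which suggests the child exits on a configuration error rather than crashing.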

These were the deployment arguments:

OVERCLOUD_DEPLOY_ARGS='--libvirt-type=qemu -t 80 -e /tmp/tripleo-ci/test-environments/swap-partition.yaml --ntp-server 0.centos.pool.ntp.org -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation-v6.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-multiple-nics-v6.yaml -e /tmp/tripleo-ci/test-environments/net-iso.yaml'

Full logs available here: http://logs.openstack.org/45/289445/15/check-tripleo/gate-tripleo-ci-f22-nonha/390d4a3/

It happened during a gate job for: https://review.openstack.org/#/c/289445/15