liberty neutron-l3-agent ha fails to spawn keepalived

Bug #1583977 reported by Tobias Urdin
This bug affects 1 person
Affects                Status    Importance   Assigned to   Milestone
Ubuntu Cloud Archive   Invalid   Undecided    Unassigned
neutron                Invalid   Undecided    Unassigned

Bug Description

After upgrading to 7.0.4 I have several routers that fail to spawn the keepalived process.

The logs say
2016-05-20 11:01:11.181 23023 ERROR neutron.agent.linux.external_process [-] default-service for router with uuid c1cc1a5d-c0ef-47b7-8d5c-88403e134725 not found. The process should not have died
2016-05-20 11:01:11.181 23023 ERROR neutron.agent.linux.external_process [-] respawning keepalived for uuid c1cc1a5d-c0ef-47b7-8d5c-88403e134725
2016-05-20 11:01:11.182 23023 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-c1cc1a5d-c0ef-47b7-8d5c-88403e134725', 'keepalived', '-P', '-f', '/var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725/keepalived.conf', '-p', '/var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.pid', '-r', '/var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.pid-vrrp'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:85

All these spawns fail and keepalived outputs to syslog
May 20 11:01:11 neutron1 Keepalived[46558]: Starting Keepalived v1.2.19 (09/04,2015)
May 20 11:01:11 neutron1 Keepalived[46558]: daemon is already running

However, the keepalived daemon is not actually running. The only related process running is neutron-keepalived-state-change:

root@neutron1:~# ps auxf | grep c1cc1a5d
root 48137 0.0 0.0 11740 936 pts/4 S+ 11:03 0:00 | \_ grep --color=auto c1cc1a5d
neutron 21671 0.0 0.0 124924 40172 ? S May19 0:00 /usr/bin/python /usr/bin/neutron-keepalived-state-change --router_id=c1cc1a5d-c0ef-47b7-8d5c-88403e134725 --namespace=qrouter-c1cc1a5d-c0ef-47b7-8d5c-88403e134725 --conf_dir=/var/lib/neutron/ha_confs/c1cc1a5-c0ef-47b7-8d5c-88403e134725 --monitor_interface=ha-ef4e2a2f-66 --monitor_cidr=169.254.0.1/24 --pid_file=/var/lib/neutron/external/pids/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.monitor.pid --state_path=/var/lib/neutron --user=107 --group=112

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: neutron-l3-agent 2:7.0.4-0ubuntu1~cloud0 [origin: Canonical]
ProcVersionSignature: Ubuntu 3.13.0-86.131-generic 3.13.11-ckt39
Uname: Linux 3.13.0-86-generic x86_64
NonfreeKernelModules: hcpdriver
ApportVersion: 2.14.1-0ubuntu3.20
Architecture: amd64
CrashDB:
 {
                "impl": "launchpad",
                "project": "cloud-archive",
                "bug_pattern_url": "http://people.canonical.com/~ubuntu-archive/bugpatterns/bugpatterns.xml",
             }
Date: Fri May 20 11:00:01 2016
PackageArchitecture: all
SourcePackage: neutron
UpgradeStatus: No upgrade log present (probably fresh install)

Tobias Urdin (tobias-urdin) wrote :
tags: added: regression-update
Tobias Urdin (tobias-urdin) wrote :

Moving the pid files for the affected router solves the issue.
mv /var/lib/neutron/ha_confs/c1cc1a5d-c0ef-47b7-8d5c-88403e134725.* /root
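For anyone who has to do this for several routers, here is a minimal Python sketch of the same move. The function name `move_router_pid_files` and the backup directory are my own assumptions, not part of neutron. Like the `mv` glob above, it only touches entries named `<router_id>.*` and leaves the per-router config directory alone.

```python
import shutil
from pathlib import Path


def move_router_pid_files(router_id, ha_confs_dir, backup_dir):
    """Move files named <router_id>.* out of ha_confs_dir.

    Hypothetical helper, not part of neutron. Returns the new paths.
    """
    src = Path(ha_confs_dir)
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)

    moved = []
    # "<router_id>.*" matches e.g. <router_id>.pid and <router_id>.pid-vrrp,
    # but not the "<router_id>" directory that holds keepalived.conf.
    for entry in sorted(src.glob(f"{router_id}.*")):
        if entry.is_file():
            target = dst / entry.name
            shutil.move(str(entry), str(target))
            moved.append(target)
    return moved
```

Called as `move_router_pid_files("c1cc1a5d-c0ef-47b7-8d5c-88403e134725", "/var/lib/neutron/ha_confs", "/root")` it should be equivalent to the command above. Moving rather than deleting keeps the old files around for inspection.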

Found the fix thanks to frickler on IRC. It has been merged for liberty: https://review.openstack.org/#/c/299138/3

Changed in cloud-archive:
status: New → Invalid
Changed in neutron:
status: New → Invalid
Hong Hui Xiao (xiaohhui) wrote :

If there is no other keepalived process, it might be because there are some orphaned pid files, according to [1].

[1] https://github.com/acassen/keepalived/blob/03da0d2d0393808bbb2feac7abc07aaf8d647855/keepalived/core/pidfile.c#L89-L98
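To check whether a pid file is orphaned before moving it, something like the following Python sketch may help. It rests on my reading of the linked code, which is that keepalived considers the daemon running whenever a signal-0 probe of the pid recorded in the file succeeds, so a pid that has since been reused by an unrelated process would produce "daemon is already running". That reading is an assumption, and the function names and the five status strings are made up for illustration. It compares argv[0] rather than searching the whole command line, because neutron-keepalived-state-change also contains the string "keepalived".

```python
import os
from pathlib import Path


def argv0_of(pid):
    """Return the basename of argv[0] for pid via /proc, or "" if unknown."""
    try:
        raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    except OSError:
        return ""
    first = raw.split(b"\0", 1)[0].decode(errors="replace")
    return os.path.basename(first)


def pid_is_alive(pid):
    """Probe for existence with signal 0, which delivers no signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    except OverflowError:
        return False  # number too large to be a real pid
    return True


def classify_pid_file(path, get_argv0=argv0_of):
    """Return 'missing', 'unreadable', 'dead', 'reused' or 'keepalived'."""
    try:
        text = Path(path).read_text().strip()
    except FileNotFoundError:
        return "missing"
    except (OSError, UnicodeDecodeError):
        return "unreadable"
    try:
        pid = int(text)
    except ValueError:
        return "unreadable"
    if pid <= 0:
        return "unreadable"
    if not pid_is_alive(pid):
        return "dead"
    # A live pid that is not keepalived suggests the pid was reused
    # after the original daemon went away.
    if get_argv0(pid) == "keepalived":
        return "keepalived"
    return "reused"
```

A result of "reused" for `/var/lib/neutron/ha_confs/<router_id>.pid` or the matching `.pid-vrrp` file would be consistent with the symptom in this report. "dead" and "unreadable" files are stale too, though by my reading keepalived would clean those up on its own.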
