Activity log for bug #1644530

Date Who What changed Old value New value Message
2016-11-24 12:20:20 Yutani bug added bug
2016-11-24 12:21:28 Yutani description Because "PIDFile=" directive is missing in the systemd unit file, keepalived sometimes fails to kill all old processes. The old processes remain with old settings and cause unexpected behaviors. The detail of this bug is described in this ticket in upstream: https://github.com/acassen/keepalived/issues/443. The official systemd unit file is available since version 1.2.24 by this commit: https://github.com/acassen/keepalived/commit/635ab69afb44cd8573663e62f292c6bb84b44f15 This includes "PIDFile" directive correctly. We should go the same way. I am using Ubuntu 16.04.1, kernel 4.4.0-45-generic. Package: keepalived Version: 1.2.19-1 Because "PIDFile=" directive is missing in the systemd unit file, keepalived sometimes fails to kill all old processes. The old processes remain with old settings and cause unexpected behaviors. The detail of this bug is described in this ticket in upstream: https://github.com/acassen/keepalived/issues/443. The official systemd unit file is available since version 1.2.24 by this commit: https://github.com/acassen/keepalived/commit/635ab69afb44cd8573663e62f292c6bb84b44f15 This includes "PIDFile" directive correctly: PIDFile=/var/run/keepalived.pid We should go the same way. I am using Ubuntu 16.04.1, kernel 4.4.0-45-generic. Package: keepalived Version: 1.2.19-1
2016-11-24 12:40:32 Yutani attachment added add PIDFile= directive in systemd unit file https://bugs.launchpad.net/ubuntu/+source/keepalived/+bug/1644530/+attachment/4782407/+files/fix-systemd-unit.patch
2016-11-24 12:41:31 Launchpad Janitor keepalived (Ubuntu): status New Confirmed
2016-11-24 13:13:25 Yutani attachment removed add PIDFile= directive in systemd unit file https://bugs.launchpad.net/ubuntu/+source/keepalived/+bug/1644530/+attachment/4782407/+files/fix-systemd-unit.patch
2016-11-24 13:14:16 Yutani attachment added add PIDFile= directive in systemd unit file https://bugs.launchpad.net/ubuntu/+source/keepalived/+bug/1644530/+attachment/4782451/+files/fix-systemd-unit.patch
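Note on the attachment above: the patch itself is not reproduced in this log. As a minimal sketch of the same idea, assuming a systemd drop-in override is used instead of editing the packaged unit file, the helper below writes the `PIDFile=` line quoted from upstream in the description. The function name `write_pidfile_dropin`, the file name `pidfile.conf`, and the directory argument are hypothetical and not taken from the patch.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in that sets PIDFile= for keepalived.
# $1 is the drop-in directory; on a real system this would typically be
# /etc/systemd/system/keepalived.service.d (an assumption, not from the patch).
write_pidfile_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir" || return 1
    # PIDFile value as quoted from the upstream unit file in the bug description.
    printf '[Service]\nPIDFile=/var/run/keepalived.pid\n' > "$dropin_dir/pidfile.conf"
}
```

Run as root, this could be followed by `systemctl daemon-reload` and `systemctl restart keepalived.service` so that systemd picks up the override.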
2016-11-24 16:28:25 Ubuntu Foundations Team Bug Bot tags patch
2016-11-24 16:28:33 Ubuntu Foundations Team Bug Bot bug added subscriber Ubuntu Review Team
2016-11-28 10:03:56 Christian Ehrhardt  nominated for series Ubuntu Xenial
2016-11-28 10:03:56 Christian Ehrhardt  bug task added keepalived (Ubuntu Xenial)
2016-11-28 10:04:02 Christian Ehrhardt  keepalived (Ubuntu): status Confirmed Fix Released
2016-11-28 10:04:04 Christian Ehrhardt  keepalived (Ubuntu Xenial): status New Confirmed
2016-11-28 10:04:06 Christian Ehrhardt  keepalived (Ubuntu Xenial): importance Undecided Medium
2016-11-28 10:04:17 Christian Ehrhardt  bug added subscriber Ubuntu Server Team
2017-03-04 02:02:02 Yutani description Because "PIDFile=" directive is missing in the systemd unit file, keepalived sometimes fails to kill all old processes. The old processes remain with old settings and cause unexpected behaviors. The detail of this bug is described in this ticket in upstream: https://github.com/acassen/keepalived/issues/443. The official systemd unit file is available since version 1.2.24 by this commit: https://github.com/acassen/keepalived/commit/635ab69afb44cd8573663e62f292c6bb84b44f15 This includes "PIDFile" directive correctly: PIDFile=/var/run/keepalived.pid We should go the same way. I am using Ubuntu 16.04.1, kernel 4.4.0-45-generic. Package: keepalived Version: 1.2.19-1 Because "PIDFile=" directive is missing in the systemd unit file, keepalived sometimes fails to kill all old processes. The old processes remain with old settings and cause unexpected behaviors. The detail of this bug is described in this ticket in upstream: https://github.com/acassen/keepalived/issues/443. The official systemd unit file is available since version 1.2.24 by this commit: https://github.com/acassen/keepalived/commit/635ab69afb44cd8573663e62f292c6bb84b44f15 This includes "PIDFile" directive correctly: PIDFile=/var/run/keepalived.pid We should go the same way. I am using Ubuntu 16.04.1, kernel 4.4.0-45-generic. 
Package: keepalived Version: 1.2.19-1 ======================================================================= How to reproduce: I used two instances of Ubuntu 16.04.2 on DigitalOcean: Configurations -------------- MASTER server's /etc/keepalived/keepalived.conf: vrrp_script chk_nothing { script "/bin/true" interval 2 } vrrp_instance G1 { interface eth1 state BACKUP priority 100 virtual_router_id 123 unicast_src_ip <primal IP> unicast_peer { <secondal IP> } track_script { chk_nothing } } BACKUP server's /etc/keepalived/keepalived.conf: vrrp_script chk_nothing { script "/bin/true" interval 2 } vrrp_instance G1 { interface eth1 state MASTER priority 200 virtual_router_id 123 unicast_src_ip <secondal IP> unicast_peer { <primal IP> } track_script { chk_nothing } } Procedures ---------- 1) Start keepalived on both servers $ sudo systemctl start keepalived.service 2) Restart keepalived on either one $ sudo systemctl restart keepalived.service 3) Check status and PID $ systemctl status -n0 keepalived.service Result ------ The results are below. As you can easily notice, on every other restart systemd takes the wrong PID as Main PID; the process has actually exited but is considered active. 
root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived ● keepalived.service - Keepalive Daemon (LVS and VRRP) Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled) Active: active (running) since Sat 2017-03-04 01:37:12 UTC; 14min ago Process: 3402 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS) Main PID: 3403 (keepalived) Tasks: 3 Memory: 1.7M CPU: 1.900s CGroup: /system.slice/keepalived.service ├─3403 /usr/sbin/keepalived ├─3405 /usr/sbin/keepalived └─3406 /usr/sbin/keepalived root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived ● keepalived.service - Keepalive Daemon (LVS and VRRP) Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled) Active: active (running) since Sat 2017-03-04 01:51:45 UTC; 1s ago Process: 4782 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS) Main PID: 3403 (code=exited, status=0/SUCCESS) Tasks: 3 Memory: 1.7M CPU: 11ms CGroup: /system.slice/keepalived.service ├─4783 /usr/sbin/keepalived ├─4784 /usr/sbin/keepalived └─4785 /usr/sbin/keepalived root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived ● keepalived.service - Keepalive Daemon (LVS and VRRP) Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled) Active: active (running) since Sat 2017-03-04 01:51:49 UTC; 1s ago Process: 4796 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS) Main PID: 4783 (keepalived) Tasks: 3 Memory: 1.7M CPU: 6ms CGroup: /system.slice/keepalived.service ├─4783 /usr/sbin/keepalived ├─4784 /usr/sbin/keepalived └─4785 /usr/sbin/keepalived root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived ● keepalived.service - Keepalive Daemon (LVS and VRRP) Loaded: loaded (/lib/systemd/system/keepalived.service; 
enabled; vendor preset: enabled) Active: active (running) since Sat 2017-03-04 01:51:53 UTC; 2s ago Process: 4810 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS) Main PID: 4783 (code=exited, status=0/SUCCESS) Tasks: 3 Memory: 1.7M CPU: 17ms CGroup: /system.slice/keepalived.service ├─4811 /usr/sbin/keepalived ├─4812 /usr/sbin/keepalived └─4813 /usr/sbin/keepalived
2017-03-04 02:43:41 Yutani description Because "PIDFile=" directive is missing in the systemd unit file, keepalived sometimes fails to kill all old processes. The old processes remain with old settings and cause unexpected behaviors. The detail of this bug is described in this ticket in upstream: https://github.com/acassen/keepalived/issues/443. The official systemd unit file is available since version 1.2.24 by this commit: https://github.com/acassen/keepalived/commit/635ab69afb44cd8573663e62f292c6bb84b44f15 This includes "PIDFile" directive correctly: PIDFile=/var/run/keepalived.pid We should go the same way. I am using Ubuntu 16.04.1, kernel 4.4.0-45-generic. Package: keepalived Version: 1.2.19-1 ======================================================================= How to reproduce: I used two instances of Ubuntu 16.04.2 on DigitalOcean: Configurations -------------- MASTER server's /etc/keepalived/keepalived.conf: vrrp_script chk_nothing { script "/bin/true" interval 2 } vrrp_instance G1 { interface eth1 state BACKUP priority 100 virtual_router_id 123 unicast_src_ip <primal IP> unicast_peer { <secondal IP> } track_script { chk_nothing } } BACKUP server's /etc/keepalived/keepalived.conf: vrrp_script chk_nothing { script "/bin/true" interval 2 } vrrp_instance G1 { interface eth1 state MASTER priority 200 virtual_router_id 123 unicast_src_ip <secondal IP> unicast_peer { <primal IP> } track_script { chk_nothing } } Procedures ---------- 1) Start keepalived on both servers $ sudo systemctl start keepalived.service 2) Restart keepalived on either one $ sudo systemctl restart keepalived.service 3) Check status and PID $ systemctl status -n0 keepalived.service Result ------ The results are below. As you can easily notice, on every other restart systemd takes the wrong PID as Main PID; the process has actually exited but is considered active. 
root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived ● keepalived.service - Keepalive Daemon (LVS and VRRP) Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled) Active: active (running) since Sat 2017-03-04 01:37:12 UTC; 14min ago Process: 3402 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS) Main PID: 3403 (keepalived) Tasks: 3 Memory: 1.7M CPU: 1.900s CGroup: /system.slice/keepalived.service ├─3403 /usr/sbin/keepalived ├─3405 /usr/sbin/keepalived └─3406 /usr/sbin/keepalived root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived ● keepalived.service - Keepalive Daemon (LVS and VRRP) Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled) Active: active (running) since Sat 2017-03-04 01:51:45 UTC; 1s ago Process: 4782 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS) Main PID: 3403 (code=exited, status=0/SUCCESS) Tasks: 3 Memory: 1.7M CPU: 11ms CGroup: /system.slice/keepalived.service ├─4783 /usr/sbin/keepalived ├─4784 /usr/sbin/keepalived └─4785 /usr/sbin/keepalived root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived ● keepalived.service - Keepalive Daemon (LVS and VRRP) Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled) Active: active (running) since Sat 2017-03-04 01:51:49 UTC; 1s ago Process: 4796 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS) Main PID: 4783 (keepalived) Tasks: 3 Memory: 1.7M CPU: 6ms CGroup: /system.slice/keepalived.service ├─4783 /usr/sbin/keepalived ├─4784 /usr/sbin/keepalived └─4785 /usr/sbin/keepalived root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived ● keepalived.service - Keepalive Daemon (LVS and VRRP) Loaded: loaded (/lib/systemd/system/keepalived.service; 
enabled; vendor preset: enabled) Active: active (running) since Sat 2017-03-04 01:51:53 UTC; 2s ago Process: 4810 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS) Main PID: 4783 (code=exited, status=0/SUCCESS) Tasks: 3 Memory: 1.7M CPU: 17ms CGroup: /system.slice/keepalived.service ├─4811 /usr/sbin/keepalived ├─4812 /usr/sbin/keepalived └─4813 /usr/sbin/keepalived Because "PIDFile=" directive is missing in the systemd unit file, keepalived sometimes fails to kill all old processes. The old processes remain with old settings and cause unexpected behaviors. The detail of this bug is described in this ticket in upstream: https://github.com/acassen/keepalived/issues/443. The official systemd unit file is available since version 1.2.24 by this commit: https://github.com/acassen/keepalived/commit/635ab69afb44cd8573663e62f292c6bb84b44f15 This includes "PIDFile" directive correctly: PIDFile=/var/run/keepalived.pid We should go the same way. I am using Ubuntu 16.04.1, kernel 4.4.0-45-generic. 
Package: keepalived Version: 1.2.19-1 ======================================================================= How to reproduce: I used the two instances of Ubuntu 16.04.2 on DigitalOcean: Configurations -------------- MASTER server's /etc/keepalived/keepalived.conf:   vrrp_script chk_nothing {      script "/bin/true"      interval 2   }   vrrp_instance G1 {     interface eth1     state BACKUP     priority 100     virtual_router_id 123     unicast_src_ip <primal IP>     unicast_peer {       <secondal IP>     }     track_script {       chk_nothing     }   } BACKUP server's /etc/keepalived/keepalived.conf:   vrrp_script chk_nothing {      script "/bin/true"      interval 2   }   vrrp_instance G1 {     interface eth1     state MASTER     priority 200     virtual_router_id 123     unicast_src_ip <secondal IP>     unicast_peer {       <primal IP>     }     track_script {       chk_nothing     }   } Procedures ---------- 1) Start keepalived on both servers   $ sudo systemctl start keepalived.service 2) Restart keepalived on either one   $ sudo systemctl restart keepalived.service 3) Check status and PID   $ systemctl status -n0 keepalived.service Result ------ 0) Before restart Main PID is 3402 and the subprocesses' PIDs are 3403-3406. So far so good.   
root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:37:12 UTC; 14min ago     Process: 3402 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 3403 (keepalived)       Tasks: 3      Memory: 1.7M         CPU: 1.900s      CGroup: /system.slice/keepalived.service              ├─3403 /usr/sbin/keepalived              ├─3405 /usr/sbin/keepalived              └─3406 /usr/sbin/keepalived 1) First restart Now Main PID is 3403, which was one of the previous subprocesses and is actually exited. Something is wrong. Yet, the previous processes are all exited; we are not likely to see no weird behaviors here.   root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived   root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:51:45 UTC; 1s ago     Process: 4782 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 3403 (code=exited, status=0/SUCCESS)       Tasks: 3      Memory: 1.7M         CPU: 11ms      CGroup: /system.slice/keepalived.service              ├─4783 /usr/sbin/keepalived              ├─4784 /usr/sbin/keepalived              └─4785 /usr/sbin/keepalived 2) Second restart Now Main PID is 4783 and subprocesses' PIDs are 4783-4785. This is problematic as 4783 is the old process, which should have exited before new processes arose. Therefore, keepalived remains in old settings while users believe it uses the new setting.   
root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived   root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:51:49 UTC; 1s ago     Process: 4796 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 4783 (keepalived)       Tasks: 3      Memory: 1.7M         CPU: 6ms      CGroup: /system.slice/keepalived.service              ├─4783 /usr/sbin/keepalived              ├─4784 /usr/sbin/keepalived              └─4785 /usr/sbin/keepalived
2017-03-04 04:36:42 Yutani keepalived (Ubuntu): status Fix Released Confirmed
2017-03-04 10:30:22 Mitsuya Shibata bug added subscriber Mitsuya Shibata
2017-03-05 13:31:52 Fumihito YOSHIDA bug added subscriber Fumihito YOSHIDA
2017-03-07 06:42:50 Christian Ehrhardt  bug added subscriber ChristianEhrhardt
2017-03-08 08:10:33 Christian Ehrhardt  keepalived (Ubuntu): status Confirmed Fix Released
2017-03-08 09:18:52 Christian Ehrhardt  description Because "PIDFile=" directive is missing in the systemd unit file, keepalived sometimes fails to kill all old processes. The old processes remain with old settings and cause unexpected behaviors. The detail of this bug is described in this ticket in upstream: https://github.com/acassen/keepalived/issues/443. The official systemd unit file is available since version 1.2.24 by this commit: https://github.com/acassen/keepalived/commit/635ab69afb44cd8573663e62f292c6bb84b44f15 This includes "PIDFile" directive correctly: PIDFile=/var/run/keepalived.pid We should go the same way. I am using Ubuntu 16.04.1, kernel 4.4.0-45-generic. Package: keepalived Version: 1.2.19-1 ======================================================================= How to reproduce: I used the two instances of Ubuntu 16.04.2 on DigitalOcean: Configurations -------------- MASTER server's /etc/keepalived/keepalived.conf:   vrrp_script chk_nothing {      script "/bin/true"      interval 2   }   vrrp_instance G1 {     interface eth1     state BACKUP     priority 100     virtual_router_id 123     unicast_src_ip <primal IP>     unicast_peer {       <secondal IP>     }     track_script {       chk_nothing     }   } BACKUP server's /etc/keepalived/keepalived.conf:   vrrp_script chk_nothing {      script "/bin/true"      interval 2   }   vrrp_instance G1 {     interface eth1     state MASTER     priority 200     virtual_router_id 123     unicast_src_ip <secondal IP>     unicast_peer {       <primal IP>     }     track_script {       chk_nothing     }   } Procedures ---------- 1) Start keepalived on both servers   $ sudo systemctl start keepalived.service 2) Restart keepalived on either one   $ sudo systemctl restart keepalived.service 3) Check status and PID   $ systemctl status -n0 keepalived.service Result ------ 0) Before restart Main PID is 3402 and the subprocesses' PIDs are 3403-3406. So far so good.   
root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:37:12 UTC; 14min ago     Process: 3402 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 3403 (keepalived)       Tasks: 3      Memory: 1.7M         CPU: 1.900s      CGroup: /system.slice/keepalived.service              ├─3403 /usr/sbin/keepalived              ├─3405 /usr/sbin/keepalived              └─3406 /usr/sbin/keepalived 1) First restart Now Main PID is 3403, which was one of the previous subprocesses and is actually exited. Something is wrong. Yet, the previous processes are all exited; we are not likely to see no weird behaviors here.   root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived   root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:51:45 UTC; 1s ago     Process: 4782 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 3403 (code=exited, status=0/SUCCESS)       Tasks: 3      Memory: 1.7M         CPU: 11ms      CGroup: /system.slice/keepalived.service              ├─4783 /usr/sbin/keepalived              ├─4784 /usr/sbin/keepalived              └─4785 /usr/sbin/keepalived 2) Second restart Now Main PID is 4783 and subprocesses' PIDs are 4783-4785. This is problematic as 4783 is the old process, which should have exited before new processes arose. Therefore, keepalived remains in old settings while users believe it uses the new setting.   
root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived   root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:51:49 UTC; 1s ago     Process: 4796 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 4783 (keepalived)       Tasks: 3      Memory: 1.7M         CPU: 6ms      CGroup: /system.slice/keepalived.service              ├─4783 /usr/sbin/keepalived              ├─4784 /usr/sbin/keepalived              └─4785 /usr/sbin/keepalived Because "PIDFile=" directive is missing in the systemd unit file, keepalived sometimes fails to kill all old processes. The old processes remain with old settings and cause unexpected behaviors. The detail of this bug is described in this ticket in upstream: https://github.com/acassen/keepalived/issues/443. The official systemd unit file is available since version 1.2.24 by this commit: https://github.com/acassen/keepalived/commit/635ab69afb44cd8573663e62f292c6bb84b44f15 This includes "PIDFile" directive correctly: PIDFile=/var/run/keepalived.pid We should go the same way. I am using Ubuntu 16.04.1, kernel 4.4.0-45-generic. 
Package: keepalived Version: 1.2.19-1 ======================================================================= How to reproduce: I used the two instances of Ubuntu 16.04.2 on DigitalOcean: Configurations -------------- MASTER server's /etc/keepalived/keepalived.conf:   vrrp_script chk_nothing {      script "/bin/true"      interval 2   }   vrrp_instance G1 {     interface eth1     state BACKUP     priority 100     virtual_router_id 123     unicast_src_ip <primal IP>     unicast_peer {       <secondal IP>     }     track_script {       chk_nothing     }   } BACKUP server's /etc/keepalived/keepalived.conf:   vrrp_script chk_nothing {      script "/bin/true"      interval 2   }   vrrp_instance G1 {     interface eth1     state MASTER     priority 200     virtual_router_id 123     unicast_src_ip <secondal IP>     unicast_peer {       <primal IP>     }     track_script {       chk_nothing     }   } Loop based probing for the Error to exist: ------------------------------------------ After the setup above start keepalived on both servers: $ sudo systemctl start keepalived.service Then run the following loop $ for j in $(seq 1 20); do sleep 11s; time for i in $(seq 1 5); do sudo systemctl restart keepalived; sudo systemctl status keepalived | egrep 'Main.*exited'; done; done Expected: no error, only time reports Error case: Showing Main PID exited, details below Step by Step Procedures ----------------------- 1) Start keepalived on both servers   $ sudo systemctl start keepalived.service 2) Restart keepalived on either one   $ sudo systemctl restart keepalived.service 3) Check status and PID   $ systemctl status -n0 keepalived.service Result ------ 0) Before restart Main PID is 3402 and the subprocesses' PIDs are 3403-3406. So far so good.   
root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:37:12 UTC; 14min ago     Process: 3402 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 3403 (keepalived)       Tasks: 3      Memory: 1.7M         CPU: 1.900s      CGroup: /system.slice/keepalived.service              ├─3403 /usr/sbin/keepalived              ├─3405 /usr/sbin/keepalived              └─3406 /usr/sbin/keepalived 1) First restart Now Main PID is 3403, which was one of the previous subprocesses and is actually exited. Something is wrong. Yet, the previous processes are all exited; we are not likely to see no weird behaviors here.   root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived   root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:51:45 UTC; 1s ago     Process: 4782 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 3403 (code=exited, status=0/SUCCESS)       Tasks: 3      Memory: 1.7M         CPU: 11ms      CGroup: /system.slice/keepalived.service              ├─4783 /usr/sbin/keepalived              ├─4784 /usr/sbin/keepalived              └─4785 /usr/sbin/keepalived 2) Second restart Now Main PID is 4783 and subprocesses' PIDs are 4783-4785. This is problematic as 4783 is the old process, which should have exited before new processes arose. Therefore, keepalived remains in old settings while users believe it uses the new setting.   
root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived   root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:51:49 UTC; 1s ago     Process: 4796 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 4783 (keepalived)       Tasks: 3      Memory: 1.7M         CPU: 6ms      CGroup: /system.slice/keepalived.service              ├─4783 /usr/sbin/keepalived              ├─4784 /usr/sbin/keepalived              └─4785 /usr/sbin/keepalived
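Note on the reproduction above: the symptom in the status output can be checked mechanically instead of by eye. The sketch below only parses text in the shape of the `systemctl status` output quoted in the description; the function names `main_pid_exited` and `main_pid_of` are hypothetical helpers, not part of systemd or keepalived.

```shell
#!/bin/sh
# Hypothetical helper: read `systemctl status` text on stdin and succeed (exit 0)
# when the reported Main PID belongs to a process that has already exited.
main_pid_exited() {
    grep -Eq 'Main PID: [0-9]+ \(code=exited'
}

# Hypothetical helper: print the Main PID number found in the status text on stdin.
main_pid_of() {
    sed -n 's/.*Main PID: \([0-9][0-9]*\) .*/\1/p'
}
```

For example, `systemctl status -n0 keepalived.service | main_pid_exited && echo "stale Main PID"` would flag the state shown after the first restart, where Main PID 3403 is reported as exited while the CGroup lists 4783-4785.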
2017-03-08 09:19:41 Christian Ehrhardt  bug task added systemd (Ubuntu)
2017-03-08 10:44:22 Dimitri John Ledkov systemd (Ubuntu): milestone ubuntu-17.03
2017-03-08 10:44:23 Dimitri John Ledkov systemd (Ubuntu): assignee Dimitri John Ledkov (xnox)
2017-03-11 08:22:40 Christian Ehrhardt  keepalived (Ubuntu Xenial): status Confirmed Triaged
2017-03-11 08:22:42 Christian Ehrhardt  keepalived (Ubuntu Xenial): assignee ChristianEhrhardt (paelzer)
2017-03-13 11:10:44 Christian Ehrhardt  bug watch added http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857618
2017-03-13 11:10:44 Christian Ehrhardt  bug task added keepalived (Debian)
2017-03-13 12:19:37 Christian Ehrhardt  description Because "PIDFile=" directive is missing in the systemd unit file, keepalived sometimes fails to kill all old processes. The old processes remain with old settings and cause unexpected behaviors. The detail of this bug is described in this ticket in upstream: https://github.com/acassen/keepalived/issues/443. The official systemd unit file is available since version 1.2.24 by this commit: https://github.com/acassen/keepalived/commit/635ab69afb44cd8573663e62f292c6bb84b44f15 This includes "PIDFile" directive correctly: PIDFile=/var/run/keepalived.pid We should go the same way. I am using Ubuntu 16.04.1, kernel 4.4.0-45-generic. Package: keepalived Version: 1.2.19-1 ======================================================================= How to reproduce: I used the two instances of Ubuntu 16.04.2 on DigitalOcean: Configurations -------------- MASTER server's /etc/keepalived/keepalived.conf:   vrrp_script chk_nothing {      script "/bin/true"      interval 2   }   vrrp_instance G1 {     interface eth1     state BACKUP     priority 100     virtual_router_id 123     unicast_src_ip <primal IP>     unicast_peer {       <secondal IP>     }     track_script {       chk_nothing     }   } BACKUP server's /etc/keepalived/keepalived.conf:   vrrp_script chk_nothing {      script "/bin/true"      interval 2   }   vrrp_instance G1 {     interface eth1     state MASTER     priority 200     virtual_router_id 123     unicast_src_ip <secondal IP>     unicast_peer {       <primal IP>     }     track_script {       chk_nothing     }   } Loop based probing for the Error to exist: ------------------------------------------ After the setup above start keepalived on both servers: $ sudo systemctl start keepalived.service Then run the following loop $ for j in $(seq 1 20); do sleep 11s; time for i in $(seq 1 5); do sudo systemctl restart keepalived; sudo systemctl status keepalived | egrep 'Main.*exited'; done; done Expected: no error, only 
time reports Error case: Showing Main PID exited, details below Step by Step Procedures ----------------------- 1) Start keepalived on both servers   $ sudo systemctl start keepalived.service 2) Restart keepalived on either one   $ sudo systemctl restart keepalived.service 3) Check status and PID   $ systemctl status -n0 keepalived.service Result ------ 0) Before restart Main PID is 3402 and the subprocesses' PIDs are 3403-3406. So far so good.   root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:37:12 UTC; 14min ago     Process: 3402 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 3403 (keepalived)       Tasks: 3      Memory: 1.7M         CPU: 1.900s      CGroup: /system.slice/keepalived.service              ├─3403 /usr/sbin/keepalived              ├─3405 /usr/sbin/keepalived              └─3406 /usr/sbin/keepalived 1) First restart Now Main PID is 3403, which was one of the previous subprocesses and is actually exited. Something is wrong. Yet, the previous processes are all exited; we are not likely to see no weird behaviors here.   
root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived   root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:51:45 UTC; 1s ago     Process: 4782 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 3403 (code=exited, status=0/SUCCESS)       Tasks: 3      Memory: 1.7M         CPU: 11ms      CGroup: /system.slice/keepalived.service              ├─4783 /usr/sbin/keepalived              ├─4784 /usr/sbin/keepalived              └─4785 /usr/sbin/keepalived 2) Second restart Now Main PID is 4783 and subprocesses' PIDs are 4783-4785. This is problematic as 4783 is the old process, which should have exited before new processes arose. Therefore, keepalived remains in old settings while users believe it uses the new setting.   root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived   root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:51:49 UTC; 1s ago     Process: 4796 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 4783 (keepalived)       Tasks: 3      Memory: 1.7M         CPU: 6ms      CGroup: /system.slice/keepalived.service              ├─4783 /usr/sbin/keepalived              ├─4784 /usr/sbin/keepalived              └─4785 /usr/sbin/keepalived [Impact] * Restarts of keepalived can leave stale processes with the old configuration around. * The systemd detection of the MainPID is suboptimal, and combined with not waiting on signals being handled it can fail on second restart killing the (still) remaining process of the first start. 
* Upstream has a PIDFile statement; this has proven to avoid the issue in the MainPID guessing code of systemd. [Test Case] * Set up keepalived. The more complex the config, the bigger the race window; the description below contains a trivial sample config that works well. * As a test, run the following loop, which restarts the service back-to-back while staying under the max-restart limit: $ for j in $(seq 1 20); do sleep 11s; time for i in $(seq 1 5); do sudo systemctl restart keepalived; sudo systemctl status keepalived | egrep 'Main.*exited'; done; done Expectation: no output other than timing. Without fix: sometimes the MainPID no longer exists; in these cases the child processes are the "old" ones from the last execution with the old config. [Regression Potential] * Low because * A PIDFile statement is recommended by systemd for Type=forking services anyway. * Upstream keepalived has this statement in its service file. * By the nature of the change, it should have no functional impact on other parts of the service beyond systemd's PID detection for the job. * Yet regression potential is never zero. There might be unlikely cases that were considered working before only because a new config was not properly picked up. After the fix they will behave correctly and might then surface as false positives, e.g. if the config was bad. [Other Info] * Usually a fix has to be in at least the latest development release before SRUing it. But as I outline below, in releases later than Xenial systemd seems to have improved, making this change unnecessary. We haven't identified the responsible changes (there is a bug task here), and they may well be very complex. I think it is correct to fix Xenial in this regard with the simple change to the service file for now. 
* To eventually converge, I created a Debian bug task asking for inclusion of the PIDFile so it can slowly trickle back down to newer Ubuntu releases; people there also more often run backports, where the issue might occur on older systemd versions (just as it does for us on Xenial). --- Because "PIDFile=" directive is missing in the systemd unit file, keepalived sometimes fails to kill all old processes. The old processes remain with old settings and cause unexpected behaviors. The detail of this bug is described in this ticket in upstream: https://github.com/acassen/keepalived/issues/443. The official systemd unit file is available since version 1.2.24 by this commit: https://github.com/acassen/keepalived/commit/635ab69afb44cd8573663e62f292c6bb84b44f15 This includes "PIDFile" directive correctly: PIDFile=/var/run/keepalived.pid We should go the same way. I am using Ubuntu 16.04.1, kernel 4.4.0-45-generic. Package: keepalived Version: 1.2.19-1 ======================================================================= How to reproduce: I used the two instances of Ubuntu 16.04.2 on DigitalOcean: Configurations -------------- MASTER server's /etc/keepalived/keepalived.conf:   vrrp_script chk_nothing {      script "/bin/true"      interval 2   }   vrrp_instance G1 {     interface eth1     state BACKUP     priority 100     virtual_router_id 123     unicast_src_ip <primal IP>     unicast_peer {       <secondal IP>     }     track_script {       chk_nothing     }   } BACKUP server's /etc/keepalived/keepalived.conf:   vrrp_script chk_nothing {      script "/bin/true"      interval 2   }   vrrp_instance G1 {     interface eth1     state MASTER     priority 200     virtual_router_id 123     unicast_src_ip <secondal IP>     unicast_peer {       <primal IP>     }     track_script {       chk_nothing     }   } Loop based probing for the Error to exist: ------------------------------------------ After the setup above start keepalived on both servers:     $ sudo systemctl start 
keepalived.service Then run the following loop     $ for j in $(seq 1 20); do sleep 11s; time for i in $(seq 1 5); do sudo systemctl restart keepalived; sudo systemctl status keepalived | egrep 'Main.*exited'; done; done Expected: no error, only time reports Error case: Showing Main PID exited, details below Step by Step Procedures ----------------------- 1) Start keepalived on both servers   $ sudo systemctl start keepalived.service 2) Restart keepalived on either one   $ sudo systemctl restart keepalived.service 3) Check status and PID   $ systemctl status -n0 keepalived.service Result ------ 0) Before restart Main PID is 3402 and the subprocesses' PIDs are 3403-3406. So far so good.   root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:37:12 UTC; 14min ago     Process: 3402 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 3403 (keepalived)       Tasks: 3      Memory: 1.7M         CPU: 1.900s      CGroup: /system.slice/keepalived.service              ├─3403 /usr/sbin/keepalived              ├─3405 /usr/sbin/keepalived              └─3406 /usr/sbin/keepalived 1) First restart Now Main PID is 3403, which was one of the previous subprocesses and has actually exited. Something is wrong. Yet the previous processes have all exited, so we are not likely to see any weird behavior here.   
root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived   root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:51:45 UTC; 1s ago     Process: 4782 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 3403 (code=exited, status=0/SUCCESS)       Tasks: 3      Memory: 1.7M         CPU: 11ms      CGroup: /system.slice/keepalived.service              ├─4783 /usr/sbin/keepalived              ├─4784 /usr/sbin/keepalived              └─4785 /usr/sbin/keepalived 2) Second restart Now Main PID is 4783 and subprocesses' PIDs are 4783-4785. This is problematic as 4783 is the old process, which should have exited before new processes arose. Therefore, keepalived remains in old settings while users believe it uses the new setting.   root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived   root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived   ● keepalived.service - Keepalive Daemon (LVS and VRRP)      Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)      Active: active (running) since Sat 2017-03-04 01:51:49 UTC; 1s ago     Process: 4796 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)    Main PID: 4783 (keepalived)       Tasks: 3      Memory: 1.7M         CPU: 6ms      CGroup: /system.slice/keepalived.service              ├─4783 /usr/sbin/keepalived              ├─4784 /usr/sbin/keepalived              └─4785 /usr/sbin/keepalived
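The "First restart" symptom above (a Main PID that is no longer among the processes in the unit's CGroup) can also be checked mechanically. The following is only a minimal sketch, not part of the original report: the function name `check_main_pid` is hypothetical, and it assumes the `systemctl status` layout and the `/usr/sbin/keepalived` path shown in the output above. It does not detect the "Second restart" case, where the Main PID is alive but belongs to the old set of processes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `systemctl status` text on stdin and prints
#   ok      - the Main PID is one of the processes listed under CGroup
#   stale   - the Main PID is not among the CGroup processes
#   unknown - no "Main PID:" line was found
# Assumes the status layout and binary path seen in the report above.
check_main_pid() {
    local line main_pid="" cgroup_pids=" "
    local re_main='Main PID: ([0-9]+)'
    local re_cgroup='([0-9]+) /usr/sbin/keepalived'

    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $re_main ]]; then
            main_pid=${BASH_REMATCH[1]}
        elif [[ $line =~ $re_cgroup ]]; then
            # CGroup entries look like "├─4783 /usr/sbin/keepalived"
            cgroup_pids+="${BASH_REMATCH[1]} "
        fi
    done

    if [[ -z $main_pid ]]; then
        echo "unknown"
    elif [[ $cgroup_pids == *" $main_pid "* ]]; then
        echo "ok"
    else
        echo "stale"
    fi
}
```

Usage would be along the lines of `systemctl status -n0 keepalived.service | check_main_pid` after each restart in the loop.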
2017-03-13 12:38:33 Bug Watch Updater keepalived (Debian): status Unknown New
2017-03-23 22:42:08 Brian Murray keepalived (Ubuntu Xenial): status Triaged Fix Committed
2017-03-23 22:42:11 Brian Murray bug added subscriber Ubuntu Stable Release Updates Team
2017-03-23 22:42:13 Brian Murray bug added subscriber SRU Verification
2017-03-23 22:42:19 Brian Murray tags patch patch verification-needed
2017-03-24 07:18:24 Christian Ehrhardt tags patch verification-needed patch verification-done
2017-03-29 15:46:06 Dimitri John Ledkov systemd (Ubuntu): assignee Dimitri John Ledkov (xnox)
2017-03-29 15:46:09 Dimitri John Ledkov systemd (Ubuntu): milestone ubuntu-17.03
2017-03-31 05:04:23 Launchpad Janitor keepalived (Ubuntu Xenial): status Fix Committed Fix Released
2017-03-31 05:04:27 Steve Langasek removed subscriber Ubuntu Stable Release Updates Team
2017-05-17 16:22:58 Dimitri John Ledkov systemd (Ubuntu): status New Incomplete
2017-05-17 16:23:00 Dimitri John Ledkov systemd (Ubuntu Xenial): status New Incomplete
2017-05-18 06:07:59 Christian Ehrhardt systemd (Ubuntu): status Incomplete Opinion
2017-05-18 06:08:01 Christian Ehrhardt systemd (Ubuntu Xenial): status Incomplete Opinion
2017-11-16 17:58:58 Bug Watch Updater keepalived (Debian): status New Fix Released