Same setup as above, in two Xenial VMs.
Again, I see the backup taking over and reverting to backup once the restart is complete.
Note: the restart is much faster now, as if it is just sending a signal, whereas on Zesty it felt like it was waiting for completion.
0) pre restart
Process: 2416 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS
Main PID: 2418 (keepalived)
├─2418 /usr/sbin/keepalived
├─2419 /usr/sbin/keepalived
└─2420 /usr/sbin/keepalived
1) First restart
Process: 2559 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS
Main PID: 2560 (keepalived)
├─2560 /usr/sbin/keepalived
├─2561 /usr/sbin/keepalived
└─2562 /usr/sbin/keepalived
WTF ??
Your repro seems so complete I thought I might have missed something.
So I experimented a bit, and I think I found something related to my note above (the newer version waiting for completion).
I was able to reproduce your case by restarting and then restarting again very soon (i.e. faster than the wait on Zesty was).
Now I got your case:
Main PID: 2848 (code=exited, status=0/SUCCESS)
That is the old main PID, which of course is gone.
And after the next restart, the children are the "old ones".
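
To make the comparison above less error-prone, here is a minimal sketch that checks pasted `systemctl status keepalived` output for the bad state: a Main PID reported as exited, or a Main PID that is not among the processes listed in the cgroup tree. The function names `parse_status` and `looks_stale` are made up for illustration, and the parsing assumes the output format shown in the pastes above.

```python
import re

def parse_status(text):
    """Return (main_pid, main_exited, cgroup_pids) from `systemctl status` text."""
    # e.g. "Main PID: 2418 (keepalived)" or
    #      "Main PID: 2848 (code=exited, status=0/SUCCESS)"
    main = re.search(r"Main PID:\s*(\d+)\s*\(([^)]*)\)", text)
    main_pid = int(main.group(1)) if main else None
    main_exited = bool(main) and main.group(2).startswith("code=exited")
    # Process tree lines look like "├─2418 /usr/sbin/keepalived"
    cgroup_pids = [int(p) for p in re.findall(r"[├└]─(\d+)\s", text)]
    return main_pid, main_exited, cgroup_pids

def looks_stale(text):
    """True if systemd's idea of the main PID does not match the running tree."""
    main_pid, main_exited, cgroup_pids = parse_status(text)
    return main_exited or main_pid not in cgroup_pids

# The "pre restart" paste from above: the main PID is alive and in the tree.
healthy = """\
Process: 2416 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS
Main PID: 2418 (keepalived)
├─2418 /usr/sbin/keepalived
├─2419 /usr/sbin/keepalived
└─2420 /usr/sbin/keepalived
"""
print(looks_stale(healthy))
```

Running this against the status output after two quick restarts should flag the case where the Main PID line shows `code=exited` while keepalived processes are still listed.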