Comment 15 for bug 1644530

Christian Ehrhardt  (paelzer) wrote :

A first check with head-to-head restarts confirms my earlier assumption. On Zesty it is:
1. not failing
2. taking longer each restart

I wanted to go further, but a trivial comparison test like this:
$ time for i in $(seq 1 200); do sudo systemctl restart keepalived; sudo systemctl status keepalived | egrep 'Main.*exited'; done
failed because the default start limit kicked in.
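
One way to get more than a handful of restarts without tripping the start limit is to run them in bursts with a pause in between. The following is only a sketch: `paced_run` is a hypothetical helper, and the burst size and pause are assumptions that have to match whatever start limit is configured for the unit. Restarts inside a burst are still head-to-head.

```shell
# paced_run TOTAL BURST PAUSE CMD...
# Run CMD TOTAL times, sleeping PAUSE seconds after every BURST runs
# (hypothetical helper, not part of systemd or keepalived).
paced_run() {
    total=$1; burst=$2; pause=$3
    shift 3
    i=1
    while [ "$i" -le "$total" ]; do
        "$@"
        # Pause only between bursts, not after the final run.
        if [ $((i % burst)) -eq 0 ] && [ "$i" -lt "$total" ]; then
            sleep "$pause"
        fi
        i=$((i + 1))
    done
}
```

For example, `paced_run 200 5 10 sudo systemctl restart keepalived` would issue 200 restarts in groups of five, waiting 10 seconds between groups.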

Well, let's start softer and go with reload first.
$ time for i in $(seq 1 200); do sudo systemctl reload keepalived; sudo systemctl status keepalived | egrep 'Main.*exited'; done

That works fine on both releases, but a reload only sends a HUP and the PIDs stay the same. So by design this does not run into the same fault.
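
To double-check that a reload really keeps the main PID, something like the following could be used. It is a sketch only: `pid_value` and `check_reload_keeps_pid` are hypothetical helper names, and it assumes `systemctl show -p MainPID` prints a line of the form `MainPID=<number>`.

```shell
# Extract the number from a "MainPID=<n>" line read on stdin.
pid_value() {
    sed -n 's/^MainPID=//p'
}

# Hypothetical helper: compare the main PID of a unit before and after a reload.
check_reload_keeps_pid() {
    unit=$1
    before=$(systemctl show -p MainPID "$unit" | pid_value)
    sudo systemctl reload "$unit"
    after=$(systemctl show -p MainPID "$unit" | pid_value)
    if [ "$before" = "$after" ]; then
        echo "PID unchanged: $before"
    else
        echo "PID changed: $before -> $after"
    fi
}
```

Calling `check_reload_keeps_pid keepalived` would then report whether the PID changed across the reload.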

OK, so let's obey the default start limit of 5 starts per 10 seconds and compare again.
Xenial:
$ time for i in $(seq 1 5); do sudo systemctl restart keepalived; sudo systemctl status keepalived | egrep 'Main.*exited'; done
 Main PID: 8800 (code=exited, status=0/SUCCESS)
 Main PID: 8836 (code=exited, status=0/SUCCESS)

real 0m0.156s
user 0m0.008s
sys 0m0.000s

Note: 2 hits are all we can get here: restart #1 works, #2 comes too early and triggers the issue, #3 cleans up, #4 triggers it again, and #5 cleans up again.

Zesty:
$ time for i in $(seq 1 5); do sudo systemctl restart keepalived; sudo systemctl status keepalived | egrep 'Main.*exited'; done

real 0m2,258s
user 0m0,012s
sys 0m0,000s

So that seems to be the repro we need:
- head to head restarts with no time in between
- showing the error on Xenial
- showing it is not occurring on Zesty
- showing something on Zesty makes it "wait" which is what avoids the issue
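
The repro above could be wrapped so that it prints a single number per release instead of raw status lines. This is a sketch under the same assumptions as the commands above (unit name `keepalived`, at most 5 restarts to stay under the start limit); `count_exited` and `repro` are hypothetical helper names.

```shell
# Count status lines on stdin that show the main PID as exited.
# grep -c exits non-zero on zero matches, so mask that with "|| true".
count_exited() {
    grep -Ec 'Main.*exited' || true
}

# Hypothetical wrapper: do N head-to-head restarts and print how many
# of them left the main PID in the "exited" state.
repro() {
    n=$1
    for i in $(seq 1 "$n"); do
        sudo systemctl restart keepalived
        sudo systemctl status keepalived
    done | count_exited
}
```

With this, `time repro 5` gives both the hit count and the elapsed time, which are the two values compared between Xenial and Zesty.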