systemd RestartSec does not seem to work consistently

Bug #1711255 reported by Leo Davis
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
systemd (Ubuntu)
Expired
Undecided
Unassigned

Bug Description

The unit file for my service xxxx-33005.service:

[Unit]
Description=product xxxx 33005 service
Wants=some.mount
After=some.mount
PartOf=product.target
PartOf=someother.service
[Service]
Type=simple
ExecStart=/opt/product/bin/xxxx -conf=/etc/product.conf
Restart=on-failure
User=product
Group=product
LimitCORE=infinity
LimitNOFILE=10000
Slice=xxxx.slice
WatchdogSec=120
RestartSec=10
StartLimitInterval=120
StartLimitBurst=10
[Install]
WantedBy=product.target

specifies RestartSec=10, which is also what systemd sees:

$ systemctl show xxxx-33005.service
Type=simple
Restart=on-failure
NotifyAccess=main
RestartUSec=10s
...

However, today I saw in the journalctl log:
$ journalctl -u xxxx-33005.service
...
Aug 15 23:28:29 a-hostname systemd[1]: xxxx-33005.service: Watchdog timeout (limit 2min)!
Aug 15 23:28:29 a-hostname systemd[1]: xxxx-33005.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Aug 15 23:28:29 a-hostname systemd[1]: xxxx-33005.service: Unit entered failed state.
Aug 15 23:28:29 a-hostname systemd[1]: xxxx-33005.service: Failed with result 'exit-code'.
Aug 15 23:28:39 a-hostname systemd[1]: xxxx-33005.service: Service hold-off time over, scheduling restart.
Aug 15 23:28:39 a-hostname systemd[1]: Stopped product xxxx 33005 service.
Aug 15 23:28:39 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:28:45 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:28:45 a-hostname systemd[1]: Stopping product xxxx 33005 service...
Aug 15 23:28:45 a-hostname systemd[1]: Stopped product xxxx 33005 service.
Aug 15 23:33:51 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:33:51 a-hostname xxxx[3629]: Get http://127.0.0.1:8081/: dial tcp 127.0.0.1:8081: getsockopt: connection refused
Aug 15 23:33:51 a-hostname systemd[1]: xxxx-33005.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Aug 15 23:33:51 a-hostname systemd[1]: xxxx-33005.service: Unit entered failed state.
Aug 15 23:33:51 a-hostname systemd[1]: xxxx-33005.service: Failed with result 'exit-code'.
Aug 15 23:34:01 a-hostname systemd[1]: xxxx-33005.service: Service hold-off time over, scheduling restart.
Aug 15 23:34:02 a-hostname systemd[1]: Stopped product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: Started product xxxx 33005 service.
Aug 15 23:34:02 a-hostname systemd[1]: xxxx-33005.service: Start request repeated too quickly.
Aug 15 23:34:02 a-hostname systemd[1]: Failed to start product xxxx 33005 service.

Everything looks to me that systemd is behaving as expected until Aug 15 23:34:02 when it attempts to start xxxx-33005.service not at the rate I specified in RestartSec, but possibly at the default 100ms rate. This excerpt shows both the expected 10s restart rate as well as the unexpected restart rate.

$ lsb_release -rd
Description: Ubuntu 16.04.1 LTS
Release: 16.04

$ uname -a
Linux a-hostname 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

$ apt-cache policy systemd
systemd:
  Installed: 229-4ubuntu16
  Candidate: 229-4ubuntu19
  Version table:
     229-4ubuntu19 500
        500 http://internal-repo/ubuntu xenial-updates/main amd64 Packages
 *** 229-4ubuntu16 100
        100 /var/lib/dpkg/status
     229-4ubuntu10 500
        500 http://internal-repo/ubuntu xenial-security/main amd64 Packages
     229-4ubuntu4 500
        500 http://internal-repo/ubuntu xenial/main amd64 Packages

Leo Davis (leo-davis)
description: updated
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

A minimal working reproducer with e.g. ExecStart=/bin/false would be useful. To remove the uncertainties w.r.t. executables involved. As I do not have access to the masked product.

Is this reproducible in Ubuntu Artful which has much newer systemd?

Changed in systemd (Ubuntu):
status: New → Incomplete
Revision history for this message
Leo Davis (leo-davis) wrote :

>A minimal working reproducer with e.g. ExecStart=/bin/false would be useful. To remove the >uncertainties w.r.t. executables involved. As I do not have access to the masked product.

OK, I'll try and reproduce it with that.

>Is this reproducible in Ubuntu Artful which has much newer systemd?

Never tried running our software on Artful since Xenial is our current target platform.

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for systemd (Ubuntu) because there has been no activity for 60 days.]

Changed in systemd (Ubuntu):
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.