systemctl reload unbound command times out
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
unbound (Ubuntu) | Fix Released | Undecided | Unassigned |
Bionic | Fix Released | Undecided | Unassigned |
Bug Description
[Impact]
* Due to a bug in the upstream code, a reload command makes all worker
processes exit and the reload fails. After a while the kill detection
kicks in and restarts things, but a reload is not meant to take down
the main process at all.
* Backport the upstream fix to Bionic to unbreak reloading the service
[Test Case]
Note: can be tested in a LXD container
* Install unbound
$ apt-get install unbound
* Reload it
$ systemctl reload unbound
Job for unbound.service failed because a timeout was exceeded.
See "systemctl status unbound.service" and "journalctl -xe" for
details.
* You might also see the timeout detection kick in and restart the service:
systemd[1]: unbound.service: State 'stop-sigterm' timed out. Killing.
systemd[1]: unbound.service: Killing process 13713 (unbound) with signal SIGKILL.
systemd[1]: unbound.service: Main process exited, code=killed, status=9/KILL
systemd[1]: unbound.service: Failed with result 'timeout'.
systemd[1]: Reload failed for Unbound DNS server.
* This takes a while and can break the service. To reset for a new test
you might need to start the service again.
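The test case above can be sketched as two small shell helpers. This is only an illustration: the function names check_reload and saw_timeout_kill are hypothetical and not part of unbound or systemd, and the only real commands assumed are the systemctl and journalctl invocations already quoted in this report.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of unbound or systemd.

# Run a reload command (passed as arguments) and report how it ended.
check_reload() {
    if "$@"; then
        echo "reload ok"
    else
        # In the else branch, $? still holds the exit status of the command.
        echo "reload failed (exit $?)"
    fi
}

# Read journal text on stdin and report whether systemd had to kill
# the service after the stop timeout.
saw_timeout_kill() {
    if grep -q "State 'stop-sigterm' timed out"; then
        echo "timeout kill seen"
    else
        echo "no timeout kill"
    fi
}

# On a Bionic test container (as root) one might run:
#   check_reload systemctl reload unbound
#   journalctl -u unbound | saw_timeout_kill
```

Passing the command as arguments keeps the helper usable without root, for example with a stand-in command while trying it out.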
[Regression Potential]
* The code change is rather small, which helps review. If there is a
regression, I'd expect it around unexpected conditions in which not
all workers shut down when they are meant to. I wasn't able to
trigger that in my tests, but it is the risk that comes to mind.
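One way to look for the regression described above is to count leftover processes after stopping the service. The sketch below assumes that workers show up as separate processes named unbound; leftover_procs is a hypothetical helper name, and the check would miss workers that run as threads inside one process.

```shell
#!/bin/sh
# Sketch only: leftover_procs is a hypothetical helper name.

# Print how many running processes have exactly the given command name.
leftover_procs() {
    # ps -e -o comm= lists one command name per line without a header.
    # grep -c -x counts exact whole-line matches; it exits non-zero when
    # the count is 0, so that status is ignored.
    ps -e -o comm= | grep -c -x -- "$1" || true
}

# After stopping the service on a test system one would expect 0:
#   systemctl stop unbound
#   leftover_procs unbound
```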
[Other Info]
* Test PPA at https:/
--
Unbound 1.6.7-1ubuntu2.1 on 18.04 (bionic):
$ sudo systemctl reload unbound
Job for unbound.service failed because a timeout was exceeded.
See "systemctl status unbound.service" and "journalctl -xe" for details.
Looks like https:/
Related branches
- Andreas Hasenack: Approve
- Canonical Server packageset reviewers: Pending requested
- git-ubuntu developers: Pending requested
Diff: 66 lines (+45/-0)
3 files modified
debian/changelog (+7/-0)
debian/patches/lp-1788622-fix-systemd-reload.patch (+37/-0)
debian/patches/series (+1/-0)
Changed in unbound (Ubuntu):
status: Triaged → Fix Released
Changed in unbound (Ubuntu Bionic):
status: New → Triaged
description: updated
Hi Alexey,
I agree - I can reproduce the error and see just the same.
Bionic - reload hangs and then fails; eventually the service is fully restarted by the kill after the timeout
Cosmic - working fine and reports:
The logs contain a few unrelated warnings as I run it in a container, but one can see the difference. Both logs show
1. install (and start of service)
2. calling systemctl reload
Bionic:
$ journalctl -u unbound
-- Logs begin at Fri 2018-08-10 09:34:15 UTC, end at Fri 2018-08-24 05:21:24 UTC. --
Aug 24 05:19:42 b systemd[1]: unbound.service: Failed to reset devices.list: Operation not permitted
Aug 24 05:19:42 b systemd[1]: Starting Unbound DNS server...
Aug 24 05:19:42 b package-helper[13708]: /var/lib/unbound/root.key does not exist, copying from /usr/share/dns/root.key
Aug 24 05:19:42 b package-helper[13708]: /var/lib/unbound/root.key has content
Aug 24 05:19:42 b package-helper[13708]: success: the anchor is ok
Aug 24 05:19:42 b unbound[13713]: [13713:0] notice: init module 0: subnet
Aug 24 05:19:42 b unbound[13713]: [13713:0] notice: init module 1: validator
Aug 24 05:19:42 b unbound[13713]: [13713:0] notice: init module 2: iterator
Aug 24 05:19:42 b unbound[13713]: [13713:0] info: start of service (unbound 1.6.7).
Aug 24 05:19:42 b systemd[1]: Started Unbound DNS server.
Aug 24 05:19:42 b systemd[1]: unbound.service: Failed to reset devices.list: Operation not permitted
Aug 24 05:19:43 b systemd[1]: unbound.service: Failed to reset devices.list: Operation not permitted
Aug 24 05:19:43 b systemd[1]: unbound.service: Failed to reset devices.list: Operation not permitted
Aug 24 05:19:54 b systemd[1]: Reloading Unbound DNS server.
Aug 24 05:19:54 b unbound[13713]: [13713:0] info: service stopped (unbound 1.6.7).
Aug 24 05:19:54 b unbound-control[13840]: ok
Aug 24 05:19:54 b unbound[13713]: [13713:0] info: server stats for thread 0: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
Aug 24 05:19:54 b unbound[13713]: [13713:0] info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0
Aug 24 05:19:54 b unbound[13713]: [13713:0] notice: Restart of unbound 1.6.7.
Aug 24 05:19:54 b unbound[13713]: [13713:0] notice: init module 0: subnet
Aug 24 05:19:54 b unbound[13713]: [13713:0] notice: init module 1: validator
Aug 24 05:19:54 b unbound[13713]: [13713:0] notice: init module 2: iterator
Aug 24 05:19:54 b unbound[13713]: [13713:0] info: start of service (unbound 1.6.7).
Aug 24 05:21:24 b systemd[1]: unbound.service: State 'stop-sigterm' timed out. Killing.
Aug 24 05:21:24 b systemd[1]: unbound.service: Killing process 13713 (unbound) with signal SIGKILL.
Aug 24 05:21:24 b systemd[1]: unbound.service: Main process exited, code=killed, status=9/KILL
Aug 24 05:21:24 b systemd[1]: unbound.service: Failed with result 'timeout'.
Aug 24 05:21:24 b systemd[1]: Reload failed for Unbound DNS server.
Aug 24 05:21:24 b systemd[1]: unbound.service: Service hold-off time over, scheduling restart.
Aug 24 05:21:24 b systemd[1]: unbound.service: Scheduled restart job, restart counter is at 1.
Aug 24 05:21:24 b systemd[1]: Stopped Unbound DNS server.
Aug 24 05:21:24 b systemd[1]: unbound.service...