systemd-time-wait-sync.service stuck in "activating" state after boot, blocks timers from starting

Bug #1937238 reported by Rasmus Bondesson
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
systemd (Ubuntu)
Undecided
Unassigned
Focal
Low
Dan Streetman

Bug Description

[impact]

systemd-time-wait-sync service sometimes misses sync completed event and remains in 'activating' state

[test case]

this isn't consistently reproducable, see original description for test case

[regression potential]

possible problems with the systemd-time-wait-sync service completing too early or not completing on time

[scope]

this is needed only for f

this is fixed upstream with commit f6f4f5fe5395a57f10dd446c7266c53f0673eaac which is in v246, so this is fixed in h and later already

the service does not exist in b so this does not apply there

[original description]

When I start my server running Ubuntu 20.04 the systemd-time-wait-sync.service is stuck in "activating" state. I noticed this because none of the systemd timer units triggered, because all the timers depend on systemd-time-wait-sync.service. Running "systemctl restart systemd-time-wait-sync.service" manually works around the problem.

Some logs and command outputs:

raek@mizar:~$ lsb_release -rd
Description: Ubuntu 20.04.2 LTS
Release: 20.04

raek@mizar:~$ systemctl | grep systemd-time-wait-sync.service
  systemd-time-wait-sync.service loaded activating start start Wait Until Kernel Time Synchronized

raek@mizar:~$ systemctl status systemd-time-wait-sync.service
● systemd-time-wait-sync.service - Wait Until Kernel Time Synchronized
     Loaded: loaded (/lib/systemd/system/systemd-time-wait-sync.service; enabled; vendor preset: enabled)
     Active: activating (start) since Thu 2021-07-22 11:06:52 CEST; 27min ago
       Docs: man:systemd-time-wait-sync.service(8)
   Main PID: 514 (systemd-time-wa)
      Tasks: 1 (limit: 9415)
     Memory: 972.0K
     CGroup: /system.slice/systemd-time-wait-sync.service
             └─514 /lib/systemd/systemd-time-wait-sync

Jul 22 11:06:52 mizar systemd-time-wait-sync[514]: adjtime state 5 status 40 time Thu 2021-07-22 09:06:52.216338 UTC
Warning: journal has been rotated since unit was started, output may be incomplete.

raek@mizar:~$ journalctl -b -u systemd-time-wait-sync.service
-- Logs begin at Wed 2020-07-08 16:34:13 CEST, end at Thu 2021-07-22 11:36:44 CEST. --
Jul 22 11:06:52 mizar systemd-time-wait-sync[514]: adjtime state 5 status 40 time Thu 2021-07-22 09:06:52.216338 UTC

raek@mizar:~$ dpkg -S /lib/systemd/system/systemd-time-wait-sync.service
systemd: /lib/systemd/system/systemd-time-wait-sync.service

raek@mizar:~$ apt-cache policy systemd
systemd:
  Installed: 245.4-4ubuntu3.11
  Candidate: 245.4-4ubuntu3.11
  Version table:
 *** 245.4-4ubuntu3.11 500
        500 http://se.archive.ubuntu.com/ubuntu focal-security/main amd64 Packages
        100 /var/lib/dpkg/status
     245.4-4ubuntu3.10 500
        500 http://se.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
     245.4-4ubuntu3.8 400
        400 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 Packages
     245.4-4ubuntu3 500
        500 http://se.archive.ubuntu.com/ubuntu focal/main amd64 Packages

Revision history for this message
Rasmus Bondesson (raek) wrote :

This started happen to me in January of February. This is easy for me to reproduce since it happens at every boot. I would be happy to provide more information or run more tests if needed.

Revision history for this message
Dan Streetman (ddstreet) wrote :

This is almost certainly because you don't have the systemd-timesyncd package installed, or you don't have the systemd-timesyncd service enabled.

On a very quick glance, it seems the systemd-time-wait-sync service should have been bundled into the systemd-timesyncd package when it was split from the main systemd package, however I'll have to look closer to see if there is some reason it wasn't.

description: updated
Revision history for this message
Rasmus Bondesson (raek) wrote :

Hmm. The systemd-timesyncd package is installed and systemd-timesyncd.service is enabled.

raek@mizar:~/code/raek-totp-token/firmware$ apt-cache policy systemd-timesyncd
systemd-timesyncd:
  Installed: 245.4-4ubuntu3.11
  Candidate: 245.4-4ubuntu3.11
  Version table:
 *** 245.4-4ubuntu3.11 500
        500 http://se.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
        500 http://se.archive.ubuntu.com/ubuntu focal-security/main amd64 Packages
        100 /var/lib/dpkg/status
     245.4-4ubuntu3 500
        500 http://se.archive.ubuntu.com/ubuntu focal/main amd64 Packages
raek@mizar:~/code/raek-totp-token/firmware$ systemctl status systemd-timesyncd.service
● systemd-timesyncd.service - Network Time Synchronization
     Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2021-07-22 11:06:53 CEST; 3 days ago
       Docs: man:systemd-timesyncd.service(8)
   Main PID: 792 (systemd-timesyn)
     Status: "Initial synchronization to time server [2a01:3f7::1]:123 (ntp.se)."
      Tasks: 2 (limit: 9415)
     Memory: 3.2M
     CGroup: /system.slice/systemd-timesyncd.service
             └─792 /lib/systemd/systemd-timesyncd

Jul 22 11:06:53 mizar systemd[1]: Starting Network Time Synchronization...
Jul 22 11:06:53 mizar systemd[1]: Started Network Time Synchronization.
Jul 22 11:06:59 mizar systemd-timesyncd[792]: Network configuration changed, trying to establish connection.
Jul 22 11:07:01 mizar systemd-timesyncd[792]: Network configuration changed, trying to establish connection.
Jul 22 11:07:02 mizar systemd-timesyncd[792]: Network configuration changed, trying to establish connection.
Jul 22 11:07:02 mizar systemd-timesyncd[792]: Network configuration changed, trying to establish connection.
Jul 22 11:07:04 mizar systemd-timesyncd[792]: Network configuration changed, trying to establish connection.
Jul 22 11:07:04 mizar systemd-timesyncd[792]: Initial synchronization to time server [2a01:3f7::1]:123 (ntp.se).

I have done some NTP-related configuration (added an NTP server more close by(. I'll try to find all deviations from the default I have made and come back.

Revision history for this message
Rasmus Bondesson (raek) wrote :

This is my contents of /etc/systemd/timesyncd.conf

# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
#
# Entries in this file show the compile time defaults.
# You can change settings by editing this file.
# Defaults can be restored by simply deleting this file.
#
# See timesyncd.conf(5) for details.

[Time]
NTP=ntp.se se.pool.ntp.org
FallbackNTP=ntp.ubuntu.com
#RootDistanceMaxSec=5
#PollIntervalMinSec=32
#PollIntervalMaxSec=2048

And this is the outputs of some timedatectl status commands:

raek@mizar:~$ timedatectl status
               Local time: Sun 2021-07-25 20:13:01 CEST
           Universal time: Sun 2021-07-25 18:13:01 UTC
                 RTC time: Sun 2021-07-25 18:13:01
                Time zone: Europe/Stockholm (CEST, +0200)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no
raek@mizar:~$ timedatectl timesync-status
       Server: 2a01:3f7::1 (ntp.se)
Poll interval: 34min 8s (min: 32s; max 34min 8s)
         Leap: normal
      Version: 4
      Stratum: 1
    Reference: PPS
    Precision: 1us (-25)
Root distance: 30us (max: 5s)
       Offset: -104us
        Delay: 11.063ms
       Jitter: 5.968ms
 Packet count: 147
    Frequency: +4.989ppm

Revision history for this message
Dan Streetman (ddstreet) wrote :

> Hmm. The systemd-timesyncd package is installed and systemd-timesyncd.service is enabled.

can you attach journalctl output from a full boot reproducing the problem? I used your exact ntp config and it works fine for me, though the systemd-time-wait-sync.service does of course stay 'activating' for a few seconds until time is actually synced.

no longer affects: systemd (Ubuntu Focal)
no longer affects: systemd (Ubuntu Groovy)
no longer affects: systemd (Ubuntu Hirsute)
no longer affects: systemd (Ubuntu Impish)
description: updated
Changed in systemd (Ubuntu):
status: New → Incomplete
Revision history for this message
Dan Streetman (ddstreet) wrote :

I opened bug 1938058 for the problem of moving systemd-time-sync-wait.service into the systemd-timesyncd package, leaving this bug just for looking at your (possible) problem

Revision history for this message
Rasmus Bondesson (raek) wrote :

Here is the output from the journal from a boot where time-wait-sync.service "stuck". (I removed some line relating to syncthing and to a private IRC bot I run.) After boot I logged in, checked "journalctl" and saw that time-wait-sync.service was "activating" and then ran "timedatectl status" and "timedatectl timesync-status". The outputs said that time was synced. Now that I look in the journal, I see these interesting lines:

Jul 26 22:19:41 mizar dbus-daemon[835]: [system] Activating via systemd: service name='org.freedesktop.timedate1' unit='dbus-org.freedesktop.timedate1.service' requested by ':1.29' (uid=1000 pid=1406 comm="timedatectl status " label="unconfined")
Jul 26 22:19:41 mizar systemd[1]: Starting Time & Date Service...
Jul 26 22:19:41 mizar dbus-daemon[835]: [system] Successfully activated service 'org.freedesktop.timedate1'
Jul 26 22:19:41 mizar systemd[1]: Started Time & Date Service.

Maybe that means that some service started because I tried to check the status? Just a thought.

Revision history for this message
Rasmus Bondesson (raek) wrote :

I found these two bug tickets that could be relevant:

https://github.com/home-assistant/operating-system/issues/896

https://github.com/systemd/systemd/issues/8683

I should also add that this seems to happen on the majority of boots for me, but not every boot. So there is some amount of non-determinism involved.

I'm not familiar with what service / tool / part of kernel does what in the boot of an Ubuntu 20.04 system, but I might read up on what systemd-time-wait-sync.service waits for to see if I can figure out why that does not happen.

Anyway, thanks for helping me debug this issue!

Revision history for this message
Rasmus Bondesson (raek) wrote :

One more boot. This time I checked the timestamp of the /run/systemd/timesync/synchronized file (and did not run any timedatectl commands):

raek@mizar:~$ ls -la /run/systemd/timesync
total 0
drwxr-xr-x 2 systemd-timesync systemd-timesync 60 Jul 26 22:51 .
drwxr-xr-x 24 root root 540 Jul 26 22:52 ..
-rw-r--r-- 1 systemd-timesync systemd-timesync 0 Jul 26 22:53 synchronized

Revision history for this message
Rasmus Bondesson (raek) wrote :

My problem looks similar to the on reported in the Home Assistant bug tracker.

raek@mizar:~$ systemctl status systemd-time-wait-sync
... *SNIP* ...
             └─508 /lib/systemd/systemd-time-wait-sync

raek@mizar:~$ sudo lsof -nPp 508
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
... *SNIP* ...
systemd-t 508 root 0r CHR 1,3 0t0 6 /dev/null
systemd-t 508 root 1u unix 0xffff99d964ad0c00 0t0 16894 type=STREAM
systemd-t 508 root 2u unix 0xffff99d964ad0c00 0t0 16894 type=STREAM
systemd-t 508 root 3u a_inode 0,14 0 11428 [eventpoll]
systemd-t 508 root 4u a_inode 0,14 0 11428 [signalfd]
systemd-t 508 root 5r a_inode 0,14 0 11428 inotify
systemd-t 508 root 6u a_inode 0,14 0 11428 [timerfd]

raek@mizar:~$ sudo strace -ttTp 508
strace: Process 508 attached
22:59:52.923972 gettid() = 508 <0.000023>
22:59:52.925523 epoll_wait(3,

raek@mizar:~$ sudo cat /proc/508/fdinfo/5
pos: 0
flags: 02004000
mnt_id: 15
inotify wd:1 ino:17b sdev:18 mask:100 ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:ac95aa8d7b010000

raek@mizar:~$ sudo cat /proc/508/mountinfo | grep :15
39 34 0:33 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,memory

The mountinfo lines look different for me than for tmm1 (see previously linked home assistant ticket).

Revision history for this message
Dan Streetman (ddstreet) wrote :

> https://github.com/home-assistant/operating-system/issues/896

yes the upstream commit to fix this does look like what would cause this, thanks!

Dan Streetman (ddstreet)
description: updated
Changed in systemd (Ubuntu):
status: Incomplete → Fix Released
Changed in systemd (Ubuntu Focal):
status: New → In Progress
importance: Undecided → Low
assignee: nobody → Dan Streetman (ddstreet)
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

But the things mentioned in systemd issue were supposedly resolved, and https://github.com/systemd/systemd/commit/d84af414180a4a8a7dd8772fc9d5302b5f9f28c9 is in focal already...... so is there something else not working in focal?

Revision history for this message
Dan Streetman (ddstreet) wrote :

> so is there something else not working in focal?

as noted in the [scope] sru template section this is fixed by upstream commit f6f4f5fe5395a57f10dd446c7266c53f0673eaac which isn't in focal

Revision history for this message
Dan Streetman (ddstreet) wrote :

Rasmus are you able to test with the systemd build from this ppa to see if it fixes the problem for you:
https://launchpad.net/~ddstreet/+archive/ubuntu/systemd

Revision history for this message
Rasmus Bondesson (raek) wrote :

Dan: I just added the PPA, upgraded the packages and did 3 reboots. In all cases "systemctl" showed "starting" when I logged in and then "started" some seconds later. So it seems like the problem is solved!

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.