systemd-time-wait-sync.service stuck in "activating" state after boot, blocks timers from starting

Bug #1937238 reported by Rasmus Bondesson
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
systemd (Ubuntu)
Undecided
Unassigned
Focal
Low
Dan Streetman

Bug Description

[impact]

systemd-time-wait-sync service sometimes misses sync completed event and remains in 'activating' state

[test case]

this isn't consistently reproducable, see original description for test case

[regression potential]

possible problems with the systemd-time-wait-sync service completing too early or not completing on time

[scope]

this is needed only for f

this is fixed upstream with commit f6f4f5fe5395a57f10dd446c7266c53f0673eaac which is in v246, so this is fixed in h and later already

the service does not exist in b so this does not apply there

[original description]

When I start my server running Ubuntu 20.04 the systemd-time-wait-sync.service is stuck in "activating" state. I noticed this because none of the systemd timer units triggered, because all the timers depend on systemd-time-wait-sync.service. Running "systemctl restart systemd-time-wait-sync.service" manually works around the problem.

Some logs and command outputs:

raek@mizar:~$ lsb_release -rd
Description: Ubuntu 20.04.2 LTS
Release: 20.04

raek@mizar:~$ systemctl | grep systemd-time-wait-sync.service
  systemd-time-wait-sync.service loaded activating start start Wait Until Kernel Time Synchronized

raek@mizar:~$ systemctl status systemd-time-wait-sync.service
● systemd-time-wait-sync.service - Wait Until Kernel Time Synchronized
     Loaded: loaded (/lib/systemd/system/systemd-time-wait-sync.service; enabled; vendor preset: enabled)
     Active: activating (start) since Thu 2021-07-22 11:06:52 CEST; 27min ago
       Docs: man:systemd-time-wait-sync.service(8)
   Main PID: 514 (systemd-time-wa)
      Tasks: 1 (limit: 9415)
     Memory: 972.0K
     CGroup: /system.slice/systemd-time-wait-sync.service
             └─514 /lib/systemd/systemd-time-wait-sync

Jul 22 11:06:52 mizar systemd-time-wait-sync[514]: adjtime state 5 status 40 time Thu 2021-07-22 09:06:52.216338 UTC
Warning: journal has been rotated since unit was started, output may be incomplete.

raek@mizar:~$ journalctl -b -u systemd-time-wait-sync.service
-- Logs begin at Wed 2020-07-08 16:34:13 CEST, end at Thu 2021-07-22 11:36:44 CEST. --
Jul 22 11:06:52 mizar systemd-time-wait-sync[514]: adjtime state 5 status 40 time Thu 2021-07-22 09:06:52.216338 UTC

raek@mizar:~$ dpkg -S /lib/systemd/system/systemd-time-wait-sync.service
systemd: /lib/systemd/system/systemd-time-wait-sync.service

raek@mizar:~$ apt-cache policy systemd
systemd:
  Installed: 245.4-4ubuntu3.11
  Candidate: 245.4-4ubuntu3.11
  Version table:
 *** 245.4-4ubuntu3.11 500
        500 http://se.archive.ubuntu.com/ubuntu focal-security/main amd64 Packages
        100 /var/lib/dpkg/status
     245.4-4ubuntu3.10 500
        500 http://se.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
     245.4-4ubuntu3.8 400
        400 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 Packages
     245.4-4ubuntu3 500
        500 http://se.archive.ubuntu.com/ubuntu focal/main amd64 Packages

Revision history for this message
Rasmus Bondesson (raek) wrote :

This started happen to me in January of February. This is easy for me to reproduce since it happens at every boot. I would be happy to provide more information or run more tests if needed.

Revision history for this message
Dan Streetman (ddstreet) wrote :

This is almost certainly because you don't have the systemd-timesyncd package installed, or you don't have the systemd-timesyncd service enabled.

On a very quick glance, it seems the systemd-time-wait-sync service should have been bundled into the systemd-timesyncd package when it was split from the main systemd package, however I'll have to look closer to see if there is some reason it wasn't.

description: updated
Revision history for this message
Rasmus Bondesson (raek) wrote :

Hmm. The systemd-timesyncd package is installed and systemd-timesyncd.service is enabled.

raek@mizar:~/code/raek-totp-token/firmware$ apt-cache policy systemd-timesyncd
systemd-timesyncd:
  Installed: 245.4-4ubuntu3.11
  Candidate: 245.4-4ubuntu3.11
  Version table:
 *** 245.4-4ubuntu3.11 500
        500 http://se.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
        500 http://se.archive.ubuntu.com/ubuntu focal-security/main amd64 Packages
        100 /var/lib/dpkg/status
     245.4-4ubuntu3 500
        500 http://se.archive.ubuntu.com/ubuntu focal/main amd64 Packages
raek@mizar:~/code/raek-totp-token/firmware$ systemctl status systemd-timesyncd.service
● systemd-timesyncd.service - Network Time Synchronization
     Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2021-07-22 11:06:53 CEST; 3 days ago
       Docs: man:systemd-timesyncd.service(8)
   Main PID: 792 (systemd-timesyn)
     Status: "Initial synchronization to time server [2a01:3f7::1]:123 (ntp.se)."
      Tasks: 2 (limit: 9415)
     Memory: 3.2M
     CGroup: /system.slice/systemd-timesyncd.service
             └─792 /lib/systemd/systemd-timesyncd

Jul 22 11:06:53 mizar systemd[1]: Starting Network Time Synchronization...
Jul 22 11:06:53 mizar systemd[1]: Started Network Time Synchronization.
Jul 22 11:06:59 mizar systemd-timesyncd[792]: Network configuration changed, trying to establish connection.
Jul 22 11:07:01 mizar systemd-timesyncd[792]: Network configuration changed, trying to establish connection.
Jul 22 11:07:02 mizar systemd-timesyncd[792]: Network configuration changed, trying to establish connection.
Jul 22 11:07:02 mizar systemd-timesyncd[792]: Network configuration changed, trying to establish connection.
Jul 22 11:07:04 mizar systemd-timesyncd[792]: Network configuration changed, trying to establish connection.
Jul 22 11:07:04 mizar systemd-timesyncd[792]: Initial synchronization to time server [2a01:3f7::1]:123 (ntp.se).

I have done some NTP-related configuration (added an NTP server more close by(. I'll try to find all deviations from the default I have made and come back.

Revision history for this message
Rasmus Bondesson (raek) wrote :

This is my contents of /etc/systemd/timesyncd.conf

# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
#
# Entries in this file show the compile time defaults.
# You can change settings by editing this file.
# Defaults can be restored by simply deleting this file.
#
# See timesyncd.conf(5) for details.

[Time]
NTP=ntp.se se.pool.ntp.org
FallbackNTP=ntp.ubuntu.com
#RootDistanceMaxSec=5
#PollIntervalMinSec=32
#PollIntervalMaxSec=2048

And this is the outputs of some timedatectl status commands:

raek@mizar:~$ timedatectl status
               Local time: Sun 2021-07-25 20:13:01 CEST
           Universal time: Sun 2021-07-25 18:13:01 UTC
                 RTC time: Sun 2021-07-25 18:13:01
                Time zone: Europe/Stockholm (CEST, +0200)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no
raek@mizar:~$ timedatectl timesync-status
       Server: 2a01:3f7::1 (ntp.se)
Poll interval: 34min 8s (min: 32s; max 34min 8s)
         Leap: normal
      Version: 4
      Stratum: 1
    Reference: PPS
    Precision: 1us (-25)
Root distance: 30us (max: 5s)
       Offset: -104us
        Delay: 11.063ms
       Jitter: 5.968ms
 Packet count: 147
    Frequency: +4.989ppm

Revision history for this message
Dan Streetman (ddstreet) wrote :

> Hmm. The systemd-timesyncd package is installed and systemd-timesyncd.service is enabled.

can you attach journalctl output from a full boot reproducing the problem? I used your exact ntp config and it works fine for me, though the systemd-time-wait-sync.service does of course stay 'activating' for a few seconds until time is actually synced.

no longer affects: systemd (Ubuntu Focal)
no longer affects: systemd (Ubuntu Groovy)
no longer affects: systemd (Ubuntu Hirsute)
no longer affects: systemd (Ubuntu Impish)
description: updated
Changed in systemd (Ubuntu):
status: New → Incomplete
Revision history for this message
Dan Streetman (ddstreet) wrote :

I opened bug 1938058 for the problem of moving systemd-time-sync-wait.service into the systemd-timesyncd package, leaving this bug just for looking at your (possible) problem

Revision history for this message
Rasmus Bondesson (raek) wrote :

Here is the output from the journal from a boot where time-wait-sync.service "stuck". (I removed some line relating to syncthing and to a private IRC bot I run.) After boot I logged in, checked "journalctl" and saw that time-wait-sync.service was "activating" and then ran "timedatectl status" and "timedatectl timesync-status". The outputs said that time was synced. Now that I look in the journal, I see these interesting lines:

Jul 26 22:19:41 mizar dbus-daemon[835]: [system] Activating via systemd: service name='org.freedesktop.timedate1' unit='dbus-org.freedesktop.timedate1.service' requested by ':1.29' (uid=1000 pid=1406 comm="timedatectl status " label="unconfined")
Jul 26 22:19:41 mizar systemd[1]: Starting Time & Date Service...
Jul 26 22:19:41 mizar dbus-daemon[835]: [system] Successfully activated service 'org.freedesktop.timedate1'
Jul 26 22:19:41 mizar systemd[1]: Started Time & Date Service.

Maybe that means that some service started because I tried to check the status? Just a thought.

Revision history for this message
Rasmus Bondesson (raek) wrote :

I found these two bug tickets that could be relevant:

https://github.com/home-assistant/operating-system/issues/896

https://github.com/systemd/systemd/issues/8683

I should also add that this seems to happen on the majority of boots for me, but not every boot. So there is some amount of non-determinism involved.

I'm not familiar with what service / tool / part of kernel does what in the boot of an Ubuntu 20.04 system, but I might read up on what systemd-time-wait-sync.service waits for to see if I can figure out why that does not happen.

Anyway, thanks for helping me debug this issue!

Revision history for this message
Rasmus Bondesson (raek) wrote :

One more boot. This time I checked the timestamp of the /run/systemd/timesync/synchronized file (and did not run any timedatectl commands):

raek@mizar:~$ ls -la /run/systemd/timesync
total 0
drwxr-xr-x 2 systemd-timesync systemd-timesync 60 Jul 26 22:51 .
drwxr-xr-x 24 root root 540 Jul 26 22:52 ..
-rw-r--r-- 1 systemd-timesync systemd-timesync 0 Jul 26 22:53 synchronized

Revision history for this message
Rasmus Bondesson (raek) wrote :

My problem looks similar to the on reported in the Home Assistant bug tracker.

raek@mizar:~$ systemctl status systemd-time-wait-sync
... *SNIP* ...
             └─508 /lib/systemd/systemd-time-wait-sync

raek@mizar:~$ sudo lsof -nPp 508
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
... *SNIP* ...
systemd-t 508 root 0r CHR 1,3 0t0 6 /dev/null
systemd-t 508 root 1u unix 0xffff99d964ad0c00 0t0 16894 type=STREAM
systemd-t 508 root 2u unix 0xffff99d964ad0c00 0t0 16894 type=STREAM
systemd-t 508 root 3u a_inode 0,14 0 11428 [eventpoll]
systemd-t 508 root 4u a_inode 0,14 0 11428 [signalfd]
systemd-t 508 root 5r a_inode 0,14 0 11428 inotify
systemd-t 508 root 6u a_inode 0,14 0 11428 [timerfd]

raek@mizar:~$ sudo strace -ttTp 508
strace: Process 508 attached
22:59:52.923972 gettid() = 508 <0.000023>
22:59:52.925523 epoll_wait(3,

raek@mizar:~$ sudo cat /proc/508/fdinfo/5
pos: 0
flags: 02004000
mnt_id: 15
inotify wd:1 ino:17b sdev:18 mask:100 ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:ac95aa8d7b010000

raek@mizar:~$ sudo cat /proc/508/mountinfo | grep :15
39 34 0:33 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,memory

The mountinfo lines look different for me than for tmm1 (see previously linked home assistant ticket).

Revision history for this message
Dan Streetman (ddstreet) wrote :

> https://github.com/home-assistant/operating-system/issues/896

yes the upstream commit to fix this does look like what would cause this, thanks!

Dan Streetman (ddstreet)
description: updated
Changed in systemd (Ubuntu):
status: Incomplete → Fix Released
Changed in systemd (Ubuntu Focal):
status: New → In Progress
importance: Undecided → Low
assignee: nobody → Dan Streetman (ddstreet)
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

But the things mentioned in systemd issue were supposedly resolved, and https://github.com/systemd/systemd/commit/d84af414180a4a8a7dd8772fc9d5302b5f9f28c9 is in focal already...... so is there something else not working in focal?

Revision history for this message
Dan Streetman (ddstreet) wrote :

> so is there something else not working in focal?

as noted in the [scope] sru template section this is fixed by upstream commit f6f4f5fe5395a57f10dd446c7266c53f0673eaac which isn't in focal

Revision history for this message
Dan Streetman (ddstreet) wrote :

Rasmus are you able to test with the systemd build from this ppa to see if it fixes the problem for you:
https://launchpad.net/~ddstreet/+archive/ubuntu/systemd

Revision history for this message
Rasmus Bondesson (raek) wrote :

Dan: I just added the PPA, upgraded the packages and did 3 reboots. In all cases "systemctl" showed "starting" when I logged in and then "started" some seconds later. So it seems like the problem is solved!

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Let me accept this officially into focal-proposed, but I'll request that @raek - as the person that can fully and reliably reproduces it - performs the required validation using the exact binary packages from our official proposed archive.

Thank you!

Changed in systemd (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-focal
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Rasmus, or anyone else affected,

Accepted systemd into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/245.4-4ubuntu3.12 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Revision history for this message
Rasmus Bondesson (raek) wrote :

I tested systemd 245.4-4ubuntu3.12 just now. It worked immediately on three boots!

raek@mizar:~$ apt-cache policy systemd
systemd:
  Installed: 245.4-4ubuntu3.12
  Candidate: 245.4-4ubuntu3.12
  Version table:
 *** 245.4-4ubuntu3.12 400
        400 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     245.4-4ubuntu3.11 500
        500 http://se.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
        500 http://se.archive.ubuntu.com/ubuntu focal-security/main amd64 Packages
     245.4-4ubuntu3 500
        500 http://se.archive.ubuntu.com/ubuntu focal/main amd64 Packages

Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (systemd/245.4-4ubuntu3.12)

All autopkgtests for the newly accepted systemd (245.4-4ubuntu3.12) for focal have finished running.
The following regressions have been reported in tests triggered by the package:

flatpak/1.6.5-0ubuntu0.3 (amd64)
gvfs/1.44.1-1ubuntu1 (amd64, arm64)
munin/2.0.56-1ubuntu1 (s390x)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/focal/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Dan Streetman (ddstreet) wrote :

marking verified per comment 18, thanks!

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Rasmus, or anyone else affected,

Accepted systemd into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/245.4-4ubuntu3.13 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

tags: added: verification-needed verification-needed-focal
removed: verification-done verification-done-focal
Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (systemd/245.4-4ubuntu3.13)

All autopkgtests for the newly accepted systemd (245.4-4ubuntu3.13) for focal have finished running.
The following regressions have been reported in tests triggered by the package:

gvfs/1.44.1-1ubuntu1 (amd64, ppc64el)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/focal/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Dan Streetman (ddstreet) wrote :

as systemd was respun due to bug 1942899, and only the one udev (hwdb) patch was reverted which shouldn't affect this at all, I'm remarking this as verified still based on testing for the previous version above.

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Revision history for this message
Chris Halse Rogers (raof) wrote : Update Released

The verification of the Stable Release Update for systemd has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 245.4-4ubuntu3.13

---------------
systemd (245.4-4ubuntu3.13) focal; urgency=medium

  * d/p/dell-clamshell-accel-location-base-with-sku.patch:
    Revert incorrect patch (LP: #1942899)

systemd (245.4-4ubuntu3.12) focal; urgency=medium

  [ Yao Wei ]
  * d/p/dell-clamshell-accel-location-base.patch:
    Add ACCEL_LOCATION=base property for Dell clamshell models (LP: #1938259)
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=5c1be33900edee94da0dc9a4ade8edcd079b4c85

  [ Lukas Märdian ]
  * Add d/p/lp1934221-resolved-disable-event-sources-before-unreffing-them.patch
    - Fix segfault in systemd-resolve (LP: #1934221)
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=6c401900c70962052f56c7108fdc02fe7f84c9bf

  [ Simon Chopin ]
  * d/p/lp1914740-network-enable-DHCP-broadcast-flag-if-required-by-in.patch:
    - Apply upstream patch to fix Hipersocket DHCP mode (LP: #1914740)
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=326ae43b7966d9e7c5f7124027185a79a07fa276

  [ Dan Streetman ]
  * d/p/lp1934981-correct-suspend-then-sleep-string.patch:
    Fix sleep verb used by logind during suspend-then-hibernate
    (LP: #1934981)
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=997f3a7da3d5db22e3c63626c3f7dc3dff0830b0
  * d/p/lp1937238-util-return-the-correct-correct-wd-from-inotify-help.patch:
    Fix watch for time sync (LP: #1937238)
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=dbabff8a03eb232c19174eff1335cd7cb7d7860c
  * d/extra/dhclient-enter-resolved-hook:
    Reset start limit counter for systemd-resolved in dhclient hook
    (LP: #1939255)
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=9d3a91a0b70a4b2bcc166f366cd0a880fd494812
  * d/p/lp1935051-shared-unit-file-make-sure-the-old-hashmaps-and-sets.patch:
    Fix memory leak in path cache (LP: #1935051)
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=12d6bdeb35f309158fe8d4242c6dd9be4d067604
  * d/p/lp1934147/0001-cgroup-do-catchup-for-unit-cgroup-inotify-watch-file.patch,
    d/p/lp1934147/0002-core-Make-sure-cgroup_oom_queue-is-flushed-on-manage.patch:
    Catchup cgroup inotify watch after reexec/reload (LP: #1934147)
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=63eabc88b8e0005eb40b15b543538ce35377bdbd

 -- Dan Streetman <email address hidden> Tue, 07 Sep 2021 14:37:22 -0400

Changed in systemd (Ubuntu Focal):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.