pam login 25s timeout delay due to tickling ntp systemd units
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Ubuntu |
New
|
Undecided
|
Unassigned |
Bug Description
TLDR:
attempting to run ntp in standalone instead of dhcp driven somehow leads to logind/dbus pam timeouts and throw errors to auth.log and results in pam logins via console/ssh/su to be delayed by 25s for said timeout but otherwise work. at first I thought I had a regression on 1591411 but this is different. I do not understand why this happened but presume it is because nm has some weird side effect on network-pre.target so disabling the two ntp units didn't work as I intended.
I have since run into 1823098 which has the same starting issue I had which is to run ntp in standalone mode with a provided /etc/ntp.conf that isn't modified by dhcp. I am filing this bug in case somebody else travels down this road, hits these log messages and wonders wtf broke.
the ntp issues:
- there should be a knob in /etc/default/ntp to specify which form of ntp you want (none/standalon
- vaguely using netplan 'use-ntp' shouldn't be the only way to handle this and even that's locked into a bool instead of a selection since it has sync semantics
- dhcp shouldn't assume 'ntp-servers' is requested without checking for it. there's no /etc/default/dhcp that I can see so you have to modify /etc/dhcp/
perhaps there's an intended way to do this but I didn't run into it thus far. open to suggestions. meanwhile my solution was simply to disable the two units and no longer be a network-pre.target. it seemed to work for 2days and then I started seeing these on pam auths:
===
DISTRIB_
$ uptime #all users are just xterm login shells for me in X
23:34:24 up 34 days, 2:55, 31 users, load average: 0.55, 0.44, 0.38
$ dpkg -l|egrep 'systemd |dbus-user-
ii dbus-user-session 1.12.2-1ubuntu1.2 amd64
ii libsystemd0:amd64 237-3ubuntu10.42 amd64
ii libsystemd0:i386 237-3ubuntu10.42 i386
ii python3-systemd 234-1build1 amd64
ii systemd 237-3ubuntu10.42 amd64
$ systemd --version
systemd 237
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-
$ ps laxwww|grep dbus-[d]
4 106 1021 1 20 0 52740 3548 - Ss ? 13:45 /usr/bin/
0 1000 1839 1712 20 0 51580 2160 ep_pol Ss ? 1:37 /usr/bin/
0 1000 2286 2273 20 0 50052 984 ep_pol S ? 0:16 /usr/bin/
===8<: trigger1
$ date;sudo su - build -c /bin/date
Thu Feb 18 22:54:40 PST 2021
[sudo] password for luser:
Thu Feb 18 22:55:06 PST 2021
===8<: trigger1 auth.log
Feb 18 22:54:10 bahx dbus-daemon[1021]: [system] Failed to activate service 'org.freedeskto
Feb 18 22:54:10 bahx sshd[22024]: pam_systemd(
Feb 18 22:54:10 bahx systemd-
Feb 18 22:54:10 bahx sshd[22090]: Received disconnect from 127.0.0.1 port 50456:11: disconnected by user
Feb 18 22:54:10 bahx sshd[22090]: Disconnected from user luser 127.0.0.1 port 50456
Feb 18 22:54:10 bahx sshd[22024]: pam_unix(
Feb 18 22:54:41 bahx sudo: luser : TTY=pts/26 ; PWD=/home/luser ; USER=root ; COMMAND=/bin/su - build -c /bin/date
Feb 18 22:54:41 bahx sudo: pam_unix(
Feb 18 22:54:41 bahx su[22111]: Successful su for build by root
Feb 18 22:54:41 bahx su[22111]: + /dev/pts/26 root:build
Feb 18 22:54:41 bahx su[22111]: pam_unix(
Feb 18 22:55:06 bahx dbus-daemon[1021]: [system] Failed to activate service 'org.freedeskto
Feb 18 22:55:06 bahx su[22111]: pam_systemd(
Feb 18 22:55:06 bahx systemd-
Feb 18 22:55:06 bahx su[22111]: pam_unix(
Feb 18 22:55:06 bahx sudo: pam_unix(
===8<: trigger2
$ date;ssh localhost date
Thu Feb 18 22:55:52 PST 2021
generic banner
luser@localhost's password:
Thu Feb 18 22:56:18 PST 2021
===8<: trigger2 auth.log
Feb 18 22:55:53 bahx sshd[22135]: Accepted password for luser from 127.0.0.1 port 50608 ssh2
Feb 18 22:55:53 bahx sshd[22135]: pam_unix(
Feb 18 22:55:56 bahx systemd-
Feb 18 22:55:56 bahx dbus-daemon[1021]: [system] Failed to activate service 'org.freedeskto
Feb 18 22:56:12 bahx dbus-daemon[1021]: [system] Failed to activate service 'org.freedeskto
Feb 18 22:56:18 bahx sshd[22135]: pam_systemd(
Feb 18 22:56:18 bahx sshd[22202]: Received disconnect from 127.0.0.1 port 50608:11: disconnected by user
Feb 18 22:56:18 bahx sshd[22202]: Disconnected from user luser 127.0.0.1 port 50608
Feb 18 22:56:18 bahx sshd[22135]: pam_unix(
Feb 18 22:56:21 bahx dbus-daemon[1021]: [system] Failed to activate service 'org.freedeskto
Feb 18 22:56:21 bahx systemd-
===
all other references to these log messages end up with a generic restart logind/daemon reexec/reload and ignore the cause so I didn't repeat them blindly.
instead i reverted and reenabled/started the two units I disabled/stopped and voila pam worked and no 25s delay on login.
===
$ for i in /var/log/
2 Feb 15
277 Feb 17
424 Feb 18 #mostly me triaging ssh/sudo su
$ for i in /var/log/
/var/log/auth.log Feb 15 06:51:05 bahx polkitd(
/var/log/auth.log Feb 15 07:06:43 bahx polkitd(
===
this was me attempting to shut them down in a xterm without sudo and canceling the auth dialog and redoing them with sudo. the units disabled/stopped:
ntp-systemd-
ntp-systemd-
because of:
===
$ systemctl list-units --failed --all
UNIT LOAD ACTIVE SUB DESCRIPTION
● ntp-systemd-
with:
sudo systemctl reset-failed ntp-systemd-
===
at this point I have disabled/stopped the units again and pam is still happy. I'll let it run to see if it returns. i did change one thing though since then. because the failure of the unit was because the target script hook for ntp was removed I replaced the ntp hook with a simple exit 0 bash script so it doesn't trigger failure. perhaps that will be enough to prevent this cascade? time will tell
in any event something this minor in a single daemon shouldn't break pam across the system I should think. that's rather a violent overreach methinks.
whether this classifies under nm, dbus, systemd-logind, or ntp is the real mystery. nm barked first it seems. there's some nuance in ntp's systemd config not aligning with nm I guess? why it took 2 days to trigger I assume is like the previous issue requiring enough usage or SOMAXCONN or something like it before 'it' happens. still tbd. i triggered another nonsudo auth dialog to try to reproduce but seems fine still
Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https:/ /wiki.ubuntu. com/Bugs/ FindRightPackag e. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.
To change the source package that this bug is filed about visit https:/ /bugs.launchpad .net/ubuntu/ +bug/1916226/ +editstatus and add the package name in the text box next to the word Package.
[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]