systemd-logind must be restarted every ~1000 SSH logins to prevent a ~25 second delay
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
D-Bus | Fix Released | Medium | |
systemd | Fix Released | Unknown | |
dbus (Ubuntu) | Fix Released | Medium | Łukasz Zemczak |
Xenial | Fix Released | Medium | Łukasz Zemczak |
Yakkety | Won't Fix | Medium | Łukasz Zemczak |
systemd (Ubuntu) | Fix Released | Medium | Unassigned |
Xenial | Invalid | Medium | Unassigned |
Yakkety | Invalid | Undecided | Unassigned |
Bug Description
[Impact]
The bug affects multiple users and introduces a user-visible delay (~25 seconds) on SSH connections after a large number of sessions have been processed. This has a serious impact on big systems and servers running our software.
The currently proposed fix is actually a safe workaround for the bug, as proposed by dbus upstream. The workaround makes uid 0 immune to the pending_fd_timeout limit that kicks in and causes the original issue.
[Test Case]
lxc launch ubuntu:x test
lxc exec test -- login -f ubuntu
ssh-import-id <whatever>
Then run a script as follows (passing in ubuntu@<container address> as its argument):
while true; do
  (time ssh "$1" "echo OK > /dev/null") 2>&1 | grep ^real >> log
done
Then check the log file for any ssh sessions that take 25+ seconds to complete.
Multiple instances of the same script can be run at the same time.
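A small helper along these lines can summarize the timing log that the loop writes. This is only a sketch: the sample lines are inlined so the snippet is self-contained, and they mirror the `real 0mN.NNNs` format shown in the log excerpts below.

```shell
# Sketch: count fast (sub-second) vs slow (25+ second) SSH sessions
# in a timing log like the one produced by the loop above.
log=$(mktemp)
# Inlined sample data standing in for the real log file.
printf 'real\t0m0.222s\nreal\t0m25.232s\nreal\t0m0.198s\n' > "$log"
fast=$(grep -c '0m0\.' "$log")        # sessions completing in under a second
slow=$(grep -c '0m2[5-9]\.' "$log")   # sessions hitting the ~25 s delay
echo "fast=$fast slow=$slow"
rm -f "$log"
```

On the sample data this prints `fast=2 slow=1`; pointed at the real log it gives the same fast/slow breakdown as the `grep | wc -l` pipelines quoted later in the report.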
[Regression Potential]
The fix has rather low regression potential, as the workaround is a very small change affecting only one particular case: handling of uid 0. The fix has been tested by multiple users and has been in zesty for a while, with multiple people involved in reviewing the change. It is also the change proposed by upstream.
[Original Description]
I noticed on a system that accepts large numbers of SSH connections that, after a while, SSH sessions were taking ~25 seconds to complete.
Looking in /var/log/auth.log, systemd-logind starts failing with the following:
Jun 10 23:55:28 test sshd[3666]: pam_unix(…
Jun 10 23:55:28 test systemd-…
Jun 10 23:55:28 test systemd-…
Jun 10 23:55:28 test sshd[3666]: pam_systemd(…
I reproduced this in an LXD container by doing something like:
lxc launch ubuntu:x test
lxc exec test -- login -f ubuntu
ssh-import-id <whatever>
Then ran a script as follows (passing in ubuntu@<container address> as its argument):
while true; do
  (time ssh "$1" "echo OK > /dev/null") 2>&1 | grep ^real >> log
done
In my case, after 1052 logins, the 1053rd and thereafter were taking 25+ seconds to complete. Here are some snippets from the log file:
$ cat log | grep 0m0 | wc -l
1052
$ cat log | grep 0m25 | wc -l
4
$ tail -5 log
real 0m0.222s
real 0m25.232s
real 0m25.235s
real 0m25.236s
real 0m25.239s
ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: systemd 229-4ubuntu5
ProcVersionSign
Uname: Linux 4.4.0-22-generic x86_64
ApportVersion: 2.20.1-0ubuntu2
Architecture: amd64
Date: Sat Jun 11 00:09:34 2016
MachineType: Notebook W230SS
ProcEnviron:
TERM=xterm-
PATH=(custom, no user)
ProcKernelCmdLine: BOOT_IMAGE=
SourcePackage: systemd
SystemdDelta:
[EXTENDED] /lib/systemd/
[EXTENDED] /lib/systemd/
2 overridden configuration files found.
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/15/2014
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 4.6.5
dmi.board.
dmi.board.name: W230SS
dmi.board.vendor: Notebook
dmi.board.version: Not Applicable
dmi.chassis.
dmi.chassis.type: 9
dmi.chassis.vendor: Notebook
dmi.chassis.
dmi.modalias: dmi:bvnAmerican
dmi.product.name: W230SS
dmi.product.
dmi.sys.vendor: Notebook
tags: added: cdo-qa-blocker
Changed in systemd (Ubuntu):
  importance: Undecided → Medium
Changed in systemd (Ubuntu Xenial):
  status: New → Confirmed
  importance: Undecided → Medium
Changed in systemd (Ubuntu):
  assignee: Łukasz Zemczak (sil2100) → nobody
  status: Confirmed → Fix Released
tags: added: patch
tags: added: patch
Changed in dbus (Ubuntu Xenial):
  importance: Undecided → Medium
Changed in dbus (Ubuntu):
  importance: Undecided → Medium
Changed in dbus (Ubuntu Xenial):
  assignee: nobody → Łukasz Zemczak (sil2100)
Changed in dbus (Ubuntu):
  assignee: nobody → Łukasz Zemczak (sil2100)
Changed in dbus (Ubuntu Yakkety):
  status: New → Confirmed
Changed in systemd (Ubuntu Yakkety):
  status: New → Invalid
Changed in dbus (Ubuntu Yakkety):
  assignee: nobody → Łukasz Zemczak (sil2100)
  importance: Undecided → Medium
description: updated
Changed in dbus (Ubuntu Xenial):
  status: Confirmed → In Progress
Changed in dbus (Ubuntu Yakkety):
  status: Confirmed → In Progress
Changed in dbus:
  importance: Unknown → Medium
  status: Unknown → Fix Released
tags: added: canonical-bootstack canonical-is
Changed in systemd:
  status: Unknown → Fix Released
Here's my (sad) workaround:
$ sudo crontab -l
1,11,21,31,41,51 * * * * service systemd-logind restart
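The same periodic restart can be expressed as a systemd timer instead of a root crontab. A sketch, with illustrative unit names (not part of any existing package):

```ini
# /etc/systemd/system/restart-logind.service  (illustrative name)
[Unit]
Description=Restart systemd-logind (workaround for the ~25 s login delay)

[Service]
Type=oneshot
ExecStart=/bin/systemctl restart systemd-logind

# /etc/systemd/system/restart-logind.timer
[Unit]
Description=Restart systemd-logind every 10 minutes

[Timer]
# Fire at minutes 1, 11, 21, 31, 41, 51, matching the crontab above.
OnCalendar=*:1/10

[Install]
WantedBy=timers.target
```

Enabled with `systemctl enable --now restart-logind.timer`, this mirrors the crontab schedule while keeping logs and status under `systemctl status restart-logind.timer`.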