systemd/logind parsing problem: HTX exercisers stopped on error: rc 11, errno 11 from main(): pthread_create

Bug #1651518 reported by bugproxy on 2016-12-20
This bug affects 1 person
Affects: importance, assigned to
The Ubuntu-power-systems project: High, Unassigned
systemd (Ubuntu): Undecided, Unassigned
systemd (Ubuntu Xenial): High, Dimitri John Ledkov
systemd (Ubuntu Yakkety): Undecided, Steve Langasek

Bug Description

[SRU justification]
Before systemd 232, UserTasksMax=infinity is not respected in logind.conf, despite the documentation referring to systemd.resource-control(5) for the definition of this field. This limits the use of Ubuntu 16.04 and later in contexts where a user session should be permitted to allocate large numbers of processes.

[Test case]
1. Set UserTasksMax=infinity in /etc/systemd/logind.conf.
2. Create a new login session.
3. Check the ulimit for user processes with 'ulimit -u'.
4. Verify that the limit has a numeric value.
5. Upgrade to systemd from -proposed.
6. Create a new login session.
7. Check the ulimit for user processes with 'ulimit -u'.
8. Verify that the limit is now set to 'unlimited'.
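
Steps 3 and 7 can also be scripted. The sketch below is only an illustration: it reads RLIMIT_NPROC, the limit that 'ulimit -u' reports, through Python's resource module and prints 'unlimited' when the value equals RLIM_INFINITY. The helper name format_nproc_limit is hypothetical and not part of any systemd tooling.

```python
import resource


def format_nproc_limit(value: int) -> str:
    """Render an RLIMIT_NPROC value the way 'ulimit -u' prints it."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return str(value)


if __name__ == "__main__":
    # Soft limit of the current process; run this from the new login session.
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    print(format_nproc_limit(soft))
```

Run it once before and once after the upgrade, each time from a fresh login session, and compare the two values.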

[Regression potential]
This is an upstream patch that has shipped in systemd 232 and later without issues, and it clearly addresses the bug in question. While the upstream commit includes code changes that are not strictly required to fix this bug, these are mostly cosmetic and should not carry significant additional risk.

== Comment: #1 - Application Cdeadmin <email address hidden> - 2016-12-19 04:15:10 ==

Configuration: IBM 8001-22C (S822LC), LSI SAS adapters, SMC 4U90 disk drawers, HDD (180) 7.3TB

Problem: HTX exercisers stopped on error, with HTX log showing "rc 11, errno 11 from main(): pthread_create"

htxubuntu-425

lpar: busybee.aus.stglabs.ibm.com (root/ lab passwd)

root@busybee:~# uname -a
Linux busybee 4.4.0-51-generic #72-Ubuntu SMP Thu Nov 24 18:27:59 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux

root@busybee:~# cat /tmp/htxerr

/dev/sdh Dec 12 23:52:42 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

/dev/sdh Dec 12 23:52:42 2016 err=0000000b sev=1 hxestorage
Hardware Exerciser stopped on an error

/dev/sdao Dec 12 23:52:42 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

/dev/sdao Dec 12 23:52:42 2016 err=0000000b sev=1 hxestorage
Hardware Exerciser stopped on an error

/dev/sddx Dec 12 23:52:42 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

/dev/sddx Dec 12 23:52:42 2016 err=0000000b sev=1 hxestorage
Hardware Exerciser stopped on an error

/dev/sdcz Dec 12 23:52:42 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

/dev/sdcz Dec 12 23:52:42 2016 err=0000000b sev=1 hxestorage
Hardware Exerciser stopped on an error

/dev/sddp Dec 12 23:52:42 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

/dev/sddp Dec 12 23:52:42 2016 err=0000000b sev=1 hxestorage
Hardware Exerciser stopped on an error
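
For readers decoding the log above: err=0000000b is hexadecimal for 11, and errno 11 on Linux is EAGAIN, which pthread_create returns when a resource or thread-count limit prevents creating another thread. The following sketch shows one way to pull those fields out of the message line. The regular expression is an assumption based only on the lines quoted in this report, and decode_htx_message is a hypothetical helper, not part of HTX.

```python
import errno
import os
import re

# Shape of the HTX message line as quoted in this report (an assumption,
# not a documented HTX log format).
HTX_MSG = re.compile(r"rc (\d+), errno (\d+) from (\w+)\(\): (\w+)")


def decode_htx_message(line: str):
    """Return (rc, errno name, caller, failing call), or None if no match."""
    match = HTX_MSG.search(line)
    if match is None:
        return None
    rc, err, caller, call = match.groups()
    # errno.errorcode maps numeric codes to names for the local platform.
    name = errno.errorcode.get(int(err), "unknown")
    return int(rc), name, caller, call


if __name__ == "__main__":
    print(decode_htx_message("rc 11, errno 11 from main(): pthread_create"))
    print(os.strerror(errno.EAGAIN))
```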

No errors were logged in syslog after starting HTX.

==== State: Open by: asperez on 16 December 2016 10:28:02 ====
This error was recreated on the smaller 1U Open Power system with the same, smaller configuration: 1 adapter, 1 4U90 drawer, 90 HDDs. There are 2 cables connected to the drawer (one to each ESM), which requires multipath to be enabled.

lpar: yellowbee

root@yellowbee:~# cat /tmp/htxerr

/dev/sdao Dec 16 01:14:44 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

/dev/sdak Dec 16 01:14:44 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

/dev/sdt Dec 16 01:14:44 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

/dev/sdaz Dec 16 01:14:44 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

/dev/sdn Dec 16 01:14:44 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

/dev/sdv Dec 16 01:14:44 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

/dev/sdaj Dec 16 01:14:44 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

/dev/sdal Dec 16 01:14:44 2016 err=0000000b sev=1 hxestorage
rc 11, errno 11 from main(): pthread_create

==== State: Open by: cde00 on 19 December 2016 02:48:46 ====

This defect is being sent to the Linux team because errors are still seen even after making the two changes below to the systemd resource limits:

root@yellowbee:/etc/systemd/logind.conf.d# cat htxlogindcustom.conf
[Login]
UserTasksMax=infinity
root@yellowbee:/etc/systemd/logind.conf.d# cat ../system.conf.d/htxsystemdcustom.conf
[Manager]
DefaultTasksAccounting=yes
DefaultTasksMax=infinity

root@yellowbee:/etc/systemd/logind.conf.d#

Both the logind limit (UserTasksMax) and the systemd limit (DefaultTasksMax) were set to infinity.
Please see defect SW363655 for more details and the suggestion given earlier by the Linux team.

Errors are still seen, so we would ask the Linux team to take a look and see whether anything else is causing them.
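
To double-check what the two drop-in files above actually request, the settings can be read back with a small script. This is a rough sketch using Python's configparser, which only approximates systemd's configuration syntax; read_setting is a hypothetical helper. Note that it shows what is written in the file, not what logind ended up applying.

```python
import configparser


def read_setting(text: str, section: str, key: str):
    """Return the value of key in [section], or None when it is absent."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(text)
    return parser.get(section, key, fallback=None)


if __name__ == "__main__":
    logind_dropin = "[Login]\nUserTasksMax=infinity\n"
    system_dropin = "[Manager]\nDefaultTasksAccounting=yes\nDefaultTasksMax=infinity\n"
    print(read_setting(logind_dropin, "Login", "UserTasksMax"))
    print(read_setting(system_dropin, "Manager", "DefaultTasksMax"))
```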

== Comment: #3 - Kevin W. Rudd <email address hidden> - 2016-12-19 17:34:27 ==
It appears that the version of logind on this system does not support the value of "infinity", and is reverting to the default of 12288:

# cat /sys/fs/cgroup/pids/user.slice/user-0.slice/pids.max
12288

As a workaround until this can be resolved, specify an exact value. You can try using the current system threads-max value:

# cat /proc/sys/kernel/threads-max
3974272

/etc/systemd/logind.conf.d/htxlogindcustom.conf:
[Login]
UserTasksMax=3974272
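
The workaround above could be generated rather than typed by hand. A minimal sketch, assuming the drop-in only needs the [Login] section and a single UserTasksMax line as shown; render_usertasks_dropin is a hypothetical name, and the script only prints the content instead of writing under /etc.

```python
from pathlib import Path


def render_usertasks_dropin(limit: int) -> str:
    """Build logind drop-in content that sets an explicit numeric limit."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"[Login]\nUserTasksMax={limit}\n"


if __name__ == "__main__":
    # Same source as 'cat /proc/sys/kernel/threads-max' in the comment above.
    threads_max = int(Path("/proc/sys/kernel/threads-max").read_text())
    print(render_usertasks_dropin(threads_max), end="")
```

Redirect the output to a file such as /etc/systemd/logind.conf.d/htxlogindcustom.conf and start a new login session for the value to be picked up.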

== Comment: #4 - Kevin W. Rudd - 2016-12-20 10:25:09 ==

Canonical,

This issue appears to map to the following systemd bug and patch:

https://github.com/systemd/systemd/issues/3833

https://github.com/systemd/systemd/commit/f50582649f8eee73f59aff95fadd9a963ed4ffea

This patch appears to be included in debian/232-7, but is missing in the xenial and yakkety versions.


Default Comment by Bridge

tags: added: architecture-ppc64le bugnameltc-150030 severity-high targetmilestone-inin16042

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1651518/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Kevin W. Rudd (kevinr) on 2016-12-20
affects: ubuntu → systemd (Ubuntu)
Luciano Chavez (lnx1138) on 2017-01-05
Changed in systemd (Ubuntu):
assignee: nobody → Taco Screen team (taco-screen-team)

------- Comment From <email address hidden> 2017-01-05 12:13 EDT-------
==== State: Assigned by: cde00 on 05 January 2017 11:13:34 ====

Steve Langasek (vorlon) wrote :

The indicated upstream patch, f50582649f8eee73f59aff95fadd9a963ed4ffea, does not apply cleanly against systemd 229. Are there known prerequisites for this patch?

Steve Langasek (vorlon) on 2017-01-07
Changed in systemd (Ubuntu Xenial):
assignee: nobody → Steve Langasek (vorlon)
milestone: none → ubuntu-16.04.2
Changed in systemd (Ubuntu Yakkety):
assignee: nobody → Steve Langasek (vorlon)
milestone: none → yakkety-updates

------- Comment on attachment From <email address hidden> 2017-01-07 21:12 EDT-------

It looks like the commit referenced earlier also builds upon the addition of the config_parse_tasks_max() function added in commit 6300502b.

The following example patch (based on systemd_229-4ubuntu14) focuses on just the UserTasksMax "infinity" parsing.
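
The attached patch itself is not reproduced in this report. As a rough illustration of the parsing rule it targets, the Python sketch below accepts "infinity" as "no limit" and otherwise expects an unsigned integer. It is not the systemd C code: parse_tasks_max, the UINT64_MAX sentinel, and the choice to map an empty value back to the default are all assumptions made for this example, and the real parser may accept forms not shown here.

```python
UINT64_MAX = 2**64 - 1     # stands in for "no limit" in this sketch
DEFAULT_TASKS_MAX = 12288  # the default observed earlier in this report


def parse_tasks_max(value: str, default: int = DEFAULT_TASKS_MAX) -> int:
    """Parse a UserTasksMax-style setting into a task limit."""
    text = value.strip()
    if text == "":
        # Assumption: an empty assignment falls back to the default.
        return default
    if text == "infinity":
        return UINT64_MAX
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"Failed to parse tasks maximum: {text}")
    number = int(text)
    if number >= UINT64_MAX:
        raise ValueError(f"tasks maximum out of range: {text}")
    return number


if __name__ == "__main__":
    for candidate in ("infinity", "3974272", ""):
        print(repr(candidate), parse_tasks_max(candidate))
```

Any other word, for example "unlimited", is rejected by this sketch.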

------- Comment on attachment From <email address hidden> 2017-01-07 21:15 EDT-------

A test build of systemd with the previous patch applied resolved the parsing issue in my testing.

# cat /etc/systemd/logind.conf.d/usertasks.conf
[Login]
UserTasksMax=infinity

# cat /sys/fs/cgroup/pids/user.slice/user-0.slice/pids.max
max
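
When scripting this check, note that the pids controller reports "max" rather than a number once the limit is lifted. A small sketch of handling both forms (parse_pids_max is a hypothetical helper; None is used here to mean "no limit"):

```python
from typing import Optional


def parse_pids_max(text: str) -> Optional[int]:
    """Convert the contents of a pids.max file; None means unlimited."""
    value = text.strip()
    if value == "max":
        return None
    return int(value)


if __name__ == "__main__":
    print(parse_pids_max("12288\n"))  # value seen before the patch
    print(parse_pids_max("max\n"))    # value seen with the patched build
```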

------- Comment From <email address hidden> 2017-01-07 21:22 EDT-------
cde00 (<email address hidden>) added native attachment /tmp/AIXOS06698101/systemd_infinity.debdiff on 2017-01-07 20:22:29
cde00 (<email address hidden>) added native attachment /tmp/AIXOS06698101/systemd_229.patch on 2017-01-07 20:22:29

Manoj Iyer (manjo) on 2017-01-23
Changed in systemd (Ubuntu):
assignee: Taco Screen team (taco-screen-team) → nobody
Steve Langasek (vorlon) on 2017-03-21
description: updated
Łukasz Zemczak (sil2100) wrote :

Confirmed that the following fix is part of 232 present in zesty. Marking as Fix Released for devel series.

Changed in systemd (Ubuntu):
status: New → Fix Released

Hello bugproxy, or anyone else affected,

Accepted systemd into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/231-9ubuntu4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in systemd (Ubuntu Yakkety):
status: New → Fix Committed
tags: added: verification-needed

------- Comment From <email address hidden> 2017-04-13 17:19 EDT-------
systemd-231-9ubuntu4 looks good for yakkety. It appears that there are other changes since user-0.slice was no longer present, but the parent pids.max is now set to "max":

root@kwr-yakkety:~# cat /sys/fs/cgroup/pids/user.slice/user-0.slice/pids.max
cat: /sys/fs/cgroup/pids/user.slice/user-0.slice/pids.max: No such file or directory

root@kwr-yakkety:~# cat /sys/fs/cgroup/pids/user.slice/pids.max
max

We are now just waiting on xenial for closure.

Thanks.

tags: added: verification-done-yakkety
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 231-9ubuntu4

---------------
systemd (231-9ubuntu4) yakkety; urgency=medium

  * debian/extra/units/systemd-resolved.service.d/resolvconf.conf: if
    resolved is going to be started, make sure this blocks
    network-online.target. LP: #1673860.
  * debian/patches/resolved-follow-CNAMES-for-DNS-stub-replies.patch:
    Cherry-pick upstream fix for resolved failing to follow CNAMES for DNS
    stub replies. LP: #1647031.
  * debian/patches/logind-update-empty-and-infinity-handling-for-User-T.patch:
    Cherry-pick upstream fix to handle empty and "infinity" values for
    [User]TasksMax. Closes LP: #1651518.

 -- Steve Langasek <email address hidden> Mon, 20 Mar 2017 22:14:14 -0700

Changed in systemd (Ubuntu Yakkety):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for systemd has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Manoj Iyer (manjo) on 2017-05-09
Changed in ubuntu-power-systems:
status: New → Fix Committed

------- Comment From <email address hidden> 2017-06-07 11:14 EDT-------
Canonical FYI: We are still waiting on status for the xenial release. Thanks.

Steve Langasek (vorlon) on 2017-06-07
Changed in systemd (Ubuntu Xenial):
assignee: Steve Langasek (vorlon) → Dimitri John Ledkov (xnox)
Changed in systemd (Ubuntu Xenial):
importance: Undecided → High
milestone: ubuntu-16.04.2 → ubuntu-16.04.3

Hello bugproxy, or anyone else affected,

Accepted systemd into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/229-4ubuntu18 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in systemd (Ubuntu Xenial):
status: New → Fix Committed
tags: added: verification-needed verification-needed-xenial

------- Comment From <email address hidden> 2017-07-10 11:46 EDT-------
This CMVC defect is being cancelled by the CDE Bridge because the corresponding CQ Defect [FW663285] was transferred out of the bridge domain.
Here are the additional details:
New Subsystem = ppc_triage
New Release = unspecified
New Component = ubuntu_linux
New OwnerInfo = Chavez, Luciano (<email address hidden>)
To continue tracking this issue, please follow CQ defect [FW663285].

Assigning for screening...

Created mirror request (28237) Canonical Launchpad.

Information on this bug will potentially be exposed to the public. Before you proceed, please make sure you read Content Guidelines for LTC Bugzilla : Confidential vs. Non-confidential[1].

[1] - ftp://ausgsa.ibm.com/projects/l/ltc/ToolsInfrastructure/ProjectStatus/Bugzilla/Bugzilla_Content_Education_v2.pdf
The bug is ready to be mirrored to:

Distro: Canonical Launchpad.
Project: ubuntu
Package: systemd

Dimitri John Ledkov (xnox) wrote :

I am failing to see what new comments got bridged.

It appears that https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1651518/comments/15 is a sync bug.

Anyway, a test patch for xenial has now been proposed and will undergo testing to fix the issue raised in the Xenial release. All other releases already have this fix.

Dimitri John Ledkov (xnox) wrote :

This is probably verification failed *sigh* will fix up:

Jul 11 13:14:52 systemd-sru-test systemd[1]: Starting Login Service...
Jul 11 13:14:52 systemd-sru-test systemd-logind[482]: [/etc/systemd/logind.conf:35] Failed to parse tasks maximum, ignoring: unlimited
Jul 11 13:14:52 systemd-sru-test systemd[1]: Started Login Service.

tags: added: verification-failed-xenial
removed: verification-needed-xenial
Dimitri John Ledkov (xnox) wrote :

Haha, maybe I should follow the verification steps correctly. Indeed "unlimited" is not a valid value, as the value "infinity" is introduced here, which does work correctly with systemd 229-4ubuntu18

tags: added: verification-done verification-done-xenial
removed: verification-failed-xenial verification-needed

Hello bugproxy, or anyone else affected,

Accepted systemd into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/229-4ubuntu19 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-needed verification-needed-xenial
removed: verification-done verification-done-xenial
Dimitri John Ledkov (xnox) wrote :

Starting with systemd 229-4ubuntu17.
Observed:
Jul 19 13:21:23 key-giraffe systemd-logind[458]: [/etc/systemd/logind.conf:35] Failed to parse uint64 value, ignoring: infinity

Upgraded to systemd 229-4ubuntu19
The error is gone.
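
For anyone repeating this verification, the logind warnings quoted in this report can be picked out of the journal mechanically. The sketch below is an assumption-laden example: the regular expression is derived only from the two log lines shown in this bug, and parse_logind_warning is a hypothetical helper.

```python
import re

# Matches "[<file>:<line>] <message>, ignoring: <value>" as seen in this report.
LOGIND_WARNING = re.compile(
    r"\[(?P<path>[^:\]]+):(?P<line>\d+)\] (?P<message>.+), ignoring: (?P<value>.*)$"
)


def parse_logind_warning(entry: str):
    """Return (file, line number, message, rejected value), or None."""
    match = LOGIND_WARNING.search(entry)
    if match is None:
        return None
    return match["path"], int(match["line"]), match["message"], match["value"]


if __name__ == "__main__":
    entry = (
        "Jul 19 13:21:23 key-giraffe systemd-logind[458]: "
        "[/etc/systemd/logind.conf:35] Failed to parse uint64 value, ignoring: infinity"
    )
    print(parse_logind_warning(entry))
```

An empty result after the upgrade, with UserTasksMax=infinity still in place, is consistent with the fix having taken effect.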

tags: added: verification-done verification-done-xenial
removed: verification-needed verification-needed-xenial
Manoj Iyer (manjo) on 2017-07-19
Changed in ubuntu-power-systems:
importance: Undecided → High
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 229-4ubuntu19

---------------
systemd (229-4ubuntu19) xenial; urgency=medium

  * debian/extra/units/systemd-resolved.service.d/resolvconf.conf: partially
    revert, by removing ExecStart|StopPost lines, as these are not needed on
    xenial and generate warnings in the journal. (LP: #1704677)

systemd (229-4ubuntu18) xenial; urgency=medium

  * debian/extra/units/systemd-resolved.service.d/resolvconf.conf: if resolved
    is going to be started, make sure this blocks network-online.target.
    (LP: #1673860)
  * networkd: cherry-pick support for setting bridge port's priority
    (LP: #1668347)
  * Cherrypick upstream commit to enable system use kernel maximum limit for
    RLIMIT_NOFILE instead of hard-coded (low) limit of 65536. (LP: #1686361)
  * Cherrypick upstream patch for platform predictable interface names.
    (LP: #1686784)
  * resolved: fix null pointer dereference crash (LP: #1621396)
  * Cherrypick core/timer downgrade message about random time addition
    (LP: #1692136)
  * SECURITY UPDATE: Out-of-bounds write in systemd-resolved (LP: #1695546)
    - CVE-2017-9445
  * Cherry-pick subset of patches to introduce infinity value in logind.conf
    for UserTasksMax (LP: #1651518)

 -- Dimitri John Ledkov <email address hidden> Mon, 17 Jul 2017 17:00:42 +0100

Changed in systemd (Ubuntu Xenial):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released