Cannot allocate memory if process owned by user with large number of groups
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
procps (Ubuntu) | Fix Released | High | Dave Chiluk |
Precise | Fix Released | High | Dave Chiluk |
Quantal | Fix Released | High | Dave Chiluk |
Raring | Fix Released | High | Dave Chiluk |
Bug Description
[Impact]
* Users that are members of large numbers of groups are unable to run procps commands, and are instead greeted with a segfault or "failed to allocate XXXXXXXXX bytes of memory".
* Many enterprise or university users create large numbers of groups in order to effectively manage access in their infrastructure. Often admins end up being added to most of these groups, and as a result they cannot use procps commands.
* Before this update, procps statically allocated a buffer of 1024 bytes for reading all of /proc/#/status. This file includes the list of GIDs that a user is a member of. The attached patches backport a45dace4 and 95d01362 from upstream procps-ng, which modify file2str to use a dynamically allocated buffer instead.
[Test Case]
* sudo useradd bob; for i in {1..800}; do sudo groupadd group$i; sudo adduser bob group$i; done; sudo -u bob ps
* The above command returns "failed to allocate XXXXXXXXXXX bytes of memory".
[Regression Potential]
* Regressions are most likely to manifest in corruption of data that is available from the /proc/<
* The above being said, I feel as though regression potential is fairly minimal as this is mostly a backport of upstream commits.
* This fix is now available as part of this ppa https:/
[Other Info]
* procps is a horrible no-good very bad codebase, and we should upgrade to procps-ng as soon as possible. Available https:/
-------
Both ps and pgrep exit with an error like "xrealloc: realloc(1073741824) failedCannot allocate memory" if there is a process owned by a user with a large number of groups.
I suspect this was introduced with a recent kernel patch which no longer limits the number of groups returned by /proc/<pid>/status to 32. It affects Ubuntu kernel 3.2.0-38.60 and newer.
Links
* related kernel bug report - https:/
* kernel patch - https:/
* kernel 3.2.0-38.60 changelog - https:/
strace ps output where it fails when it gets to the process owned by a user with a large number of groups
open("/
read(6, "Name:\
close(6) = 0
mmap(NULL, 135168, PROT_READ|
mremap(
mremap(
mremap(
mremap(
mremap(
mremap(
mremap(
mremap(
mremap(
mremap(
mremap(
mremap(
mremap(
mmap(NULL, 1073745920, PROT_READ|
brk(0x4133c000) = 0x1333000
mmap(NULL, 1073876992, PROT_READ|
open("/
read(6, "0\n", 8192) = 2
close(6) = 0
mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|
munmap(
munmap(
mprotect(
mmap(NULL, 1073745920, PROT_READ|
write(2, "xrealloc: realloc(1073741824) failed", 36xrealloc: realloc(1073741824) failed) = 36
write(2, "Cannot allocate memory\n", 23Cannot allocate memory
) = 23
exit_group(1) = ?
Steps I was able to use to reproduce the problem with all local users and groups. The number of groups needed to break ps may be different on other systems.
root@alowther-
root@alowther-
root@alowther-
No directory, logging in with HOME=/
PID TTY TIME CMD
5182 pts/0 00:00:00 su
5183 pts/0 00:00:00 sh
5185 pts/0 00:00:00 ps
root@alowther-
root@alowther-
xrealloc: realloc(1073741824) failedCannot allocate memory
System info - I also tried using the version of procps from Raring, but it still failed
root@alowther-
Description: Ubuntu 12.04.2 LTS
Release: 12.04
root@alowther-
Linux alowther-d02 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
root@alowther-
ii procps 1:3.2.8-11ubuntu6 /proc file system utilities
root@alowther-
procps:
Installed: 1:3.2.8-11ubuntu6
Candidate: 1:3.3.3-2ubuntu5
Version table:
1:
500 http://
*** 1:3.2.8-11ubuntu6 0
500 http://
100 /var/lib/
Changed in procps (Ubuntu): | |
importance: | Undecided → High |
status: | Confirmed → Triaged |
tags: | added: precise |
tags: | added: raring |
Changed in procps (Ubuntu): | |
assignee: | nobody → Matthias Klose (doko) |
description: | updated |
Changed in procps (Ubuntu): | |
status: | Triaged → In Progress |
tags: | added: quantal saucy |
Changed in procps (Ubuntu): | |
status: | In Progress → Confirmed |
Changed in procps (Ubuntu Precise): | |
status: | New → Triaged |
Changed in procps (Ubuntu Quantal): | |
status: | New → Triaged |
Changed in procps (Ubuntu Raring): | |
status: | New → Triaged |
Changed in procps (Ubuntu Precise): | |
importance: | Undecided → High |
Changed in procps (Ubuntu Quantal): | |
importance: | Undecided → High |
Changed in procps (Ubuntu Raring): | |
importance: | Undecided → Critical |
importance: | Critical → High |
Changed in procps (Ubuntu Precise): | |
status: | Triaged → Confirmed |
Changed in procps (Ubuntu Raring): | |
status: | Triaged → Confirmed |
Changed in procps (Ubuntu Quantal): | |
status: | Triaged → Confirmed |
Changed in procps (Ubuntu Precise): | |
assignee: | nobody → Dave Chiluk (chiluk) |
Changed in procps (Ubuntu Quantal): | |
assignee: | nobody → Dave Chiluk (chiluk) |
Changed in procps (Ubuntu Raring): | |
assignee: | nobody → Dave Chiluk (chiluk) |
tags: | added: verification-quantal-done |
tags: |
added: verification-done-quantal verification-done-saucy removed: verification-quantal-done verification-saucy-done |
tags: | added: verification-done-raring |
Changed in procps (Ubuntu Quantal): | |
status: | Fix Committed → Fix Released |
Changed in procps (Ubuntu Precise): | |
status: | Fix Committed → Fix Released |
Changed in procps (Ubuntu Raring): | |
status: | Fix Committed → Fix Released |
I'm not sure of the best way to fix this, but I have located the problem.
I've found this bug to be caused by a Debian patch, "ps_supgid_display.patch", which originally came from the bug report at http://bugs.debian.org/506303
When any /proc/<pid>/status has more than 1024 characters before the end of the Groups line, an infinite loop is entered because the loop never finds the "\n" character it expects. Looking back at the strace, the read of "/proc/11860/status" is truncated. A successful part of the strace is:
stat("/proc/1110", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/1110/stat", O_RDONLY) = 6
read(6, "1110 (npcd) S 1 1108 1108 0 -1 4202816 509903 56395239 0 129102 205 836 161563 37679 20 0 1 0 3191 243957760 263 18446744073709551615 4194304 4219268 140737259370048 140737259368256 140200938633277 0 0 0 16387 18446744071579436968 0 0 17 0 0 0 1348 0 0\n", 1023) = 253
While the loop runs, it keeps allocating more memory for the array of Group IDs that it has found associated with the process until it fails because there isn't enough memory to allocate.
The while loop below, from the patch, is what runs indefinitely. The xrealloc call allocates more memory, doubling the amount it requests each time the loop thinks the array is full.
+ case_Groups:
+     isupgid = 0;
+     if (*S != '\n'){ // Is there any supplementary group ?
+         P->supgid = (int *) xmalloc(0x0004 * sizeof(int));
+         int vctsize = 0x0004;
+         while (S[1] != '\n' && isupgid<INT_MAX){ // There is one blank before '\n'
+             if (isupgid == vctsize){
+                 vctsize *= 2;
+                 P->supgid = (int *)xrealloc(P->supgid, vctsize * sizeof(int));
+             }
+             P->supgid[isupgid++] = strtol(S,&S,10);
+             P->nsupgid++;
+         }
+     }
+     continue;