Ubuntu

autofs5 eats the cpu if you have large groups

Reported by Joel Ebel on 2010-06-08
This bug affects 1 person
Affects: autofs5 (Ubuntu); Importance: Undecided; Assigned to: Unassigned
Affects: Lucid; Importance: Undecided; Assigned to: Unassigned

Bug Description

Binary package hint: autofs5

The issue is in lib/mounts.c, set_tsd_user_vars - editing out the boring bits, it looks like this:

grplen = sysconf(_SC_GETGR_R_SIZE_MAX);
while (1) {
        char *tmp = realloc(gr_tmp, tmplen+1);
        /* ... */
        status = getgrgid_r(gid, pgr, gr_tmp, tmplen, ppgr);
        if (status != ERANGE)
                break;
        tmplen += grplen;       /* grows by a fixed increment on every retry */
}

It's trying to get the members of the user's primary group but doesn't know how big a buffer to allocate, so it keeps retrying, growing the buffer each time, until it is big enough. The increment is only 1024 bytes, however, so with a large group it takes several hundred iterations to reach a big enough buffer.
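To make the cost concrete, here is a small sketch in C that counts how many times the buffer has to grow before it reaches a given size. It is only an illustration: `count_growth_steps`, `grow_fn`, `grow_linear` and `grow_double` are made-up names, not autofs code, and the function assumes `start` is greater than zero.

```c
#include <stddef.h>

/* Growth policies for the retry buffer (illustrative, not autofs code). */
typedef size_t (*grow_fn)(size_t len, size_t step);

static size_t grow_linear(size_t len, size_t step) { return len + step; }
static size_t grow_double(size_t len, size_t step) { (void)step; return len * 2; }

/*
 * Count how many times a buffer starting at `start` bytes must be grown
 * before it holds `needed` bytes. Each growth step stands for one failed
 * getgrgid_r() call that returned ERANGE. Assumes start > 0.
 */
static size_t count_growth_steps(size_t start, size_t needed, grow_fn grow)
{
    size_t len = start;
    size_t steps = 0;

    while (len < needed) {
        len = grow(len, start);
        steps++;
    }
    return steps;
}
```

For a hypothetical group entry needing 750 KiB, `count_growth_steps(1024, 750 * 1024, grow_linear)` comes out at 749 retries, while the same call with `grow_double` comes out at 10.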

This shouldn't be relying on _SC_GETGR_R_SIZE_MAX to give a reasonable increment. See http://<email address hidden>/msg40443.html for some discussion about whether the value of _SC_GETGR_R_SIZE_MAX should really be that low, but it seems that Debian decided it should be, and the man page was wrong.
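One way to read this is that the sysconf() value is at best a starting hint. The sketch below, in C, shows that idea under stated assumptions: `initial_group_buflen` and the `GROUP_BUF_FALLBACK` value of 1024 are hypothetical and do not come from autofs.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical fallback used when sysconf() gives no usable hint. */
#define GROUP_BUF_FALLBACK 1024

/*
 * Return a starting size for the getgrgid_r() buffer. The sysconf() value
 * is treated only as an initial hint: it may be -1 (indeterminate), and a
 * real group entry may need far more space than it suggests.
 */
static size_t initial_group_buflen(void)
{
    long hint = sysconf(_SC_GETGR_R_SIZE_MAX);

    if (hint <= 0)
        return GROUP_BUF_FALLBACK;
    return (size_t)hint;
}
```

The caller would still need to retry on ERANGE; the hint only decides where the retries start.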

I've verified that bumping the increment value by 1000x fixes the issue, and stat'ing non-existent homedirs is now instantaneous.

Joel Ebel (jbebel) on 2010-06-08
tags: added: glucid
Joel Ebel (jbebel) wrote :

One option would be to double tmplen each pass. That would make it take, in my case, 10 tries, rather than ~750.

so:

- tmplen += grplen;
+ tmplen *= 2;
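For context, a doubling retry loop around getgrgid_r() might look roughly like the following C sketch. This is not the autofs code or the actual patch: `lookup_group`, its parameters, and the 1024-byte starting size are assumptions made for illustration.

```c
#include <errno.h>
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Look up `gid`, doubling the scratch buffer whenever getgrgid_r() reports
 * ERANGE. On success *buf owns the storage that the fields of *grp point
 * into, and the caller must free(*buf). Returns 0 on success (with *found
 * set to 1 if the group exists, 0 otherwise) or an errno value on failure.
 */
static int lookup_group(gid_t gid, struct group *grp, char **buf, int *found)
{
    size_t len = 1024;              /* hypothetical starting size */
    char *tmp_buf = NULL;
    struct group *result = NULL;
    int status;

    *buf = NULL;
    *found = 0;

    for (;;) {
        char *tmp = realloc(tmp_buf, len);
        if (!tmp) {
            free(tmp_buf);
            return ENOMEM;
        }
        tmp_buf = tmp;

        status = getgrgid_r(gid, grp, tmp_buf, len, &result);
        if (status != ERANGE)
            break;

        if (len > (size_t)-1 / 2) { /* doubling would overflow */
            free(tmp_buf);
            return ERANGE;
        }
        len *= 2;                   /* geometric growth, as proposed above */
    }

    if (status != 0) {
        free(tmp_buf);
        return status;
    }
    *buf = tmp_buf;
    *found = (result != NULL);
    return 0;
}
```

Doubling may leave the buffer up to twice as large as strictly needed, but the number of failed lookups grows only logarithmically with the size of the group entry.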

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package autofs5 - 5.0.5-0ubuntu2

---------------
autofs5 (5.0.5-0ubuntu2) maverick; urgency=low

  [Joel Ebel]
  * debian/patches/16group_buffer_size.patch: Increase group buffer size
    geometrically rather than linearly when its found to be small.
    (LP: #591100)

  [Chuck Short]
  * debian/control: Fix conflict resolution. (LP: #520601)
 -- Chuck Short <email address hidden> Wed, 30 Jun 2010 08:06:45 -0400

Changed in autofs5 (Ubuntu):
status: New → Fix Released
Joel Ebel (jbebel) wrote :

The group size patch would be very helpful for us to have included in Lucid.

Joel Ebel (jbebel) wrote :

SRU team, as mentioned, this bug causes significant delays in automounting when your primary group is large (thousands of users). The bug has been addressed by increasing the group buffer size geometrically rather than linearly. If the primary group is small, there will be no change. The patch I originally provided still applies to the version in Lucid.

TEST CASE: With a primary group that has thousands of users, attempt to access something in an automount-managed directory.
Expected result: It succeeds or fails quickly.
Prior to this fix, it takes several seconds.

There are no expected regressions from this patch.

Martin Pitt (pitti) wrote :

SRU ack, please upload.

Benjamin Drung (bdrung) wrote :

uploaded the attached one to lucid-proposed

Changed in autofs5 (Ubuntu Lucid):
status: New → Fix Committed

Accepted into lucid-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

tags: added: verification-needed
Joel Ebel (jbebel) wrote :

We have been testing this exact patch (with the exception of subtle changelog variations, verified by debdiff) for 2 months now on thousands of machines. It greatly improves autofs performance in a large group environment. I've briefly tested the package uploaded to proposed to verify that it still works and improves performance.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package autofs5 - 5.0.4-3.1ubuntu5.1

---------------
autofs5 (5.0.4-3.1ubuntu5.1) lucid-proposed; urgency=low

  * Increase group buffer size geometrically rather than linearly when it is
    found to be too small (LP: #591100).
 -- Joel Ebel <email address hidden> Tue, 17 Aug 2010 10:44:24 +0200

Changed in autofs5 (Ubuntu Lucid):
status: Fix Committed → Fix Released