[SRU] [scalability] NC does not detach created pthreads in KVM driver

Bug #567371 reported by Daniel Nurmi on 2010-04-20
Affects             | Status       | Importance | Assigned to     | Milestone
Eucalyptus          | Fix Released | High       | Unassigned      |
eucalyptus (Ubuntu) |              | High       | Dustin Kirkland |
Lucid               |              | High       | Dustin Kirkland |

Bug Description

In the KVM NC driver, a pthread attribute object was set up to detach each new pthread on creation, but it was never passed to pthread_create(). The new threads were therefore not detached, and the NC eventually becomes unable to start a new pthread. The fix is in revno 1223.
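
For illustration, here is a minimal sketch of the pattern the description refers to. It is not the code from node/handlers_kvm.c: the names spawn_detached and startup_thread are hypothetical, and it assumes the problem call passed something other than the prepared attribute object (for example NULL, which creates a joinable thread).

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker standing in for the NC's VM startup routine. */
static void *startup_thread(void *arg)
{
    (void)arg;
    return NULL;
}

/*
 * Create a thread in the detached state.
 * Returns 0 on success, or the pthread error number on failure.
 */
static int spawn_detached(void *(*fn)(void *), void *arg)
{
    pthread_t tid;
    pthread_attr_t attr;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0) {
        /*
         * The attribute object must actually be passed here. With NULL
         * in place of &attr the thread is created joinable, and if it is
         * never joined its resources are not released when it exits.
         */
        rc = pthread_create(&tid, &attr, fn, arg);
    }

    pthread_attr_destroy(&attr);
    return rc;
}
```

A caller would treat a non-zero return from spawn_detached() as a failure to start the thread, which is the condition the NC eventually runs into when earlier threads are neither detached nor joined.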

======
IMPACT:
 * This bug affects anyone running hundreds or thousands of UEC instances.

ADDRESSED:
 * This bug is addressed by cherry-picking an upstream commit from their stable branch that fixes the leak.

REPRODUCE:
 * This bug can be hard to reproduce. The most direct way is to deploy a UEC with a single Node Controller (NC) and run many thousands of instances on that one NC over time. Eventually the NC will fail, hitting the limit on undetached pthreads in the KVM driver.

REGRESSION POTENTIAL:
 * The node could be otherwise affected. However, in our testing of this fix, we have run thousands of UEC instances, and have confidence in the fix's stability.
======

Dustin Kirkland  (kirkland) wrote :

When this happens, the error message is something like "failed to spawn a VM startup thread" in nc.log.

Changed in eucalyptus (Ubuntu):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Dustin Kirkland (kirkland)
milestone: none → lucid-updates
Dustin Kirkland  (kirkland) wrote :

I haven't been able to reproduce this issue yet, to confirm the fix.

I'm targeting this at lucid-updates for an SRU.

We'll start preparing an upload to lucid-updates as of now.

summary: - NC does not detach created pthreads in KVM driver
+ [scalability] NC does not detach created pthreads in KVM driver
description: updated
summary: - [scalability] NC does not detach created pthreads in KVM driver
+ [SRU] [scalability] NC does not detach created pthreads in KVM driver

Accepted eucalyptus into lucid-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in eucalyptus (Ubuntu Lucid):
status: In Progress → Fix Committed
tags: added: verification-needed
Martin Pitt (pitti) wrote :

Copied lucid-proposed to maverick.

Changed in eucalyptus (Ubuntu):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eucalyptus - 1.6.2-0ubuntu30.1

---------------
eucalyptus (1.6.2-0ubuntu30.1) lucid-proposed; urgency=low

  Address LP: #565101
  * debian/eucalyptus.conf: set default JVM_MEM option
  * debian/eucalyptus-common.eucalyptus.upstart: use $JVM_MEM
    from eucalyptus.conf, or default to 512m
  * tools/eucalyptus.conf.5: document the JVM_MEM option

  Cherry-pick upstream commit r1223..1227:
  * node/handlers.c, node/handlers_kvm.c: handle situation where NC's
    do not detach pthreads, LP: #567371
  * node/handlers_kvm.c: fix console bug (was only showing first 64K),
    LP: #566793
  * clc/modules/storage-common/src/main/java/edu/ucsb/eucalyptus/storage/StorageManager.java,
    clc/modules/storage-common/src/main/java/edu/ucsb/eucalyptus/storage/fs/FileSystemStorageManager.java,
    clc/modules/walrus/src/main/java/edu/ucsb/eucalyptus/cloud/ws/WalrusImageManager.java,
    clc/modules/walrus/src/main/java/edu/ucsb/eucalyptus/cloud/ws/WalrusManager.java,
    clc/modules/wsstack/src/main/java/com/eucalyptus/ws/handlers/ServiceSinkHandler.java:
    - fix Walrus OOM errors (java heap), LP: #565101
 -- Dustin Kirkland <email address hidden> Wed, 28 Apr 2010 08:43:38 -0500

Changed in eucalyptus (Ubuntu Lucid):
status: Fix Committed → Fix Released
Martin Pitt (pitti) on 2010-05-03
Changed in eucalyptus (Ubuntu Lucid):
status: Fix Released → Fix Committed
Dustin Kirkland  (kirkland) wrote :

Dan-

Can you please install the package from lucid-proposed and confirm that this issue is fixed there, so that we can move this from -proposed to -updates?

Thanks!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eucalyptus - 1.6.2-0ubuntu30.2

---------------
eucalyptus (1.6.2-0ubuntu30.2) lucid-proposed; urgency=low

  * Revert: node/handlers_kvm.c: fix console bug (was only showing first 64K),
    LP: #566793
  * clc/modules/www/src/main/java/edu/ucsb/eucalyptus/admin/server/EucalyptusWebBackendImpl.java:
    - fix user enumeration and account brute force, LP: #579942
  * debian/eucalyptus-sc.upstart: Bump maximum number of loop devices for
    SC to 512, LP: #586134
 -- Chris Cheney <email address hidden> Fri, 04 Jun 2010 00:39:00 -0500

Changed in eucalyptus (Ubuntu Lucid):
status: Fix Committed → Fix Released
Changed in eucalyptus:
status: Fix Committed → Fix Released