[SRU] connection leak in rpc connection pool

Bug #968843 reported by MotoKen on 2012-03-30
This bug affects 1 person
Affects                    Importance   Assigned to
OpenStack Compute (nova)   High         Chris Behrens
Essex                      Undecided    Unassigned
nova (Ubuntu)              Undecided    Unassigned
Precise                    Undecided    Chuck Short

Bug Description

When an exception occurs in MulticallWaiter, the connection is not put back into the pool.

To reproduce:
1. Set a small pool size: rpc_conn_pool_size=1
2. Cause an RPC response timeout ("Timed out waiting for RPC response" appears in the log).
3. The next rpc.call then waits forever to get a connection from the pool.
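
The failure mode described above can be illustrated with a toy model. This is only a sketch, not nova's code: ToyPool, RpcTimeout, leaky_call and demo_leak are hypothetical names, the pool is a plain queue.Queue, and a short timeout stands in for the indefinite wait so the exhaustion can be observed.

```python
import queue


class RpcTimeout(Exception):
    """Stand-in for the 'Timed out waiting for RPC response' error."""


class ToyPool:
    """Hypothetical fixed-size connection pool (not nova's real class)."""

    def __init__(self, size):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put("conn-%d" % i)

    def get(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        return self._free.get(timeout=timeout)

    def put(self, conn):
        self._free.put(conn)


def leaky_call(pool, fail):
    conn = pool.get(timeout=0.1)
    if fail:
        # The exception skips pool.put() below, so conn is never returned.
        raise RpcTimeout("Timed out waiting for RPC response")
    pool.put(conn)
    return "ok"


def demo_leak():
    pool = ToyPool(size=1)  # mirrors rpc_conn_pool_size=1
    try:
        leaky_call(pool, fail=True)
    except RpcTimeout:
        pass
    try:
        return leaky_call(pool, fail=False)
    except queue.Empty:
        # In the report the caller waits forever; the toy uses a short
        # timeout instead so the empty pool shows up as an error.
        return "pool exhausted"
```

With a pool of size 1, a single failed call is enough to leave the pool empty, so the second call cannot obtain a connection.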

Fix proposed to branch: master
Review: https://review.openstack.org/5984

Changed in nova:
assignee: nobody → MotoKen (motokentsai)
status: New → In Progress
Chris Behrens (cbehrens) on 2012-04-25
Changed in nova:
assignee: MotoKen (motokentsai) → Chris Behrens (cbehrens)
importance: Undecided → High

Fix proposed to branch: master
Review: https://review.openstack.org/6804

Reviewed: https://review.openstack.org/6804
Committed: http://github.com/openstack/nova/commit/208c635a7d064fafc14dab97172c98cd5d8e6fc6
Submitter: Jenkins
Branch: master

commit 208c635a7d064fafc14dab97172c98cd5d8e6fc6
Author: Chris Behrens <email address hidden>
Date: Wed Apr 25 17:34:53 2012 +0000

    Don't leak RPC connections on timeouts or other exceptions

    Fixes bug 968843

    Change-Id: I9e0f1e306cab203bf4c865050b7a45f96127062e
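
The commit contents are not reproduced in this report. As a general illustration of how a connection can be returned on timeouts or other exceptions, the sketch below uses try/finally inside a context manager. All names here (pooled_connection, safe_call, demo_fix, RpcTimeout) are hypothetical, and this is not the actual patch.

```python
import contextlib
import queue


class RpcTimeout(Exception):
    """Stand-in for the 'Timed out waiting for RPC response' error."""


@contextlib.contextmanager
def pooled_connection(pool):
    """Borrow a connection and always give it back, even on error."""
    conn = pool.get()
    try:
        yield conn
    finally:
        # Runs on normal exit and when an exception propagates.
        pool.put(conn)


def safe_call(pool, fail):
    with pooled_connection(pool) as conn:
        if fail:
            raise RpcTimeout("Timed out waiting for RPC response")
        return "ok via %s" % conn


def demo_fix():
    pool = queue.Queue()
    pool.put("conn-0")  # a pool of size 1, as in the reproduction steps
    try:
        safe_call(pool, fail=True)
    except RpcTimeout:
        pass
    # The connection was returned, so the next call succeeds.
    return safe_call(pool, fail=False), pool.qsize()
```

Putting the release in a finally clause means the caller does not need to remember a cleanup step on each error path.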

Changed in nova:
status: In Progress → Fix Committed

Reviewed: https://review.openstack.org/6833
Committed: http://github.com/openstack/nova/commit/48a07680b46b9973cd7de1b30ae80bd93861e1bb
Submitter: Jenkins
Branch: stable/essex

commit 48a07680b46b9973cd7de1b30ae80bd93861e1bb
Author: Chris Behrens <email address hidden>
Date: Wed Apr 25 17:34:53 2012 +0000

    Don't leak RPC connections on timeouts or other exceptions

    Fixes bug 968843

    Change-Id: I9e0f1e306cab203bf4c865050b7a45f96127062e

tags: added: in-stable-essex
Devin Carlen (devcamcar) on 2012-05-22
Changed in nova:
milestone: none → folsom-1
Thierry Carrez (ttx) on 2012-05-23
Changed in nova:
status: Fix Committed → Fix Released
Chuck Short (zulcss) on 2012-05-30
Changed in nova (Ubuntu):
status: New → In Progress
Changed in nova (Ubuntu Precise):
status: New → In Progress
Chuck Short (zulcss) on 2012-06-07
summary: - connection leak in rpc connection pool
+ [SRU] connection leak in rpc connection pool
Chuck Short (zulcss) wrote :

** Impact **

In nova, when an exception occurs in MulticallWaiter, the scheduler won't put the connection back into the pool, preventing Nova from running properly.

** Development Fix **

This has been addressed in https://review.openstack.org/6804 and fixed in quantal.

** Stable Fix **

This has been addressed in https://review.openstack.org/6833

** Test Case **

1. Set a small pool size: rpc_conn_pool_size=1
2. Cause an RPC response timeout ("Timed out waiting for RPC response" appears in the log).
3. The next rpc.call then waits forever to get a connection from the pool.

** Regression Potential **

Minimal; this is an edge case.

Chuck Short (zulcss) on 2012-06-08
Changed in nova (Ubuntu Precise):
assignee: nobody → Chuck Short (zulcss)
milestone: none → ubuntu-12.04.1

Hello MotoKen, or anyone else affected,

Accepted nova into precise-proposed. The package will build now and be available in a few hours. Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users. If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nova (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
Adam Gandelman (gandelman-a) wrote :

Please find the attached Jenkins job results from the Ubuntu Server Team's CI
infrastructure. As part of the verification process for this bug, Nova has
been deployed and configured across multiple nodes using precise-proposed as
an installation source. After successful bring-up and configuration of the
cluster, a number of exercises and smoke tests have been invoked to ensure the
updated package did not introduce any regressions. A number of test iterations
were carried out to catch any possible transient errors.

Note the list of installed packages at the top and bottom of the report.

For records of upstream test coverage of this update, please see the
Jenkins links in the comments of the relevant upstream code-review:

https://review.openstack.org/6833

As per the provisional Micro Release Exception granted to this package by
the Technical Board, we hope this contributes toward verification of this
update.

Dave Walker (davewalker) on 2012-07-03
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nova - 2012.1+stable~20120612-3ee026e-0ubuntu1

---------------
nova (2012.1+stable~20120612-3ee026e-0ubuntu1) precise-proposed; urgency=low

  * New upstream snapshot. (LP: #1010473)
  * Dropped, superseeded by new snapshot:
    - debian/patches/upstream/0001-fix-bug-where-nova-ignores-glance-host-in-imageref.patch
    - debian/patches/upstream/0002-Stop-libvirt-test-from-deleting-instances-dir.patch
    - debian/patches/upstream/0003-Allow-unprivileged-RADOS-users-to-access-rbd-volumes.patch
    - debian/patches/upstream/0004-Fixed-bug-962840-added-a-test-case.patch
    - debian/patches/upstream/0005-Populate-image-properties-with-project_id-again.patch
    - debian/patches/upstream/0006-Use-project_id-in-ec2.cloud._format_image.patc
    - debian/patches/CVE-2012-2101.patch
    - debian/patches/CVE-2012-2654.patch
  * Resynchronize with stable/essex:
    - 3ee026e Only invoke .lower() on non-None protocols. (LP: #1010514)
    - f0a9f47 Create a utf8 version of the dns_domains table. (LP: #993663)
    - 84a43e1 Report memory correctly on Xen. (LP: #997014)
    - 8c72924 Add libvirt get_console_output tests: pty and file. (LP: #990237)
    - 4e423cd Fix Multi_Scheduler to process host capabilities. (LP: #1000403)
    - 4aea7f1 Nail pep8 dependencies to 1.0.1
    - 2b3bbc4 handle updated qemu-img info output. (LP: #1000261)
    - 2d7d51c Fix type of snapshot_id column to match db. (LP: #962615)
    - ec70c69 Generate a Changelog for Nova
    - e5e890f Fix nova.tests.test_nova_rootwrap on Fedora 17. (LP: #992916)
    - 9e9a554 Ec2 handle strings with "0x" (LP: #983206)
    - 26dc6b7 QuantumManager will start dnsmasq during startup. Fixes (LP: #977759)
    - 7028d66 Introduced flag base_dir_name. (LP: #973194)
    - 76b525a Get unit tests functional in OS X.
    - facb936 Update KillFilter to handle 'deleted' exe's. (LP: #967931)
    - 1209af4 Checks if value is string or not before decode. (LP: #952176)
    - 1209af4 Fix timeout in EC2 CloudController.create_image(). (LP: #989764)
    - 108e74b Re-add console_log from console_console_output(). (LP: #987335)
    - 48a0768 Don't leak RPC connections on timeouts or other exceptions. (LP: #968843)
    - 7c64de9 Cloudpipe tap vpn not always working. (LP: #975043)
    - 5ab5051 add libvirt_inject_key flag fix (LP: #971640)
    - 6c68ef5 Xen: Pass session to destroy_vdi. (LP: #988615)
    - 015744e Delete fixed_ips when network is deleted. (LP: #754900)
  * Add debian/scripts/changelog.sh to help generate the changelog.
  * Add debian/nova-common.docs:
    - Include changelog and README.rst
  * debian/rules: Generate a tarball from git snapshot.
  * debian/patches/fix-pep8-errors.patch: Fix pep8 errors due to pep8 upstream
    migration.
 -- Chuck Short <email address hidden> Tue, 05 Jun 2012 09:50:59 -0400

Changed in nova (Ubuntu Precise):
status: Fix Committed → Fix Released
Chuck Short (zulcss) on 2012-07-31
Changed in nova (Ubuntu):
status: In Progress → Fix Released
Thierry Carrez (ttx) on 2012-09-27
Changed in nova:
milestone: folsom-1 → 2012.2