glance-api unresponsive during long-lived I/O-bound operations
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Glance | Fix Released | Critical | Eoghan Glynn | |
Bug Description
The following commit changed image downloading when using copy_from to be asynchronous:
commit 41c164139cab619
Author: Eoghan Glynn <email address hidden>
Date: Wed Sep 5 14:33:47 2012 +0000
Asynchronously copy from external image source
Fixes bug 1008874, bug 1046433.
Avoid tieing up dispatch thread for large copy-from images,
instead initiate copy asynchronously.
The response status is not set to 202 Accepted as per standard
RESTful idiom, as a non-error response code change requires
an API version bump.
Instead, the incomplete nature of the image registration is
reflected in the image status.
Change-Id: I06692422490de0
Unfortunately, it appears that there is a greenthread scheduling problem that still leaves glance-api in a bad state while the image is downloaded.
[rbryant@
Added new image with ID: 9717656c-
real 0m0.715s
user 0m0.117s
sys 0m0.033s
[rbryant@
ID Name Disk Format Container Format Size
-------
9717656c-
real 1m11.992s
user 0m0.109s
sys 0m0.022s
[rbryant@
ID Name Disk Format Container Format Size
-------
9717656c-
real 1m1.287s
user 0m0.124s
sys 0m0.061s
(repeat 'time glance index' 4 more times, with times varying from 1 to 1.5 minutes)
[rbryant@
ID Name Disk Format Container Format Size
-------
9717656c-
real 0m24.255s
user 0m0.125s
sys 0m0.034s
[rbryant@
ID Name Disk Format Container Format Size
-------
9717656c-
real 0m0.443s
user 0m0.110s
sys 0m0.017s
(all further instances return quickly like this)
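The reproduction above can be scripted so that the latency of each call is recorded while the copy is still in progress. The following is only a minimal sketch and not part of the original report: `time_command` and `sample_latency` are hypothetical helper names, and the only thing assumed about the CLI is the `glance index` invocation already shown in the transcript.

```python
import subprocess
import time


def time_command(argv):
    """Run argv once and return its wall-clock duration in seconds."""
    start = time.monotonic()
    # Output is discarded; only the elapsed time is of interest here.
    subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.monotonic() - start


def sample_latency(argv, runs=8):
    """Time `runs` back-to-back invocations of argv (hypothetical helper)."""
    return [time_command(argv) for _ in range(runs)]
```

Calling `sample_latency(["glance", "index"])` right after starting a copy_from upload in another shell should show the same pattern as above if the problem is present: a run of slow calls followed by fast ones once the download completes.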
description: | updated |
tags: | added: folsom-rc-potential |
Changed in glance: | |
assignee: | nobody → Eoghan Glynn (eglynn) |
milestone: | none → grizzly-1 |
status: | New → In Progress |
importance: | Undecided → Critical |
tags: | removed: folsom-rc-potential |
no longer affects: | glance/grizzly |
no longer affects: | glance/folsom |
summary: |
- glance-api unresponsive while downloading an image with copy_from
+ glance-api unresponsive during long-lived I/O-bound operations |
Changed in glance: | |
milestone: | folsom-rc2 → 2012.2 |
Recently the glance copy-from logic was made asynchronous, so that the response is returned immediately, without waiting for the download from the remote location, which proceeds on a greenthread.
This change was the obvious candidate for causing the blockage on the subsequent API calls.
However, it turns out that with copy-from reverted to its original synchronous form, or even with just a slow direct upload, concurrent API calls are still blocked.
So there's something more fundamental awry on the glance dispatch path.
Continuing to investigate ...
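For readers unfamiliar with cooperative scheduling, the symptom described here (one long-lived operation stalling unrelated requests) is what happens in general when a task holds the scheduler without yielding. The sketch below illustrates that general effect only; it is not a diagnosis of this bug, whose cause is still under investigation at this point. It uses the standard-library `asyncio` as a stand-in for greenthreads, and every name in it (`probe`, `blocking_copy`, `cooperative_copy`, `measure`) is hypothetical.

```python
import asyncio
import time


async def probe(latencies, ticks=5, interval=0.01):
    """Stand-in for concurrent API calls: record how late each wake-up is."""
    for _ in range(ticks):
        start = time.monotonic()
        await asyncio.sleep(interval)
        latencies.append(time.monotonic() - start - interval)


async def blocking_copy(duration):
    await asyncio.sleep(0)  # let the other tasks start first
    # Blocking call that never yields: every other task stalls meanwhile.
    time.sleep(duration)


async def cooperative_copy(duration, chunk=0.01):
    # Same total duration, but control returns to the scheduler between chunks.
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        await asyncio.sleep(chunk)


async def worst_delay(copy, duration=0.2):
    latencies = []
    await asyncio.gather(probe(latencies), copy(duration))
    return max(latencies)


def measure(copy):
    """Return the worst extra delay seen by the probe while `copy` runs."""
    return asyncio.run(worst_delay(copy))
```

Comparing `measure(blocking_copy)` with `measure(cooperative_copy)` shows the probe being held up for roughly the whole copy in the first case and barely at all in the second, which mirrors the slow and fast `glance index` timings in the description.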