openstack+ceph glance image create stuck in datastore rbd mode

Bug #1340664 reported by Mh Raies
This bug affects 1 person
Affects   Status       Importance   Assigned to   Milestone
Glance    Incomplete   Undecided    Unassigned    -

Bug Description

I have a 3-node Ceph cluster plus a 1-node OpenStack deployment (legacy networking and a remote Cinder volume node).

ceph health == OK

On the OpenStack node I have configured glance-api.conf as follows:

default_store = rbd
rbd_store_ceph_conf = /etc/ceph/ceph.conf
rbd_store_user = glance
rbd_store_pool = images
rbd_store_chunk_size = 8
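
A quick way to confirm which store settings glance-api would actually pick up is to parse the file and list anything that looks off. For the RBD backend to be used, default_store would normally have to be rbd. The following is only a minimal sketch: the helper names are hypothetical, and it assumes the options live in either [DEFAULT] or [glance_store], which differs between releases.

```python
import configparser

# Store-related options mentioned in this report.
RBD_OPTIONS = (
    "default_store",
    "rbd_store_ceph_conf",
    "rbd_store_user",
    "rbd_store_pool",
    "rbd_store_chunk_size",
)


def read_store_settings(conf_text):
    """Return the store-related options found in glance-api.conf text."""
    # interpolation=None: values may contain literal '%' characters.
    # strict=False: tolerate duplicate options in hand-edited files.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    settings = {}
    for option in RBD_OPTIONS:
        # Assumption: the option is in [glance_store] or in [DEFAULT].
        for section in ("glance_store", "DEFAULT"):
            value = parser.get(section, option, fallback=None)
            if value is not None:
                settings[option] = value
                break
    return settings


def store_problems(settings):
    """List obvious reasons why the RBD store would not be used."""
    problems = []
    if settings.get("default_store") != "rbd":
        problems.append(
            "default_store is %r, not 'rbd'" % settings.get("default_store")
        )
    # Chunk size is treated as optional here; the other three are required.
    for option in RBD_OPTIONS[1:4]:
        if option not in settings:
            problems.append("%s is not set" % option)
    return problems
```

Passing the contents of glance-api.conf through read_store_settings and then store_problems returns an empty list when nothing obvious is wrong. It does not prove that the Ceph cluster is reachable.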

When I run the glance image-create command, it hangs until the timeout.

I then increased the timeout to 10 minutes and still got the same error.

After the timeout, I ran glance image-list, and it also hung for a long time.

I then restarted the glance-api service.

After that I could run glance image-list, and it showed the image I was trying to create in the "saving" state.

Tags: backend
Mh Raies (raiesmh08)
tags: added: glance
tags: added: ceph
removed: glance
Feilong Wang (flwang) wrote :

Hi Mh Raies, thanks for reporting this. I'm trying to figure out which "stuck" issue you are reporting: the image create command hanging, or the image staying in "saving" status after you restart the image API service? I think they are different issues, and I think the latter is hard to resolve with the current implementation.

Erno Kuvaja (jokke)
tags: added: backend
removed: ceph
Changed in glance:
status: New → Incomplete
Mh Raies (raiesmh08) wrote :

Hi Erno -

By "stuck" I mean: when I ran "glance image-create" with the timeout set to 10 minutes, nothing happened; the command hung until the timeout, and then a timeout message was shown.

I then wanted to check whether the image had been created, so I ran "glance image-list", which also hung until the timeout.

As I could not check the image list, I restarted the glance services.

I then ran "glance image-list" again, and the output showed the image whose creation had hung in the "saving" state.

I think this is a problem. I am not getting any errors in the logs.

William Law (wlaw) wrote :

Hello,

I think I'm seeing the same issue with glance. I think it is actually settings related; not clear yet. [I just think that as I keep on stumbling on various openstack conf files]

glance image-create --name=cirros-raw-9:40 --disk-format=raw --container-format=bare --is-public=yes < cirros-0.3.2-x86_64-disk.raw
Error communicating with http://192.168.2.26:9292 timed out

glance image-list
+--------------------------------------+---------------------+-------------+------------------+-----------+--------+
| ID                                   | Name                | Disk Format | Container Format | Size      | Status |
+--------------------------------------+---------------------+-------------+------------------+-----------+--------+
| 8fceded7-3416-4deb-9621-e47cefbc4751 | another test        | qcow2       | bare             | 13200896  | saving |
| 0aa32bcd-ee69-40f2-b832-2977b23334b4 | c2                  | raw         | bare             | 41126400  | saving |
| 6b9c87eb-21a6-40cf-b337-297ee23539ac | cirros-0.3.2-raw    | raw         | bare             | 41126400  | saving |
| 79f86fbb-ee1f-45c7-8d38-fd12855b60b4 | cirros-raw-9:40     | raw         | bare             | 41126400  | saving |
| 2ca6b476-352c-4c3b-8687-bf565479dbe3 | cirros2             | qcow2       | bare             | 13167616  | active |
| 3151ba68-c6d3-4937-8f00-57243781be93 | raw_cirros_upload   | raw         | bare             | 41126400  | saving |
| 8d618e97-3f5b-49da-a68d-fb2d4ebeccdf | trusty-server-cloud | qcow2       | bare             | 255853056 | active |
+--------------------------------------+---------------------+-------------+------------------+-----------+--------+
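
To pick out the images left in "saving" from a listing like the one above, the table text can be parsed directly. This is a rough sketch with hypothetical helper names; it assumes the column headers shown above and that no cell contains a "|" character.

```python
def parse_cli_table(text):
    """Parse an ASCII table such as the output of `glance image-list`."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def stuck_images(rows, status="saving"):
    """Return (ID, Name) pairs for images whose Status equals `status`."""
    return [(row["ID"], row["Name"]) for row in rows if row.get("Status") == status]
```

Feeding the captured listing to parse_cli_table and then to stuck_images gives the IDs of the uploads that never completed, which is handy when several attempts have piled up.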

Here's the registry.log, showing it finished:
2014-10-22 21:55:05.310 16917 INFO glance.wsgi.server [6701ed27-d16a-460c-a056-27aca91a02cd 2006cccbfa314b8e988624a6ca0a5095 7924efbe387f4b0d88f390c114f7626d - - -] 192.168.2.26 - - [22/Oct/2014 21:55:05] "GET /images/3151ba68-c6d3-4937-8f00-57243781be93 HTTP/1.1" 200 710 0.007189
2014-10-22 21:55:05.356 16917 INFO glance.registry.api.v1.images [a39d2179-daa7-4319-95b2-2b3314a8ad13 2006cccbfa314b8e988624a6ca0a5095 7924efbe387f4b0d88f390c114f7626d - - -] Successfully retrieved image 79f86fbb-ee1f-45c7-8d38-fd12855b60b4

Nothing in the api log other than lots of things like:
2014-10-22 21:55:28.524 13849 INFO glance.wsgi.server [bc5852cd-a0fe-497b-a8fc-24e879c28794 2006cccbfa314b8e988624a6ca0a5095 7924efbe387f4b0d88f390c114f7626d - - -] 192.168.2.28 - - [22/Oct/2014 21:55:28] "HEAD /v1/images/79f86fbb-ee1f-45c7-8d38-fd12855b60b4 HTTP/1.1" 200 768 0.011354

William Law (wlaw) wrote :

Figured out my issue. I was being too fancy.

The ceph storage was on a different network than the glance services; I needed to add a nic [or, god forbid, routing] to get over there. My suggestion, if you run into what I did (which might not be applicable to all instances), is to try connecting to the pool with rbd.
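
One way to run that check from the glance host without hanging a terminal is to wrap the rbd command in a timeout. This is a sketch only: the function names are hypothetical, and the pool, user, and ceph.conf path in the usage note are the values from the original report, assumed to apply along with a readable keyring for that user.

```python
import subprocess


def build_rbd_ls_command(pool, user, ceph_conf):
    """Build an `rbd ls` command line for the given pool and Ceph user."""
    return ["rbd", "--conf", ceph_conf, "--id", user, "--pool", pool, "ls"]


def run_with_timeout(command, seconds=10):
    """Run a command; return (ok, detail) instead of blocking indefinitely."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=seconds
        )
    except subprocess.TimeoutExpired:
        # A hang here suggests the host cannot reach the Ceph cluster.
        return False, "no answer within %d seconds (network path to Ceph?)" % seconds
    except FileNotFoundError:
        return False, "command not found: %s" % command[0]
    if result.returncode != 0:
        return False, result.stderr.strip()
    return True, result.stdout.strip()
```

For example, run_with_timeout(build_rbd_ls_command("images", "glance", "/etc/ceph/ceph.conf")) returning a timeout would point at the same kind of network problem described here, rather than at glance itself.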
