Deploying ISO with configuration openstack-8.0-ubuntu-vxlan-lvm-swift fails with an error

Bug #1517013 reported by Sergii Turivnyi
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Fix Released
Importance: High
Assigned to: Kairat Kushaev

Bug Description

Steps to reproduce:
1. Deploy the ISO with configuration openstack-8.0-ubuntu-vxlan-lvm-swift

Expected result:
Deployment is successful

Actual result:
Deployment fails with an error:
glanceclient.exc.HTTPInternalServerError: 500 Internal Server Error: The server has either erred or is incapable of performing the requested operation. (HTTP 500)

2015-11-17T01:07:23.083162+00:00 err: Object DELETE failed: http://192.168.0.2:8080/v1/AUTH_31a2d083bce94302b3c2229b2edbc01a/glance/530b0f4d-a14f-4803-97a7-c5d486093bd8-00017 409 Conflict [first 60 chars of response] <html><h1>Conflict</h1><p>There was a conflict when trying t
2015-11-17T01:07:23.083983+00:00 err: Failed to upload image data due to internal error
2015-11-17T01:07:23.136461+00:00 err: Caught error: Swift already has an image at this location

See the attached logs.

Revision history for this message
Sergii Turivnyi (sturivnyi) wrote :
Changed in fuel:
status: New → Confirmed
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Looks like an issue in the configuration of Swift or Glance: after deployment, users can't upload images to Glance, and Glance returns an error with a 500 status code.

MOS Glance team, could you please take a look at the issue? (We have no dedicated Swift team yet.)

Thank you!

Changed in fuel:
assignee: nobody → MOS Glance (mos-glance)
Revision history for this message
Kairat Kushaev (kkushaev) wrote :

I ran grep -nr "tx17e3067c032442818eefb-00564a7dcb" . to find the places where the request was received.
I found the following: https://paste.mirantis.net/show/1434/.
For some reason HA (I guess) sends 3 similar requests to node-1, node-2 and node-3.
When I looked at the Swift log files for node-1 and node-3, I found a lot of connection refused and socket errors:
swift-object-server on node-1:
2015-11-17T00:12:27.579547+00:00 err: rsync: failed to connect to 192.168.1.1 (192.168.1.1): Connection refused (111)
2015-11-17T00:12:27.579547+00:00 err: rsync error: error in socket IO (code 10) at clientserver.c(128) [sender=3.1.0]
2015-11-17T00:12:27.580171+00:00 err: Bad rsync return code: 10 <- ['rsync', '--recursive', '--whole-file', '--human-readable', '--xattrs', '--itemize-changes', '--ignore-existing', '--timeout=30', '--contimeout=30', '--bwlimit=0', '/var/lib/glance/node/2/objects/799/8b1', '192.168.1.1::object/2/objects/799']

So it seems we can't find the cause without Swift experts.

Changed in fuel:
assignee: MOS Glance (mos-glance) → MOS Swift (mos-swift)
Dmitry Pyzhov (dpyzhov)
tags: added: area-mos
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :
Igor Marnat (imarnat)
Changed in fuel:
assignee: MOS Swift (mos-swift) → Alyona Kiseleva (akiselyova)
Revision history for this message
Alyona Kiseleva (akiselyova) wrote :

This error occurs because the delete operation is performed immediately after the put operation and a race takes place, so this is not a regular bug. It's a known problem, and Swift has no fix for it.
Possible solutions are a short delay between the put and delete operations, or retrying the delete command several times.
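
The second suggestion (retrying the delete) could look roughly like the sketch below. This is only an illustration under assumptions, not Glance or Swift code: ConflictError, delete_with_retry and the delete_fn callable are hypothetical names standing in for whatever the Swift client raises on 409 Conflict and for the real delete call.

```python
import time


class ConflictError(Exception):
    """Hypothetical stand-in for a 409 Conflict raised by a Swift client."""


def delete_with_retry(delete_fn, attempts=3, delay=0.5, sleep=time.sleep):
    """Call delete_fn until it succeeds or the attempts are exhausted.

    Only ConflictError is retried. The wait between tries starts at
    `delay` seconds and doubles after every failed attempt. The last
    failure is re-raised so the caller still sees the error.
    """
    for attempt in range(1, attempts + 1):
        try:
            return delete_fn()
        except ConflictError:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2
```

The sleep parameter is injectable only so the behaviour can be exercised without real waiting; a fixed delay before the first delete would be the other option mentioned above.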

Revision history for this message
Alyona Kiseleva (akiselyova) wrote :

* I can't implement either solution myself; I assume it should be fixed in Glance.

Changed in fuel:
assignee: Alyona Kiseleva (akiselyova) → nobody
Ilya Kutukov (ikutukov)
Changed in fuel:
assignee: nobody → MOS Glance (mos-glance)
Revision history for this message
Kairat Kushaev (kkushaev) wrote :

A fix is available at https://review.openstack.org/#/c/254873.
The race condition happens when we delete the final zero-size chunk while uploading an image.
@Alyona, is this the race condition you mentioned? Could you please check whether the problem is with zero-size chunks in Swift?

Changed in fuel:
assignee: MOS Glance (mos-glance) → Kairat Kushaev (kkushaev)
Revision history for this message
Kairat Kushaev (kkushaev) wrote :

I have found the chunk in glance:
2015-11-17T01:07:23.034722+00:00 debug: Wrote chunk 530b0f4d-a14f-4803-97a7-c5d486093bd8-00016 (16/?) of length 192806912 to Swift returning MD5 of content: 9bf81ce79c7e0565c6e0026a3f2a633d
2015-11-17T01:07:23.060612+00:00 warning: Connection pool is full, discarding connection: 192.168.0.2
2015-11-17T01:07:23.060977+00:00 debug: REQ: curl -i http://192.168.0.2:8080/v1/AUTH_31a2d083bce94302b3c2229b2edbc01a/glance/530b0f4d-a14f-4803-97a7-c5d486093bd8-00017 -X PUT -H "X-Auth-Token: 8964e4993856436cbc8a81df43d18045"
2015-11-17T01:07:23.061757+00:00 debug: RESP STATUS: 201 Created
2015-11-17T01:07:23.061757+00:00 debug: RESP HEADERS: [('content-length', '0'), ('last-modified', 'Tue, 17 Nov 2015 01:07:24 GMT'), ('connection', 'close'), ('etag', 'd41d8cd98f00b204e9800998ecf8427e'), ('x-trans-id', 'txc38f8f3dd1cb4ee4a09d7-00564a7dcb'), ('date', 'Tue, 17 Nov 2015 01:07:23 GMT'), ('content-type', 'text/html; charset=UTF-8')]
2015-11-17T01:07:23.061757+00:00 debug: Wrote chunk 530b0f4d-a14f-4803-97a7-c5d486093bd8-00017 (17/?) of length 0 to Swift returning MD5 of content: d41d8cd98f00b204e9800998ecf8427e
2015-11-17T01:07:23.062247+00:00 debug: Deleting final zero-length chunk
2015-11-17T01:07:23.082819+00:00 info: REQ: curl -i http://192.168.0.2:8080/v1/AUTH_31a2d083bce94302b3c2229b2edbc01a/glance/530b0f4d-a14f-4803-97a7-c5d486093bd8-00017 DELETE -H "X-Auth-Token: 8964e4993856436cbc8a81df43d18045"
2015-11-17T01:07:23.082819+00:00 info: RESP STATUS: 409 Conflict
2015-11-17T01:07:23.082819+00:00 info: RESP HEADERS: [('date', 'Tue, 17 Nov 2015 01:07:23 GMT'), ('content-length', '95'), ('content-type', 'text/html; charset=UTF-8'), ('connection', 'close'), ('x-trans-id', 'tx17e3067c032442818eefb-00564a7dcb')]
2015-11-17T01:07:23.083162+00:00 info: RESP BODY: <html><h1>Conflict</h1><p>There was a conflict when trying to complete your request.</p></html>
2015-11-17T01:07:23.083162+00:00 err: Object DELETE failed: http://192.168.0.2:8080/v1/AUTH_31a2d083bce94302b3c2229b2edbc01a/glance/530b0f4d-a14f-4803-97a7-c5d486093bd8-00017 409 Conflict [first 60 chars of response] <html><h1>Conflict</h1><p>There was a conflict when trying t
2015-11-17T01:07:23.083983+00:00 err: Failed to upload image data due to internal error
2015-11-17T01:07:23.136461+00:00 err: Caught error: Swift already has an image at this location

The trouble is in the final zero-length chunks. I will try to push the upstream folks on this.
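
To illustrate the pattern in the log above (a zero-length chunk is PUT and then immediately DELETEd), here is a minimal sketch of a chunking loop that never writes an empty chunk in the first place. It is not the merged fix and not Glance code: iter_chunks, upload_segments, put_fn and prefix are hypothetical names, and the segment naming merely imitates the "-00017" suffix seen in the log.

```python
import hashlib

# MD5 of empty content; this matches the etag returned for chunk -00017.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def iter_chunks(stream, chunk_size):
    """Yield non-empty chunks of at most chunk_size bytes from stream.

    Iteration stops at end of stream without yielding an empty chunk,
    so the caller never writes (and then deletes) a zero-length segment.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


def upload_segments(stream, chunk_size, put_fn, prefix):
    """Send each chunk through put_fn(name, data); return the names used."""
    names = []
    for index, chunk in enumerate(iter_chunks(stream, chunk_size), start=1):
        name = "%s-%05d" % (prefix, index)
        put_fn(name, chunk)
        names.append(name)
    return names
```

With this shape there is no DELETE right after a PUT on the same object, so the race described earlier in the thread would not be triggered by the upload path itself.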

Revision history for this message
Kairat Kushaev (kkushaev) wrote :

The fix has been merged upstream and backported to Liberty.

Changed in fuel:
status: Confirmed → In Progress
Revision history for this message
Kairat Kushaev (kkushaev) wrote :
Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
Alexey Galkin (agalkin) wrote :

Verified on MOS 8.0 ISO #462

Changed in fuel:
status: Fix Committed → Fix Released