CI: IOError in glance when uploading an image (swift backend)

Bug #1646750 reported by Gabriele Cerami
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
tripleo
Won't Fix
High
Unassigned

Bug Description

overcloud logs at http://logs.openstack.org/periodic/periodic-tripleo-ci-centos-7-ovb-ha/19aa208/logs/overcloud-controller-2/var/log/glance/api.txt.gz#_2016-12-02_07_56_28_327

reports

2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store [req-660f6647-f809-4db1-b1c8-2b331d2d96e8 b24e2bb472624e24a7257fc8fb8fcb13 5efc72dcf3a547abb639d6aa959e24f5 - default default] Error during chunked upload to backend, deleting stale chunks
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store Traceback (most recent call last):
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/glance_store/_drivers/swift/store.py", line 900, in add
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store content_length=content_length)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1824, in put_object
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store response_dict=response_dict)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1673, in _retry
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store service_token=self.service_token, **kwargs)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1302, in put_object
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store conn.putrequest(path, headers=headers, data=data)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 446, in putrequest
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store return self.request('PUT', full_path, data, headers, files)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 436, in request
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store files=files, **self.requests_args)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 419, in _request
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store return self.request_session.request(*arg, **kwarg)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 475, in request
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store resp = self.send(prep, **send_kwargs)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 585, in send
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store r = adapter.send(request, **kwargs)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 423, in send
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store for i in request.body:
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/swiftclient/utils.py", line 231, in next
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store return self.__next__()
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/swiftclient/utils.py", line 219, in __next__
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store chunk = self.content.read(self.chunk_size)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/glance_store/_drivers/swift/store.py", line 1489, in read
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store result = self.do_read(i)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/glance_store/_drivers/swift/store.py", line 1482, in do_read
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store return self.fd.read(i)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/glance/common/utils.py", line 241, in read
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store result = self.data.read(i)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/glance/common/utils.py", line 121, in readfn
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store result = fd.read(*args)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/eventlet/wsgi.py", line 194, in read
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store return self._chunked_read(self.rfile, length)
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store File "/usr/lib/python2.7/site-packages/eventlet/wsgi.py", line 165, in _chunked_read
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store raise IOError("unexpected end of file while parsing chunked data")
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store IOError: unexpected end of file while parsing chunked data
2016-12-02 07:56:28.327 50490 ERROR glance_store._drivers.swift.store

Not sure the problem is in glance or swift, but this could also just show a network connectivity problem of some sort

Tags: ci
Changed in tripleo:
milestone: ocata-2 → ocata-3
Revision history for this message
Emilien Macchi (emilienm) wrote :

Gabriele, could you please give an update on this bug? Can we close it?

tags: removed: promotion-blocker
Revision history for this message
Gabriele Cerami (gcerami) wrote :

Bug disappeared. not sure what fixed it

Changed in tripleo:
status: Triaged → Fix Released
Changed in tripleo:
status: Fix Released → Triaged
tags: added: alert
Revision history for this message
Emilien Macchi (emilienm) wrote :
Changed in tripleo:
assignee: nobody → Emilien Macchi (emilienm)
Revision history for this message
Emilien Macchi (emilienm) wrote :

"timburke [16:29:10] EmilienM: basically, the swift log was a red herring; the proxy handled everything just fine. the object server didn't handle the proxy's disconnect gracefully, but that didn't impact glance at all. the issue is still the IOError in glance's logs, and that seems to be caused by an unexpected client disconnect"

summary: - CI: periodic jobs fail to upload test image to overcloud glance
+ CI: IOError in glance when uploading an image (swift backend)
Changed in tripleo:
assignee: Emilien Macchi (emilienm) → nobody
Revision history for this message
Alex Schultz (alex-schultz) wrote :

So I thought I had recognized this type of issue which brought me to Bug 1375596 which seemed to indicate the default timeout we have for the swift node_timeout settings (60) may not be sufficient. We may need to increase this to 90 by default to ensure this doesn't happen. Since in the logs, the upload start to failure time is about 63 seconds I'm assuming this is what we're hitting.

Revision history for this message
Alex Schultz (alex-schultz) wrote :

sorry my math is a bit off on the timing but the error is still consistent with the previously mentioned bug. I'm attempting to poke around a bit more but is this happening consistently or it is still random?

Revision history for this message
Alex Schultz (alex-schultz) wrote :

Of course from the logs there is:

Jan 5 17:30:11 host-192-168-24-11 proxy-server: Client disconnected without sending last chunk (txn: tx4d0f0eb736a74d75901d8-00586e82a3) (client_ip: 172.18.0.14)

Revision history for this message
Alex Schultz (alex-schultz) wrote :

This seems like some sort of timeout issue probably in the haproxy config between the various services.

Revision history for this message
Emilien Macchi (emilienm) wrote :
tags: removed: alert
Changed in tripleo:
importance: Critical → High
Changed in tripleo:
milestone: ocata-3 → ocata-rc1
Revision history for this message
Ben Nemec (bnemec) wrote :

Given how rarely this happens, I don't think it's critical that we fix it for ocata. Re-targeting to ongoing.

Changed in tripleo:
milestone: ocata-rc1 → ongoing
Revision history for this message
Alex Schultz (alex-schultz) wrote :

According to logstash this hasn't happened in >30 days in CI. So i'm going to close this bug out for now

Changed in tripleo:
status: Triaged → Won't Fix
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.