Glance-api prematurely terminates client connections

Bug #1592140 reported by Eugene Nikanorov
Affects              Status    Importance   Assigned to   Milestone
Mirantis OpenStack   Invalid   Medium       MOS Glance
5.1.x                Invalid   Medium       MOS Glance

Bug Description

Sometimes, when glance-api workers consume a lot of memory, client connections may be terminated prematurely.
When glance-api is accessed via haproxy, this results in 502 errors being returned to the client.

The only noticeable condition is that the glance-api processes had been running for about half a year and memory consumption had reached ~10 GB per worker.

It may be that eventlet receives some sort of system error while trying to accept an incoming connection.
Failed requests never reach the glance code itself, so they are never logged.
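
One way to check this theory is to probe a glance-api worker directly, bypassing haproxy, and see whether the connection is dropped before any HTTP response is written. A minimal sketch, not part of glance; the host and port are assumptions about the deployment:

import socket

GLANCE_HOST = "127.0.0.1"   # assumption: glance-api reachable locally
GLANCE_PORT = 9292          # default glance-api port

request = (
    "GET /versions HTTP/1.1\r\n"
    "Host: " + GLANCE_HOST + "\r\n"
    "Connection: close\r\n\r\n"
).encode()

try:
    sock = socket.create_connection((GLANCE_HOST, GLANCE_PORT), timeout=5)
    sock.sendall(request)
    data = sock.recv(4096)
    if not data:
        print("connection closed before any response was written")
    else:
        print("got response:", data.split(b"\r\n", 1)[0].decode())
except OSError as exc:
    # Covers reset, refused and timeout: the failure happened at the socket
    # level, before any application-level handling or logging.
    print("request failed at the socket level:", exc)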

Revision history for this message
Eugene Nikanorov (enikanorov) wrote :

The only version where this has been observed is 5.1.

Revision history for this message
Bug Checker Bot (bug-checker) wrote : Autochecker

(This check performed automatically)
Please, make sure that bug description contains the following sections filled in with the appropriate data related to the bug you are describing:

actual result

version

expected result

steps to reproduce

For more detailed information on the contents of each of the listed sections see https://wiki.openstack.org/wiki/Fuel/How_to_contribute#Here_is_how_you_file_a_bug

tags: added: need-info
Revision history for this message
Kairat Kushaev (kkushaev) wrote :

There is a workaround for this bug, and it also seems very hard to reproduce.
Because of that I will mark it as Medium; hopefully we will get some time to dig into this soon.

Changed in mos:
importance: Undecided → Medium
assignee: nobody → MOS Glance (mos-glance)
status: New → Confirmed
Revision history for this message
Dina Belova (dbelova) wrote :

Targeting to 5.1.1-updates. This is a medium bug, so it is not going to be fixed, but the Glance team would like to check it anyway.

Changed in mos:
milestone: none → 5.0-updates
milestone: 5.0-updates → 5.1.1-updates
Revision history for this message
Mike Fedosin (mfedosin) wrote :

Hello! This bug was fixed during the Liberty cycle and backported to Kilo. The idea is to introduce a client socket timeout, because without it connections can stay open forever: https://review.openstack.org/#/c/119132/
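
For reference, the mechanism behind that fix is the socket timeout support in eventlet's WSGI server, which closes idle client connections instead of letting them live forever. A minimal standalone sketch, not glance's actual server setup; the port and timeout value are arbitrary examples:

import eventlet
from eventlet import wsgi

def app(environ, start_response):
    # Trivial WSGI app standing in for the real glance-api application.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]

listener = eventlet.listen(("127.0.0.1", 8080))
# socket_timeout bounds how long an idle client connection may stay open;
# without it a stalled client can hold the connection indefinitely.
wsgi.server(listener, app, socket_timeout=900)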

description: updated
Revision history for this message
Eugene Nikanorov (enikanorov) wrote :

I think the mentioned fix is unrelated to the problem I'm describing.
The faulty glance worker did not have any hanging connections, in any state.
What is known for sure is that the glance workers consumed ~10 GB of memory each, which led to memory exhaustion and could lead to a system error at the eventlet level.

Such errors, as well as the incoming requests themselves, were never logged in the glance logs, so this is only a theory.

So I'd say the first thing we need to reproduce, or find a corresponding bug about, is the memory leak.
10 GB is too much for a stateless service.
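
To make the leak measurable before trying to reproduce it, one option is to record per-worker RSS over time. A rough sketch using psutil; the process-name match is an assumption about how the workers appear in the process table:

import psutil

for proc in psutil.process_iter(["pid", "cmdline", "memory_info"]):
    cmdline = " ".join(proc.info["cmdline"] or [])
    if "glance-api" in cmdline:
        # Resident set size per worker, in GiB.
        rss_gb = proc.info["memory_info"].rss / (1024 ** 3)
        print("pid=%d rss=%.2f GiB" % (proc.info["pid"], rss_gb))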

Revision history for this message
Roman Rufanov (rrufanov) wrote :

Could it affect versions later than 5.1? If yes, please nominate.

tags: added: customer-found support
Revision history for this message
Kairat Kushaev (kkushaev) wrote :

Invalid because the root cause was not Glance.
According to the explanation from Eugene Nikanorov:

it was all qemu... There are too many OSDs in their cloud. When large files are read or written, qemu opens connections to pretty much every OSD, exhausting file descriptors, because the default limit is low.
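
One way to confirm that root cause on a compute node is to compare each qemu process's open file descriptor count against its nofile limit. A hedged sketch using psutil on Linux; the process-name match is an assumption:

import psutil

for proc in psutil.process_iter(["pid", "name"]):
    name = proc.info["name"] or ""
    if "qemu" in name:
        try:
            nfds = proc.num_fds()                            # open descriptors
            soft, hard = proc.rlimit(psutil.RLIMIT_NOFILE)   # per-process limit
            print("pid=%d fds=%d soft_limit=%d" % (proc.info["pid"], nfds, soft))
        except (psutil.AccessDenied, psutil.NoSuchProcess):
            pass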

Changed in mos:
status: Confirmed → Invalid