glanceclient.exc.HTTPBadGateway: 502 Proxy Error during server snapshot - infinite loop in File "/opt/stack/new/glance/glance/quota/__init__.py", line 168 in __getattr__

Bug #1808063 reported by Matt Riedemann
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Glance
Fix Released
High
Chris Dent

Bug Description

Seen here:

http://logs.openstack.org/56/621856/10/check/tempest-full-py3/85b3f05/controller/logs/screen-n-api.txt.gz#_Dec_11_19_57_35_024101

Dec 11 19:57:35.024101 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi [None req-76b7b44d-b425-46f0-84c9-307f3fcf9795 tempest-ImagesOneServerNegativeTestJSON-1724249280 tempest-ImagesOneServerNegativeTestJSON-1724249280] Unexpected exception in API method: glanceclient.exc.HTTPBadGateway: 502 Proxy Error: Proxy Error: The proxy server received an invalid: response from an upstream server.: The proxy server could not handle the request POST&nbsp;/image/v2/images.: Reason: Error reading from remote server: Apache/2.4.18 (Ubuntu) Server at 10.209.33.36 Port 80 (HTTP 502)
Dec 11 19:57:35.024709 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi Traceback (most recent call last):
Dec 11 19:57:35.024947 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/api/openstack/wsgi.py", line 655, in wrapped
Dec 11 19:57:35.025231 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi return f(*args, **kwargs)
Dec 11 19:57:35.025994 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/api/validation/__init__.py", line 110, in wrapper
Dec 11 19:57:35.026254 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi return func(*args, **kwargs)
Dec 11 19:57:35.026499 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/api/validation/__init__.py", line 110, in wrapper
Dec 11 19:57:35.026754 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi return func(*args, **kwargs)
Dec 11 19:57:35.027784 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/api/openstack/compute/servers.py", line 1059, in _action_create_image
Dec 11 19:57:35.028062 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi extra_properties=metadata)
Dec 11 19:57:35.028299 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/compute/api.py", line 213, in _wrapped
Dec 11 19:57:35.028530 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi return fn(self, context, instance, *args, **kwargs)
Dec 11 19:57:35.028855 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/compute/api.py", line 153, in inner
Dec 11 19:57:35.029094 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi return f(self, context, instance, *args, **kw)
Dec 11 19:57:35.029366 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/compute/api.py", line 2959, in snapshot
Dec 11 19:57:35.029754 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi extra_properties=extra_properties)
Dec 11 19:57:35.029995 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/compute/api.py", line 3032, in _create_image
Dec 11 19:57:35.030244 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi return self.image_api.create(context, image_meta)
Dec 11 19:57:35.030485 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/image/api.py", line 118, in create
Dec 11 19:57:35.030716 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi return session.create(context, image_info, data=data)
Dec 11 19:57:35.030966 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/image/glance.py", line 477, in create
Dec 11 19:57:35.031200 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi _reraise_translated_exception()
Dec 11 19:57:35.031546 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/image/glance.py", line 915, in _reraise_translated_exception
Dec 11 19:57:35.031899 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi six.reraise(type(new_exc), new_exc, exc_trace)
Dec 11 19:57:35.032269 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.5/dist-packages/six.py", line 693, in reraise
Dec 11 19:57:35.032612 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi raise value
Dec 11 19:57:35.032977 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/image/glance.py", line 475, in create
Dec 11 19:57:35.033337 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi data, force_activate)
Dec 11 19:57:35.033727 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/image/glance.py", line 552, in _create_v2
Dec 11 19:57:35.034080 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi context, 2, 'create', kwargs=sent_service_image_meta)
Dec 11 19:57:35.034424 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/opt/stack/nova/nova/image/glance.py", line 193, in call
Dec 11 19:57:35.034876 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi result = getattr(controller, method)(*args, **kwargs)
Dec 11 19:57:35.035320 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.5/dist-packages/glanceclient/common/utils.py", line 598, in inner
Dec 11 19:57:35.035693 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi return RequestIdProxy(wrapped(*args, **kwargs))
Dec 11 19:57:35.039116 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.5/dist-packages/glanceclient/v2/images.py", line 361, in create
Dec 11 19:57:35.039411 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi resp, body = self.http_client.post(url, headers=headers, data=image)
Dec 11 19:57:35.039662 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.5/dist-packages/keystoneauth1/adapter.py", line 357, in post
Dec 11 19:57:35.039899 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi return self.request(url, 'POST', **kwargs)
Dec 11 19:57:35.040164 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.5/dist-packages/glanceclient/common/http.py", line 377, in request
Dec 11 19:57:35.040400 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi return self._handle_response(resp)
Dec 11 19:57:35.040632 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.5/dist-packages/glanceclient/common/http.py", line 126, in _handle_response
Dec 11 19:57:35.040876 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi raise exc.from_response(resp, resp.content)
Dec 11 19:57:35.041177 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi glanceclient.exc.HTTPBadGateway: 502 Proxy Error: Proxy Error: The proxy server received an invalid: response from an upstream server.: The proxy server could not handle the request POST&nbsp;/image/v2/images.: Reason: Error reading from remote server: Apache/2.4.18 (Ubuntu) Server at 10.209.33.36 Port 80 (HTTP 502)
Dec 11 19:57:35.041412 ubuntu-xenial-rax-dfw-0001132076 <email address hidden>[13547]: ERROR nova.api.openstack.wsgi

This bug is for e-r tracking.

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Unexpected%20exception%20in%20API%20method%3A%20glanceclient.exc.HTTPBadGateway%3A%20502%20Proxy%20Error%3A%20Proxy%20Error%3A%20The%20proxy%20server%20received%20an%20invalid%3A%20response%20from%20an%20upstream%20server.%3A%20The%20proxy%20server%20could%20not%20handle%20the%20request%20POST%26nbsp%3B%2Fimage%2Fv2%2Fimages.%3A%20Reason%3A%20Error%20reading%20from%20remote%20server%3A%5C%22%20AND%20tags%3A%5C%22screen-n-api.txt%5C%22&from=7d

Matt Riedemann (mriedem)
Changed in openstack-gate:
status: New → Confirmed
Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :
Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
Chris Dent (cdent) wrote :

For the sake of more data, here are some links into a recent failure of this type. I don't have ideas, but if the links are useful...

milliseconds before the message is produced from the temptest test[1] the g-api service has a python stackoverflow [2], somewhere in the midst of processing the PUT request [3] from the test.

This appears to be happening while quota is being checked. It is not clear from the logs if this is for the same request or not. And somehow oslo_messaging is involved.

[1] http://logs.openstack.org/41/617941/31/gate/grenade-py3/7909f30/job-output.txt.gz#_2018-12-13_14_43_21_472048
[2] http://logs.openstack.org/41/617941/31/gate/grenade-py3/7909f30/logs/screen-g-api.txt.gz#_Dec_13_14_43_21_413081
[3] http://logs.openstack.org/41/617941/31/gate/grenade-py3/7909f30/logs/screen-g-api.txt.gz#_Dec_13_14_43_21_042332

Revision history for this message
Chris Dent (cdent) wrote :

Another link: a while past the overflow, uwsgi notices a dead worker and restarts it [1]. This means that the uwsgi work that died died midflight, thus an incomplete response and the message "Te proxy server received an invalid\\r\\nresponse from an upstream server"

The thing about content-type is misleading, the tempest lib isn't checking for a valid response before checking for content-type.

[1] http://logs.openstack.org/41/617941/31/gate/grenade-py3/7909f30/logs/screen-g-api.txt.gz#_Dec_13_14_43_21_901912

Revision history for this message
Matt Riedemann (mriedem) wrote :
Download full text (4.0 KiB)

There is definitely some kind of infinite loop here:

Dec 13 14:43:21.412172 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: /opt/stack/new/glance/glance/quota/__init__.py:168: ResourceWarning: unclosed <socket.socket fd=14, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('104.130.211.192', 52252), raddr=('104.130.211.192', 8080)>
Dec 13 14:43:21.412563 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: return getattr(self.tags, name)
Dec 13 14:43:21.413081 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: Fatal Python error: Cannot recover from stack overflow.
Dec 13 14:43:21.413440 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: Thread 0x00007f4ff7ac8700 (most recent call first):
Dec 13 14:43:21.413817 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/usr/lib/python3.5/threading.py", line 297 in wait
Dec 13 14:43:21.414179 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/usr/lib/python3.5/threading.py", line 549 in wait
Dec 13 14:43:21.414511 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/usr/local/lib/python3.5/dist-packages/oslo_messaging/_drivers/impl_rabbit.py", line 926 in _heartbeat_thread_job
Dec 13 14:43:21.414869 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/usr/lib/python3.5/threading.py", line 862 in run
Dec 13 14:43:21.415584 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/usr/lib/python3.5/threading.py", line 914 in _bootstrap_inner
Dec 13 14:43:21.415951 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/usr/lib/python3.5/threading.py", line 882 in _bootstrap
Dec 13 14:43:21.416261 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: Thread 0x00007f4ff8349700 (most recent call first):
Dec 13 14:43:21.416622 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/usr/lib/python3.5/threading.py", line 297 in wait
Dec 13 14:43:21.416995 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/usr/lib/python3.5/threading.py", line 549 in wait
Dec 13 14:43:21.417357 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/usr/local/lib/python3.5/dist-packages/oslo_messaging/_drivers/impl_rabbit.py", line 926 in _heartbeat_thread_job
Dec 13 14:43:21.417721 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/usr/lib/python3.5/threading.py", line 862 in run
Dec 13 14:43:21.418085 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/usr/lib/python3.5/threading.py", line 914 in _bootstrap_inner
Dec 13 14:43:21.418450 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/usr/lib/python3.5/threading.py", line 882 in _bootstrap
Dec 13 14:43:21.419113 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: Current thread 0x00007f50070af780 (most recent call first):
Dec 13 14:43:21.419499 ubuntu-xenial-rax-ord-0001176665 <email address hidden>[6990]: File "/opt/stack/new/glance/glance/quota/__init__.py", line 168 in __getattr__
Dec 13 14:43:21.419875 ubuntu-xenial-rax-ord-0001176665 <email address hidden>...

Read more...

Changed in glance:
importance: Undecided → High
status: New → Confirmed
summary: glanceclient.exc.HTTPBadGateway: 502 Proxy Error during server snapshot
+ - infinite loop in File
+ "/opt/stack/new/glance/glance/quota/__init__.py", line 168 in
+ __getattr__
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to glance (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/625086

Revision history for this message
Matt Riedemann (mriedem) wrote :

jokke figured out that there is a connection between this change in oslo.policy 1.43.1:

https://github.com/openstack/oslo.policy/commit/b9fd10e2612f26c93d49c168a0408aba6d20e5bf

Which does a deepcopy on a target, which might be one of these deeply nested proxy objects in glance, and that could be triggering the infinite recursion. I'm able to recreate functional-py35 failures every time and when I go back to oslo.policy 1.43.0 the functional-py35 job works again.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/625114

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to glance (master)

Reviewed: https://review.openstack.org/625086
Committed: https://git.openstack.org/cgit/openstack/glance/commit/?id=044d6b45f98238f4a5745922ef93223d05771d4c
Submitter: Zuul
Branch: master

commit 044d6b45f98238f4a5745922ef93223d05771d4c
Author: Chris Dent <email address hidden>
Date: Thu Dec 13 18:59:39 2018 +0000

    Guard __getattr__ on QuotaImageTagsProxy

    In the related bug (below) we can see getting tags from a
    QuotaImageTagsProxy can sometimes cause an infinite loop, presumably
    because of bizarre threading circumstances. To at least provide a bit
    more information when this bizarre circumstance happens, instead
    of looping, resulting in a stackoverflow and the processing being
    terminated without a traceback, now raise a TypeError. This should
    help figure out what may be going on in the rare times this happens.

    Also, it is generally a healthy idea to always have a guard like this
    in a __getattr__ that uses an attr.

    Change-Id: Ib4ad119a838d5b06372d9302dc6608da1772eed6
    Related-Bug: #1808063

Revision history for this message
Matt Riedemann (mriedem) wrote :

I think we can probably at least close the gate bug for this since it looks like Chris' fix fixed the infinite recursion problem:

http://status.openstack.org/elastic-recheck/#1807518

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/625114
Committed: https://git.openstack.org/cgit/openstack/glance/commit/?id=deffb09871c9460497929c57bd82ca78d1c3177f
Submitter: Zuul
Branch: master

commit deffb09871c9460497929c57bd82ca78d1c3177f
Author: Chris Dent <email address hidden>
Date: Thu Dec 13 20:56:39 2018 +0000

    Make QuotaImageTagsProxy deep-copyable

    Infinite recursion is possible if you try to deepcopy a
    QuotaImageTagsProxy object, which is something that happens when using
    oslo.policy 1.43.1 and beyond.

    By adding a __gettattribute__ within the __getattr__ method we can break
    the loop when the name of the attribute is 'tags'.

    The added test will fail (with infinite recursion) if the new code is
    removed.

    For completeness a test is added checking what happens when the tags
    attribute is removed.

    The thing that is not immediately clear from the __getattr__ code is
    that it is not for getting at the tags themselves, it is for getting
    at the methods on the set object at self.tags. The non-obviousness
    of this led to rather a lot of confusion while trying to get this
    code right. It isn't really "right" now, but rather: safe in the
    context of its use.

    Change-Id: Ic47c9bf8e9b97de5a5a49a35f631753c54e0e2af
    Related-Bug: #1808063

Matt Riedemann (mriedem)
Changed in glance:
status: Confirmed → Fix Released
no longer affects: openstack-gate
Changed in glance:
assignee: nobody → Chris Dent (cdent)
Erno Kuvaja (jokke)
Changed in glance:
milestone: none → stein-1
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.