python3 + Fedora + SSL + nova compute RecursionError: maximum recursion depth exceeded while calling a Python object

Bug #1808975 reported by yatin
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Undecided
Matthew Booth

Bug Description

Description:- While Testing python3 Fedora deployment for nova in [1] got below Recursion Error in nova-compute:-

2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager [req-f908a9e0-e77a-4d35-9266-fc5e8d79dfde - - - - -] Error updating resources for node rdo-fedora-stable-rdo-cloud-0000358855.: RecursionError: maximum recursion depth exceeded while calling a Python object
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager Traceback (most recent call last):
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 7690, in _update_available_resource_for_node
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager rt.update_available_resource(context, nodename, startup=startup)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 738, in update_available_resource
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager self._update_available_resource(context, resources, startup=startup)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", line 328, in inner
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager return f(*args, **kwargs)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 794, in _update_available_resource
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager instance_by_uuid)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 1256, in _remove_deleted_instances_allocations
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager context, cn.uuid)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/scheduler/client/report.py", line 2165, in get_allocations_for_resource_provider
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager resp = self.get(url, global_request_id=context.global_id)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/scheduler/client/report.py", line 297, in get
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager return self._client.get(url, microversion=version, headers=headers)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/adapter.py", line 351, in get
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager return self.request(url, 'GET', **kwargs)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/adapter.py", line 213, in request
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager return self.session.request(url, method, **kwargs)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 684, in request
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager auth_headers = self.get_auth_headers(auth)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1071, in get_auth_headers
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager return auth.get_headers(self, **kwargs)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/plugin.py", line 95, in get_headers
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager token = self.get_token(session)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 88, in get_token
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager return self.get_access(session).auth_token
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 134, in get_access
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager self.auth_ref = self.get_auth_ref(session)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py", line 206, in get_auth_ref
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager self._plugin = self._do_create_plugin(session)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py", line 138, in _do_create_plugin
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager authenticated=False)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 610, in get_discovery
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager authenticated=authenticated)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/discover.py", line 1436, in get_discovery
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager disc = Discover(session, url, authenticated=authenticated)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/discover.py", line 520, in __init__
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager authenticated=authenticated)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/discover.py", line 101, in get_version_data
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager resp = session.get(url, headers=headers, authenticated=authenticated)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1011, in get
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager return self.request(url, 'GET', **kwargs)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 814, in request
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager resp = send(**kwargs)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 903, in _send_request
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager resp = self.session.request(method, url, **kwargs)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 524, in request
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager resp = self.send(prep, **send_kwargs)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 637, in send
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager r = adapter.send(request, **kwargs)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager timeout=timeout
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager chunked=chunked)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 343, in _make_request
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager self._validate_conn(conn)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 839, in _validate_conn
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager conn.connect()
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 332, in connect
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager cert_reqs=resolve_cert_reqs(self.cert_reqs),
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 281, in create_urllib3_context
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager context.options |= options
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib64/python3.6/ssl.py", line 423, in options
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager super(SSLContext, SSLContext).options.__set__(self, value)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib64/python3.6/ssl.py", line 423, in options
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager super(SSLContext, SSLContext).options.__set__(self, value)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager File "/usr/lib64/python3.6/ssl.py", line 423, in options
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager super(SSLContext, SSLContext).options.__set__(self, value)
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager [Previous line repeated 294 more times]
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager RecursionError: maximum recursion depth exceeded while calling a Python object
2018-12-18 08:00:05.266 2428 ERROR nova.compute.manager

Don't know what's causing it but it somehow related to the eventlet monkey patching done in nova. I filed seperated bug(also filed https://bugs.launchpad.net/nova/+bug/1808951 for nova-api running under wsgi) for nova-compute as issue here seems different(as nova-compute is not running under wsgi).

Logs from job: https://logs.rdoproject.org/61/620561/12/openstack-check-rdo/puppet-openstack-integration-5-scenario-py3-tempest-fedora-28/74a9000/logs/nova/nova-compute.txt.gz

Expected Result:-
nova compute should work correctly under python3

Actual Result:-
nova-compute fails

[1] https://review.openstack.org/#/c/620561

Revision history for this message
Matt Riedemann (mriedem) wrote :

Which release of nova is this? Because this might be related:

https://github.com/openstack/nova/commit/23ba1c690652832c655d57476630f02c268c87ae

Revision history for this message
Matt Riedemann (mriedem) wrote :
tags: added: eventlet python3
Revision history for this message
yatin (yatinkarel) wrote :

<<<Oh I guess this isn't a specific version, it's master:

<<<http://logs.rdoproject.org/61/620561/12/openstack-check-rdo/puppet-openstack-integration-5-scenario-py3-tempest-fedora-28/71918d4/logs/nova/nova-compute.txt.gz

Yes this is master

<<<What happens if you comment this out in a dependent nova change?

<<<https://github.com/openstack/nova/blob/master/nova/cmd/__init__.py#L18

You mean starting nova compute after comenting that? If yes then nova-compute stuck in Starting state and from logs: It waits for nova-conductor. As per nova service-list and nova-conductor log, nova-conductor is up.

Revision history for this message
Lee Yarwood (lyarwood) wrote :

https://github.com/eventlet/eventlet/issues/371

For n-cpu moving the eventlet.monkey_patch(os=False) call back into nova/cmd/__init__.py works around this issue for me locally.

I've not reproduced the issue with n-api in #1808951 yet but I assume moving the call ahead of the other imports in nova/api/openstack/wsgi_app.py should also do the trick.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/626952

Changed in nova:
assignee: nobody → Matthew Booth (mbooth-9)
status: New → In Progress
Revision history for this message
yatin (yatinkarel) wrote :

<<< Review: https://review.openstack.org/626952

The issue is still reproduced with ^^ patch, check job logs on poi test patch:- https://review.openstack.org/#/c/620561/

To reproduce locally run following commands on a Fedora 28 VM:-

sudo sed -i 's/enabled=1/enabled=0/' /etc/yum.repos.d/*.repo
sudo tee -a /etc/yum.repos.d/delorean-deps.repo <<EOF
[fedora-stable]
name=fedora
baseurl=https://trunk.rdoproject.org/fedora/stable-base/latest
gpgcheck=0
enabled=1
EOF

sudo dnf -y update
sudo dnf -y install git python-setuptools python-devel libffi-devel libselinux-python openssl-devel yum-utils createrepo gettext diffstat doxygen patch patchutils subversion systemtap

sudo easy_install pip
sudo pip install tox
git clone https://github.com/rdo-infra/weirdo.git

test -f ~/.ssh/id_rsa.pub || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub |sudo tee -a /root/.ssh/authorized_keys

cat <<EOF >weirdo/hosts
localhost ansible_connection=local ansible_connection=ssh ansible_ssh_user=root
[openstack_nodes]
localhost log_destination=/var/log/weirdo ansible_connection=ssh ansible_ssh_user=root
EOF

cd ~/weirdo

nohup tox -e ansible-playbook -- -vv -i hosts playbooks/puppet-openstack-scenario001.yml -e ci_environment=ci-centos -e delorean_url=https://trunk.rdoproject.org/fedora/consistent/delorean.repo -e delorean_deps_url=https://trunk.rdoproject.org/fedora/delorean-deps.repo -e openstack_release=master -e version=master -e enable_puppet_modules_rpm=false -e nested_virtualization=false -e refspec=refs/changes/61/620561/21 &

tail -f nohup.out

Matthew Booth (mbooth-9)
tags: added: stein-rc-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)
Download full text (3.2 KiB)

Reviewed: https://review.openstack.org/626952
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3c5e2b0e9fac985294a949852bb8c83d4ed77e04
Submitter: Zuul
Branch: master

commit 3c5e2b0e9fac985294a949852bb8c83d4ed77e04
Author: Matthew Booth <email address hidden>
Date: Wed Jan 30 15:10:25 2019 +0000

    Eventlet monkey patching should be as early as possible

    We were seeing infinite recursion opening an ssl socket when running
    various combinations of python3, eventlet, and urllib3. It is not
    clear exactly what combination of versions are affected, but for
    background there is an example of this issue documented here:

    https://github.com/eventlet/eventlet/issues/371

    The immediate cause in nova's case was that we were calling
    eventlet.monkey_patch() after importing urllib3. Specifically, change
    Ie7bf5d012e2ccbcd63c262ddaf739782afcdaf56 introduced the
    nova.utils.monkey_patch() method to make monkey patching common
    between WSGI and non-WSGI services. Unfortunately, before executing
    this method you must first import nova.utils, which imports a large
    number of modules itself. Anything imported (transitively) by
    nova.utils would therefore be imported before monkey patching, which
    included urllib3. This triggers the infinite recursion problem
    described above if you have an affected combination of library
    versions.

    While this specific issue may eventually be worked around or fixed in
    eventlet or urllib3, it remains true that eventlet best practises are
    to monkey patch as early as possible, which we were not doing. To
    avoid this and hopefully future similar issues, this change ensures
    that monkey patching happens as early as possible, and only a minimum
    number of modules are imported first.

    This change fixes monkey patching for both non-wsgi and wsgi callers:

    * Non-WSGI services (nova/cmd)

      This is fixed by using the new monkey_patch module, which has minimal
      dependencies.

    * WSGI services (nova/api/openstack)

      This is fixed both by using the new monkey_patch module, and by moving
      the patching point up one level so that it is done before importing
      anything in nova/api/openstack/__init__.py.

      This move causes issues for some external tools which load this path
      from nova and now monkey patch where they previously did not. However,
      it is unfortunately unavoidable to enable monkey patching for the wsgi
      entry point without major restructuring. This change includes a
      workaround for sphinx to avoid this issue.

    This change has been through several iterations. I started with what
    seemed like the simplest and most obvious change, and moved on as I
    discovered more interactions which broke. It is clear that eventlet
    monkey patching is extremely fragile, especially when done implicitly at
    module load time as we do. I would advocate a code restructure to
    improve this situation, but I think the time would be better spent
    removing the eventlet dependency entirely.

    Co-authored-by: Lee Yarwood <lyarwood@redhat...

Read more...

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.openstack.org/647310

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/stein)
Download full text (3.3 KiB)

Reviewed: https://review.opendev.org/647310
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d0f540742efc1004b4d2c64814dcc6f7b9f0ccf6
Submitter: Zuul
Branch: stable/stein

commit d0f540742efc1004b4d2c64814dcc6f7b9f0ccf6
Author: Matthew Booth <email address hidden>
Date: Wed Jan 30 15:10:25 2019 +0000

    Eventlet monkey patching should be as early as possible

    We were seeing infinite recursion opening an ssl socket when running
    various combinations of python3, eventlet, and urllib3. It is not
    clear exactly what combination of versions are affected, but for
    background there is an example of this issue documented here:

    https://github.com/eventlet/eventlet/issues/371

    The immediate cause in nova's case was that we were calling
    eventlet.monkey_patch() after importing urllib3. Specifically, change
    Ie7bf5d012e2ccbcd63c262ddaf739782afcdaf56 introduced the
    nova.utils.monkey_patch() method to make monkey patching common
    between WSGI and non-WSGI services. Unfortunately, before executing
    this method you must first import nova.utils, which imports a large
    number of modules itself. Anything imported (transitively) by
    nova.utils would therefore be imported before monkey patching, which
    included urllib3. This triggers the infinite recursion problem
    described above if you have an affected combination of library
    versions.

    While this specific issue may eventually be worked around or fixed in
    eventlet or urllib3, it remains true that eventlet best practises are
    to monkey patch as early as possible, which we were not doing. To
    avoid this and hopefully future similar issues, this change ensures
    that monkey patching happens as early as possible, and only a minimum
    number of modules are imported first.

    This change fixes monkey patching for both non-wsgi and wsgi callers:

    * Non-WSGI services (nova/cmd)

      This is fixed by using the new monkey_patch module, which has minimal
      dependencies.

    * WSGI services (nova/api/openstack)

      This is fixed both by using the new monkey_patch module, and by moving
      the patching point up one level so that it is done before importing
      anything in nova/api/openstack/__init__.py.

      This move causes issues for some external tools which load this path
      from nova and now monkey patch where they previously did not. However,
      it is unfortunately unavoidable to enable monkey patching for the wsgi
      entry point without major restructuring. This change includes a
      workaround for sphinx to avoid this issue.

    This change has been through several iterations. I started with what
    seemed like the simplest and most obvious change, and moved on as I
    discovered more interactions which broke. It is clear that eventlet
    monkey patching is extremely fragile, especially when done implicitly at
    module load time as we do. I would advocate a code restructure to
    improve this situation, but I think the time would be better spent
    removing the eventlet dependency entirely.

    Co-authored-by: Lee Yarwood <lyarwood@re...

Read more...

tags: added: in-stable-stein
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.0.1

This issue was fixed in the openstack/nova 19.0.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 20.0.0.0rc1

This issue was fixed in the openstack/nova 20.0.0.0rc1 release candidate.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.