tox hangs due to pip backtracking during virtualenv generation

Bug #1975711 reported by Miguel
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Description
===========
On a fresh checkout of nova, running tox -e pep8 results in the process maxing out a CPU core and seemingly getting stuck (I terminated it after 30 minutes of no progress).

I believe this is due to pip trying to find a set of packages that exactly satisfy cross-requirements of all dependencies, checking multiple progressively older versions of each package until the tree becomes too complex to handle at all.
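To illustrate the kind of backtracking described above, here is a minimal sketch of a newest-first, depth-first resolver over a tiny made-up index. It is not pip's actual algorithm; the package names (app, linter, formatter, style), their versions, and the resolve helper are all hypothetical. It only shows how one tight constraint can force a resolver to walk back through progressively older releases.

```python
from typing import Callable, Dict, List, Optional, Tuple

Version = Tuple[int, ...]
Constraint = Callable[[Version], bool]

# Hypothetical index: package -> version -> {dependency: constraint}.
# Every name and version here is invented for illustration.
INDEX: Dict[str, Dict[Version, Dict[str, Constraint]]] = {
    "app": {(1, 0): {"linter": lambda v: v < (2, 0), "formatter": lambda v: True}},
    "linter": {(1, 0): {"style": lambda v: v < (2, 7)}},
    "formatter": {
        (3, 0): {"style": lambda v: v >= (2, 8)},
        (2, 0): {"style": lambda v: v >= (2, 7)},
        (1, 0): {"style": lambda v: v >= (2, 6)},
    },
    "style": {(2, 8): {}, (2, 7): {}, (2, 6): {}},
}


def resolve(
    todo: List[str],
    pinned: Dict[str, Version],
    constraints: Dict[str, List[Constraint]],
    tried: List[Tuple[str, Version]],
) -> Optional[Dict[str, Version]]:
    """Depth-first search, newest version first; returns pins or None."""
    if not todo:
        return pinned
    name, rest = todo[0], todo[1:]
    if name in pinned:
        return resolve(rest, pinned, constraints, tried)
    for version in sorted(INDEX[name], reverse=True):
        tried.append((name, version))
        # Reject candidates that violate constraints collected so far.
        if not all(ok(version) for ok in constraints.get(name, [])):
            continue
        deps = INDEX[name][version]
        # Reject candidates whose own requirements contradict earlier pins.
        if any(dep in pinned and not ok(pinned[dep]) for dep, ok in deps.items()):
            continue
        merged = {pkg: list(oks) for pkg, oks in constraints.items()}
        for dep, ok in deps.items():
            merged.setdefault(dep, []).append(ok)
        result = resolve(rest + list(deps), {**pinned, name: version}, merged, tried)
        if result is not None:
            return result
    return None  # every candidate failed, so the caller must backtrack


tried: List[Tuple[str, Version]] = []
result = resolve(["app"], {}, {}, tried)
print(result)
print(f"{len(tried)} candidates examined for {len(INDEX)} packages")
```

In this toy index the only solution uses the oldest formatter release, so the search examines every newer one first. With real indexes holding dozens of releases per package, the same pattern multiplies quickly.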

Steps to reproduce
==================

* Make a fresh checkout of nova, a shallow one works since we only need master:
git clone --depth 1 https://opendev.org/openstack/nova.git nova
This makes sure the tox virtualenv from an existing checkout isn't reused.

* From within the repo, run the pep8 tox environment with increased verbosity to see pip's output:
$ tox -vvv -e pep8

Expected result
===============
Tox successfully sets up its virtualenv and runs pep8.

Actual result
=============
pip downloads several versions of the same packages, printing a large number of messages like these for a few packages along the way:

INFO: pip is looking at multiple versions of certifi to determine which version is compatible with other requirements. This could take a while.
  Downloading certifi-2020.4.5-py2.py3-none-any.whl (156 kB)
     |████████████████████████████████| 156 kB 81.6 MB/s
  Downloading certifi-2019.11.28-py2.py3-none-any.whl (156 kB)
     |████████████████████████████████| 156 kB 86.8 MB/s
  Downloading certifi-2019.9.11-py2.py3-none-any.whl (154 kB)
     |████████████████████████████████| 154 kB 79.5 MB/s
  Downloading certifi-2019.6.16-py2.py3-none-any.whl (157 kB)
     |████████████████████████████████| 157 kB 71.6 MB/s
  Downloading certifi-2019.3.9-py2.py3-none-any.whl (158 kB)
     |████████████████████████████████| 158 kB 84.7 MB/s
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. If you want to abort this run, you can press Ctrl + C to do so. To improve how pip performs, tell us what happened here: https://pip.pypa.io/surveys/backtracking

Eventually it appears to get completely stuck after one of those downloads, maxing out a CPU core and making no visible progress until terminated.

Environment
===========
This happens in dev environments, not in OpenStack deployments. We've reproduced it on Fedora 35 and 36; I would expect others to be similarly impacted. Some system Python env info:

$ python -V
Python 3.10.4

$ pip show pip
Name: pip
Version: 21.3.1

$ pip show tox
Name: tox
Version: 3.24.5

Logs & Configs
==============
Reproduced on a fresh checkout with no altered configs.

Miguel (compi-migui) wrote :

I was able to resolve this on my machine by using pip-tools' pip-compile[1] to pin the versions of dependencies prior to running tox. This means that the tox virtualenv generation doesn't try to resolve all cross-dependencies (getting stuck along the way), and instead uses the pinned versions defined by pip-compile.

Doing this means slightly refactoring how requirements are defined in the project. I'm working on a draft patch to show what that would look like and will create a review soon.

[1] https://github.com/jazzband/pip-tools

sean mooney (sean-k-mooney) wrote :

In general we cannot simply use pip-compile to manage our requirements.txt.
We are not meant to cap or pin deps at the project level.
We use the requirements repo to manage global requirements for co-installability of services.

We may be able to use pip-compile in limited cases within tox by generating a temporary requirements file that we then use when creating the venv, but we would have to be very careful not to break our current requirements management strategy. So this might be a valid workaround for individual developers, but we may or may not be able to adopt it in nova.

sean mooney (sean-k-mooney) wrote :

Specifically, any change would have to be compatible with
https://github.com/openstack/governance/blob/master/reference/pti/python.rst
https://github.com/openstack/governance/blob/master/reference/runtimes/zed.rst
and not break integration with global requirement management controlled by
https://github.com/openstack/requirements/blob/master/global-requirements.txt
and https://github.com/openstack/requirements/blob/master/upper-constraints.txt

As a project we are not meant to pin explicit versions in requirements.txt or test-requirements.txt.

We can track a minimum version, but not set a maximum or pin explicitly.

sean mooney (sean-k-mooney) wrote :

I'm setting this to Opinion for now since this seems to be distro-specific.

We have confirmed that this can happen on Fedora 36, but we have also confirmed that 3.10 works on Debian and NixOS, and we have CI jobs running non-voting unit tests on 3.10 on Ubuntu 22.04.

So in general this does not appear to be a nova bug; it looks like it's either a Fedora issue or related to the tox/pip versions that are being used.

We may be able to work around it in nova, but I'm not sure we should do that, or can do that without changing what we test and how we test in an undesirable way.

Changed in nova:
status: New → Opinion
Miguel (compi-migui) wrote :

Thanks for the context. I'll do some more digging to isolate what exactly is breaking the install in Fedora. Hopefully it's as simple as the system tox/pip/etc. version, which would be easy to work around.

Kashyap Chamarthy (kashyapc) wrote :

I can (intermittently) reproduce it with the below versions on Fedora-36, installed from 'pip':

- Python 3.8.13
- pip 22.1.1
- tox-3.25.0

And my system package versions:

  python3-3.10.4-1.fc36.x86_64
  tox-3.25.0-1.fc36.noarch
  python3-pip-21.3.1-2.fc36.noarch

The following is one of the packages where 'pip' got stuck:
----------------------------------------------------------------------

$> tox -e pep8
[...]
INFO: pip is looking at multiple versions of eventlet to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of decorator to determine which version is compatible with other requirements. This could take a while.
Collecting decorator>=4.1.0
  Using cached decorator-5.1.0-py3-none-any.whl (9.1 kB)
  Using cached decorator-5.0.9-py3-none-any.whl (8.9 kB)
  Using cached decorator-5.0.8-py3-none-any.whl (8.9 kB)
  Using cached decorator-5.0.7-py3-none-any.whl (8.8 kB)
  Using cached decorator-5.0.6-py3-none-any.whl (8.8 kB)
  Using cached decorator-5.0.5-py3-none-any.whl (8.8 kB)
  Using cached decorator-4.4.2-py2.py3-none-any.whl (9.2 kB)
INFO: pip is looking at multiple versions of eventlet to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of decorator to determine which version is compatible with other requirements. This could take a while.
  Using cached decorator-4.4.1-py2.py3-none-any.whl (9.2 kB)
  Using cached decorator-4.4.0-py2.py3-none-any.whl (8.3 kB)
  Using cached decorator-4.3.2-py2.py3-none-any.whl (9.1 kB)
  Using cached decorator-4.3.1-py2.py3-none-any.whl (8.8 kB)
  Using cached decorator-4.3.0-py2.py3-none-any.whl (9.2 kB)
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.
  Using cached decorator-4.2.1-py2.py3-none-any.whl (9.3 kB)
  Using cached decorator-4.1.2-py2.py3-none-any.whl (9.1 kB)
  Using cached decorator-4.1.1-py2.py3-none-any.whl (9.0 kB)
  Using cached decorator-4.1.0-py2.py3-none-any.whl (9.0 kB)
----------------------------------------------------------------------

Miguel (compi-migui) wrote :

Reproduced in a Debian container, so it's not Fedora-specific at least. Identical output to Kashyap's:

INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.
  Downloading decorator-4.2.1-py2.py3-none-any.whl (9.3 kB)
  Downloading decorator-4.1.2-py2.py3-none-any.whl (9.1 kB)
  Downloading decorator-4.1.1-py2.py3-none-any.whl (9.0 kB)
  Downloading decorator-4.1.0-py2.py3-none-any.whl (9.0 kB)
INFO: pip is looking at multiple versions of pycodestyle to determine which version is compatible with other requirements. This could take a while.
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.

# cat /etc/debian_version
11.3
# python3 --version
Python 3.9.2
# pip3 --version
pip 20.3.4 from /usr/lib/python3/dist-packages/pip (python 3.9)
# pip3 show tox
Name: tox
Version: 3.25.0

Exact same thing on Fedora 33 and 36 containers (pulled 33 first by accident, free data point).

33:
[root@782408867bf9 /]# python3 --version
Python 3.9.7
[root@782408867bf9 /]# pip3 --version
pip 20.2.2 from /usr/lib/python3.9/site-packages/pip (python 3.9)
[root@782408867bf9 /]# pip3 show tox
Name: tox
Version: 3.25.0

36:
[root@699feac13be5 /]# python3 --version
Python 3.10.4
[root@699feac13be5 /]# pip3 --version
pip 21.3.1 from /usr/lib/python3.10/site-packages/pip (python 3.10)
[root@699feac13be5 /]# pip3 show tox
Name: tox
Version: 3.25.0

UBI python-38 works!
# python3 --version; pip3 --version; pip3 show tox
Python 3.8.12
pip 21.2.3 from /opt/app-root/lib64/python3.8/site-packages/pip (python 3.8)
Name: tox
Version: 3.25.0

UBI python-39 breaks:
# python3 --version; pip3 --version; pip3 show tox
Python 3.9.7
pip 21.2.3 from /opt/app-root/lib64/python3.9/site-packages/pip (python 3.9)
Name: tox
Version: 3.25.0

So it looks like all my attempts on Python 3.9/3.10 broke but the one on 3.8 worked. There doesn't seem to be a correlation with the pip/tox version, but maybe it could be a weird interaction between specific versions of all of them?
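One way to sanity-check that observation is to tabulate the container results listed above and group them by Python minor version and by pip version. The sketch below only restates the data points from this comment; the Run tuple and the outcomes_by helper are names made up for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, NamedTuple, Set


class Run(NamedTuple):
    image: str
    python: str
    pip: str
    outcome: str  # "works" or "breaks", as reported in this comment


RUNS: List[Run] = [
    Run("Debian 11.3", "3.9.2", "20.3.4", "breaks"),
    Run("Fedora 33", "3.9.7", "20.2.2", "breaks"),
    Run("Fedora 36", "3.10.4", "21.3.1", "breaks"),
    Run("UBI python-38", "3.8.12", "21.2.3", "works"),
    Run("UBI python-39", "3.9.7", "21.2.3", "breaks"),
]


def outcomes_by(key: Callable[[Run], str]) -> Dict[str, Set[str]]:
    """Group the observed outcomes by an arbitrary key of the run."""
    grouped: Dict[str, Set[str]] = defaultdict(set)
    for run in RUNS:
        grouped[key(run)].add(run.outcome)
    return dict(grouped)


# Reduce "3.9.7" to "3.9" so patch releases land in the same bucket.
by_python = outcomes_by(lambda r: ".".join(r.python.split(".")[:2]))
by_pip = outcomes_by(lambda r: r.pip)

print(by_python)
print(by_pip)
```

In these five runs pip 21.2.3 shows up with both outcomes, while each Python minor version maps to a single outcome. That is consistent with the failure tracking the Python version rather than the pip version, though five data points are far from conclusive.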

Sean mentioned it working with 3.10 on Ubuntu and NixOS, so those are worth trying.

Dockerfiles for easy reproduction:
-----
FROM fedora:36

USER root
RUN dnf install -y python3-pip git
RUN git clone --depth 1 https://opendev.org/openstack/nova.git && cd nova
RUN pip3 install tox
RUN cd ./nova && tox -vvv -e pep8 --notest
-----
FROM debian

USER root
RUN apt-get update -y && apt install -y python3-pip git
RUN git clone --depth 1 https://opendev.org/openstack/nova.git && cd nova
RUN pip3 install tox
RUN cd ./nova && tox -vvv -e pep8 --notest
-----
FROM registry.access.redhat.com/ubi8/python-39

USER root
RUN dnf install -y python3-pip git
RUN git clone --depth 1 https://opendev.org/openstack/nova.git && cd nova
RUN pip3 install tox
RUN cd ./nova && tox -vvv -e pep8

Miguel (compi-migui) wrote :

ubuntu:22.04 image also gets stuck on downloading old versions of decorator. ubuntu:20.04 works just fine.

# cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04 LTS"
# python3 --version; pip3 --version; pip3 show tox
Python 3.10.4
pip 22.0.2 from /usr/lib/python3/dist-packages/pip (python 3.10)
Name: tox
Version: 3.25.0

# cat /etc/os-release
PRETTY_NAME="Ubuntu 20.04.4 LTS"
# python3 --version; pip3 --version; pip3 show tox
Python 3.8.10
pip 20.0.2 from /usr/lib/python3/dist-packages/pip (python 3.8)
Name: tox
Version: 3.25.0

Noticed that openstack-tox-py310 runs on 22.04 and doesn't break:
https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_aa6/843228/3/check/openstack-tox-py310/aa673bd/job-output.txt

Tried "tox -vvv -e py310 --notest" on the 22.04 image and confirmed that it works just fine; it's the pep8 testenv specifically that breaks.

Miguel (compi-migui) wrote :

ah ha!

The difference between the functional-py310 and pep8 testenvs (per tox.ini) is a dep on autopep8, which pulls in 2 other packages[1]: 'toml' (unversioned) and 'pycodestyle >= 2.8.0'.

Modifying tox.ini to remove autopep8 and adding those two back in one at a time reveals that pycodestyle is at fault. Replacing it with unversioned pycodestyle lets the environment generate fine, so it's a problem with the specific version.

After letting the env generate with unversioned pycodestyle, examining it reveals that the flake8 version we're installing (3.8.4) requires[2] 'pycodestyle >= 2.6.0a1, < 2.7.0'. The <2.7.0 part was dropped in flake8 4.0.0.

We can't get flake8 4.0.0 because hacking requires 'flake8 >= 3.8.0, < 3.9.0'.

On why hacking can't use the newer flake8, found these:
https://review.opendev.org/c/openstack/hacking/+/816676
https://github.com/PyCQA/flake8/pull/1438

Going back to the origin of the problem, the last autopep8 version that allowed 'pycodestyle >= 2.6.0' (as opposed to 2.7.0 or 2.8.0) was 1.5.5. Pinning autopep8 to that version in tox.ini allows the environment to be generated, and pep8 passes to boot.
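The conflict traced above can be checked with a small sketch. It uses plain tuple comparison rather than pip's real version handling, treats the pre-release bound 2.6.0a1 as 2.6.0 for simplicity, and assumes an illustrative candidate list of pycodestyle releases. The helper names parse and compatible are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Version = Tuple[int, ...]


def parse(version: str) -> Version:
    """Turn '2.6.0' into (2, 6, 0); pre-release suffixes are not handled."""
    return tuple(int(part) for part in version.split("."))


def compatible(
    candidates: List[str],
    requirements: Dict[str, Callable[[Version], bool]],
) -> List[str]:
    """Return the candidates accepted by every requirement."""
    return [c for c in candidates if all(ok(parse(c)) for ok in requirements.values())]


# Illustrative subset of pycodestyle releases (assumption, not a full listing).
candidates = ["2.6.0", "2.7.0", "2.8.0"]


def flake8_384(v: Version) -> bool:
    # flake8 3.8.4: 'pycodestyle >= 2.6.0a1, < 2.7.0' (lower bound simplified).
    return parse("2.6.0") <= v < parse("2.7.0")


# autopep8 left unpinned, requiring 'pycodestyle >= 2.8.0' as in the report.
unpinned = {"flake8 3.8.4": flake8_384, "autopep8 (unpinned)": lambda v: v >= parse("2.8.0")}
# autopep8 pinned to 1.5.5, which still allowed 'pycodestyle >= 2.6.0'.
pinned = {"flake8 3.8.4": flake8_384, "autopep8 1.5.5": lambda v: v >= parse("2.6.0")}

print("unpinned autopep8:", compatible(candidates, unpinned))
print("autopep8 pinned to 1.5.5:", compatible(candidates, pinned))
```

With autopep8 unpinned no pycodestyle release satisfies both requirements, which is the dead end the resolver keeps backtracking from. With the 1.5.5 pin the two ranges overlap again.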

[1] https://github.com/hhatto/autopep8/blob/master/setup.py#L12-L14
[2] https://github.com/PyCQA/flake8/blob/3.8.4/setup.cfg#L45

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/843443

Changed in nova:
status: Opinion → In Progress
Miguel (compi-migui) wrote :

Now that hacking has had its flake8 requirement relaxed[1], this is fixed. 'tox -e pep8' resolves all dependencies, installs them and runs without issue.

No change needed in nova.

[1] https://review.opendev.org/c/openstack/hacking/+/816676/

Changed in nova:
status: In Progress → Invalid
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by "Miguel Garcia <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/nova/+/843443
Reason: Dependency conflict fixed in hacking, no need to work around the issue by pinning in nova: https://review.opendev.org/c/openstack/hacking/+/816676/
