disk-image-create broken in virtualenvs created with `python3 -m venv ...`

Bug #1745626 reported by Jeremy Audet on 2018-01-26
This bug affects 2 people
Affects Status Importance Assigned to Milestone

Bug Description

Python 3.3+ ships the "venv" module in the standard library. Among other things, one can create a virtualenv with it. Unfortunately, diskimage-builder doesn't work in these virtualenvs. Here's a simple reproducer:

    python3 -m venv ~/.venvs/diskimage-builder
    source ~/.venvs/diskimage-builder/bin/activate
    pip install diskimage-builder
    disk-image-create --help # boom!

Here's a sample traceback:

    $ disk-image-create
    Traceback (most recent call last):
      File "/home/ichimonji10/.venvs/diskimage-builder/bin/disk-image-create", line 11, in <module>
      File "/home/ichimonji10/.venvs/diskimage-builder/lib/python3.6/site-packages/diskimage_builder/disk_image_create.py", line 56, in main
      File "/home/ichimonji10/.venvs/diskimage-builder/lib/python3.6/site-packages/diskimage_builder/disk_image_create.py", line 36, in activate_venv
        globs = runpy.run_path(activate_this, globals())
      File "/usr/lib/python3.6/runpy.py", line 261, in run_path
        code, fname = _get_code_from_file(run_name, path_name)
      File "/usr/lib/python3.6/runpy.py", line 231, in _get_code_from_file
        with open(fname, "rb") as f:
    FileNotFoundError: [Errno 2] No such file or directory: '/home/ichimonji10/.venvs/diskimage-builder/bin/activate_this.py'

The issue is easy enough to track down. Here's a snippet from `disk_image_create.py`:

    def main():
        # If we are called directly from a venv install
        # (/path/venv/bin/disk-image-create) then nothing has added the
        # virtualenv bin/ dir to $PATH. the exec'd script below will be
        # unable to find call other dib tools like dib-run-parts.
        # One solution is to say that you should only ever run
        # disk-image-create in a shell that has already sourced
        # bin/activate.sh (all this really does is add /path/venv/bin to
        # $PATH). That's not a great interface as resulting errors will
        # be very non-obvious.
        # We can detect if we are running in a virtualenv and use
        # virtualenv's "activate_this.py" script to activate it ourselves
        # before we call the script. This ensures we have the path setting

In other words, according to this comment, both of these two workflows are supported:

    # typical workflow
    source ~/.venvs/diskimage-builder/bin/activate
    disk-image-create --help

    # workflow supported by activate_venv()

Unfortunately, activate_venv() makes assumptions about the internal structure about virtualenvs, and those assumptions aren't valid. Specifically, activate_venv() assumes that an executable called `activate_this.py` exists. `activate_this.py` does exist in virtualenvs created by `virtualenv`. It doesn't exist in virtualenvs created by the standard library's venv module.

Let's look a bit closer. Here's the full source of activate_venv():

    def activate_venv():
        if running_under_virtualenv():
            activate_this = os.path.join(sys.prefix, "bin", "activate_this.py")
            globs = runpy.run_path(activate_this, globals())
            del globs

Importantly, running_under_virtualenv() returns true when disk-image-create is called from a virtualenv, even when the virtualenv has already been activated. That means `activate_this.py` is unnecessarily executed in both of the following scenarios:

    python3 -m venv ~/.venvs/diskimage-builder/
    source ~/.venvs/diskimage-builder/bin/activate
    pip install diskimage-builder
    disk-image-create --help

    virtualenv ~/.venvs/diskimage-builder/
    source ~/.venvs/diskimage-builder/bin/activate
    pip install diskimage-builder
    disk-image-create --help

There's a couple possible fixes.

1. Drop the logic that automagically adds path-to-venv/bin/ to $PATH.
2. Make activate_venv() execute only when disk-image-create is in a venv *and* the venv hasn't been activated.

I like the first solution, because I dislike automagic things that break user expectations. My expectation is that virtualenvs are only active when I make them active, and this is the only Python application I'm aware of that tries to be clever about this. But that's a regression in functionality, and is therefore off the table (at least until a major version change). So, the second option is the more feasible one.

Ian Wienand (iwienand) wrote :

I'd be willing to consider any patches on this. I don't particularly like this either but people run the binary straight out of the virtualenv and get just as confused when it all breaks with weird tracebacks because it can't find the other binaries it needs. I remember at the time we looked around for other solutions, such as pkgutil but didn't really find anything that worked quite the same.

I think dib-run-parts is actually solved because we have diskimage_builder.paths.get_path('lib') where we run it from now. if there's some way we can know the canonical path to dib-block-device and others without this that would be good

Changed in diskimage-builder:
importance: Undecided → Medium
status: New → Confirmed
description: updated
Jeremy Audet (ichimonji10) wrote :

> if there's some way we can know the canonical path to dib-block-device and others without this that would be good

Yes, it's possible to discover the canonical path to `dib-block-device` and other executables without using `activate_this.py`. In fact, the current code base already assumes that it can find the bin directory:

    activate_this = os.path.join(sys.prefix, "bin", "activate_this.py")

The big question is: what happens next?

The current code updates `os.environ['PATH']` by using runpy to execute `activate_this.py` and then injecting its namespace dictionary into the current namespace dictionary, overwriting any existing keys:

    globs = runpy.run_path(activate_this, globals())
    del globs

I see two problems with this approach:

* This doesn't work for virtualenvs created by `python3 -m venv`, as no `activate_this.py` script is present.
* Overwriting the current namespace dictionary with all of the values from another module's namespace dictionary is, uh... not how I'd do it.

Here's a different approach:

    bin_dir = os.path.abspath(os.path.join(sys.prefix, 'bin'))
    paths = os.environ['PATH'].split(':')
    if bin_dir not in paths:
        # If a dependent executable (like dib-block-device) is installed in
        # several locations, we want to find the one in the virtualenv's
        # bin directory. Thus, prepend to PATH, instead of appending.
        os.environ['PATH'] = bin_dir + os.pathsep + os.environ['PATH']

I've attached a patch that implements this change, along with some other nice changes. Let me know if the patch is in an inappropriate format. I generated the patch with a simple `git show`, but I'm unsure if this is the correct approach.

I tested the changes with `tox`. I've also tested the changes by executing the following:

    python3 -m venv ~/.venvs/diskimage-builder-py3/
    source ~/.venvs/diskimage-builder-py3/bin/activate
    pip install -e .
    mkdir ~/diskimage-builder
    cd ~/diskimage-builder
    disk-image-create -a amd64 -o ubuntu-amd64 vm ubuntu
    sudo cp ubuntu-amd64.qcow2 /var/lib/libvirt/images/

I successfully booted the image.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers