Python apps broken in MicroStack (a classic snap) after Jan 14 snapcraft update

Bug #1860369 reported by Pete Vander Giessen on 2020-01-20
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
snapd
High
Zygmunt Krynicki

Bug Description

I ran into this bug on January 14th, when the snapcraft stable snap updated, but it's strange enough that it took me a week to come to terms with it (and rule out foolish things that I might have been doing).

In the MicroStack snap, we have a few Python scripts that are built using Python's entry point convention, where, after running setup.py, you wind up with a script in your bin dir that looks something like this:

    #!/usr/bin/env python

    from my_package import some_function

    if __name__ == '__main__'
        some_function()

That's all very straightforward and normal, and our scripts were working beautifully before January 14th. Post snapcraft 3.9.5, however, the import of <my_package> will fail, causing the script to bail with an Exception.

MicroStack is a large and complex snap, and test cycles can run upwards of an hour. I made a much simpler snap that reproduces the bug, and pushed it to my public github account here: https://github.com/petevg/pedrolino

If you run snapcraft on a checkout of that reproducer, then snap install the resultant snap and run pedrolino.test (or just run reproducer.sh in the repo's root), you should be able to reproduce the problem: the .test script will fail with a DistributionNotFound error message.

Interestingly, the following set of commands will work:

    sudo snap run --shell pedrolino.test
    $SNAP/usr/bin/python3 $SNAP/bin/pedrolino

It looks like something broke with the pathing, and Python scripts with entry points are using the wrong Python, or are otherwise winding up without the correct libraries in their PYTHONPATH. Do you know why this might have happened, and do you know how we might work around the issue? (I kind of have a bad feeling that the workaround might be to find every invocation of a Python script in the snapcraft.yaml and MicroStack tooling, and explicitly call the scripts with usr/bin/python3 ...)

Pete Vander Giessen (petevg) wrote :

Is this possibly related to the switch from using pip.wheel to using pip.install when building packages?

That change happened here: https://github.com/snapcore/snapcraft/commit/9bac0fd647c22e2207a3b27b886ea12a8809ce30

Said change caused other broken-ness in MicroStack, though the workaround turned out to be a one line change.

Specifically I was building an OpenStack component called horizon from source in MicroStack's openstack-projects part. A horizon version is specified in the OpenStack upper constraints file, which we reference in the openstack-projects part. Pip.wheel was happy to just build from our horizon source tarball, but pip.install failed, complaining about being unable to apply a version of a source-built library. This is a reasonable error message, and I simply switched to installing horizon via pip (the constraints basically just points at the version in pip that matches the tarball that I was downloading).

I'm wondering if there are further consequences from moving away from wheels. Though I do see everything living happily in the snap's site packages. This really is a weird one :-/

summary: Issue running Python daemons that have been built using entry points
- with the latest snapcraft
+ with the latest snapcraft (in a --classic snap)

Also, one important note: this problem does not occur in a strictly confined version of the reproducer snap. This appears to be an issue w/ classic confinement (the issue is not addressed by adding PATH stuff to the environment -- see https://opendev.org/x/microstack/src/branch/master/snapcraft.yaml for a reference list of environment parameters that do *not* address the issue).

Pete Vander Giessen (petevg) wrote :

A note on workarounds: I attempted to explicitly run my entry point scripts with usr/bin/python3.6, but the issue persists, which suggests that it is more of an issue w/ PYTHONPATH getting clobbered or not properly set than it is an issue running the wrong Python binary.

I also tried starting my daemons after a sleep (assuming that we were running into a race condition), and that did not make a difference.

Chris Patterson (cjp256) wrote :

Thanks for the report and sorry for any troubles you're seeing here. While I search for the cause of the change in behavior, there is a minor issue in the snapcraft.yaml that may address your issue:

`command: pedrolino` should be: `command: bin/pedrolino`

Then the command gets updated to reflect it: `command: usr/bin/python3 $SNAP/bin/pedrolino`

With the above change and snapcraft 3.9.7:
+ pedrolino.test
If you are reading this, you have not reproduced the bug!

Changed in snapcraft:
assignee: nobody → Chris Patterson (cjp256)
status: New → Confirmed
Pete Vander Giessen (petevg) wrote :

Here is a pastebin of the nginx error:

https://paste.ubuntu.com/p/rGDzgn8tPX/

Pete Vander Giessen (petevg) wrote :

Also, here's a paste of the error, but with some debugging output about the Python environment.

Note that, in os.environ, LD_LIBRARY_PATH matches the one we setup in environment in our snapcraft.yaml, but PATH seems to simply be the default system path, with none of the changes that we specify in our snapcraft.

Chris Patterson (cjp256) wrote :
Download full text (3.9 KiB)

I investigated further and there does indeed appear to be a bug with classic snaps, but appears to be on the snapd side of the house.

I expanded @petevg's test case:
https://github.com/cjp256/pedrolino

For the two apps in that snap that use the shell, PATHs appear to be updated to reflect `environment`. For the two apps that do not use the shell (python script and minimal gcc built binary), the PATHs are missing the updates. There is a reproducer.sh in there that's updated to build and run the commands using the snap(craft) try commands.

I think snapcraft triggered this bug for Microstack because it stopped creating the unnecessary wrapper for nginx app, which dropped the shell script usage, which had the correct behavior.

##########################################################################
##########################################################################

**********
[pedrolino]
/snap/pedrolino/x8/usr/bin/python3 /snap/pedrolino/x8/bin/pedrolino
PATH: /snap/pedrolino/x8/usr/bin:/home/chris/.local/bin:/home/chris/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/var/lib/snapd/snap/bin
hi
**********
[pedrolino.wrapper]
in shell script
/snap/pedrolino/x8/bin/pedrolino.wrapper
PATH: /snap/pedrolino/x8/usr/bin:/home/chris/.local/bin:/home/chris/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/var/lib/snapd/snap/bin
execing...
/snap/pedrolino/x8/usr/bin/python3 /snap/pedrolino/x8/bin/pedrolino
PATH: /snap/pedrolino/x8/usr/bin:/home/chris/.local/bin:/home/chris/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/var/lib/snapd/snap/bin
hi
**********
[pedrolino.sh]
/snap/pedrolino/x8/bin/test.sh
PATH: /snap/pedrolino/x8/usr/bin:/home/chris/.local/bin:/home/chris/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/var/lib/snapd/snap/bin
**********
[printpath]
/snap/pedrolino/x8/bin/printpath
PATH: /snap/pedrolino/x8/usr/bin:/home/chris/.local/bin:/home/chris/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/var/lib/snapd/snap/bin

##########################################################################
##########################################################################

Versus strict:
**********
[pedrolino]
/snap/pedrolino/x9/usr/bin/python3 /snap/pedrolino/x9/bin/pedrolino
PATH: /snap/pedrolino/x9/usr/sbin:/snap/pedrolino/x9/usr/bin:/snap/pedrolino/x9/sbin:/snap/pedrolino/x9/bin:/snap/pedrolino/x9/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
hi
**********
[pedrolino.wrapper]
in shell script
/snap/pedrolino/x9/bin/pedrolino.wrapper
PATH: /snap/pedrolino/x9/usr/sbin:/snap/pedrolino/x9/usr/bin:/snap/pedrolino/x9/sbin:/snap/pedrolino/x9/bin:/snap/pedrolino/x9/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
execing...
/snap/pedrolino/x9/usr/bin/python3 /snap/pedrolino/x9/bin/pedr...

Read more...

affects: snapcraft → snapd
Changed in snapd:
assignee: Chris Patterson (cjp256) → nobody
Chris Patterson (cjp256) wrote :

Sorry, I posted the wrong classic snippet above.

[pedrolino]
/snap/pedrolino/x9/usr/bin/python3 /snap/pedrolino/x9/bin/pedrolino
PATH: /home/chris/.local/bin:/home/chris/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/var/lib/snapd/snap/bin
hi
**********
[pedrolino.wrapper]
in shell script
/snap/pedrolino/x9/bin/pedrolino.wrapper
PATH: /snap/pedrolino/x9/usr/bin:/home/chris/.local/bin:/home/chris/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/var/lib/snapd/snap/bin
execing...
/snap/pedrolino/x9/usr/bin/python3 /snap/pedrolino/x9/bin/pedrolino
PATH: /snap/pedrolino/x9/usr/bin:/home/chris/.local/bin:/home/chris/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/var/lib/snapd/snap/bin
hi
**********
[pedrolino.sh]
/snap/pedrolino/x9/bin/test.sh
PATH: /snap/pedrolino/x9/usr/bin:/home/chris/.local/bin:/home/chris/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/var/lib/snapd/snap/bin
**********
[printpath]
/snap/pedrolino/x9/bin/printpath
PATH: /home/chris/.local/bin:/home/chris/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/var/lib/snapd/snap/bin

Chris Patterson (cjp256) wrote :

And the relevant snap.yaml generated by snapcraft 3.9.7 is available for reference:
https://github.com/cjp256/pedrolino/blob/master/snap.yaml

summary: - Issue running Python daemons that have been built using entry points
- with the latest snapcraft (in a --classic snap)
+ Python entry points broken in MicroStack (a classic snap) after Jan 14
+ snapcraft update

Seems the root cause of the issue is that when we dropped the wrapper we stopped creating a shell for every command we ran, when running in a shell the last environment variable was used, whereas python prefers to pick up the first element in the environment.

summary: - Python entry points broken in MicroStack (a classic snap) after Jan 14
- snapcraft update
+ Python apps broken in MicroStack (a classic snap) after Jan 14 snapcraft
+ update
Chris Patterson (cjp256) wrote :

The scope of the issue does not appear to be limited to python. See `printpath` in the the example linked above to see that a simple binary executable has the incorrect path.

In the meantime, here is a proposal for snapcraft to _always_ generate the snapcraft-runner (inserted into the command-chain), which does appear to workaround the issue. https://github.com/snapcore/snapcraft/pull/2889

The difference here is that the snapcraft-runner will effectively just no-op in the cases where it wasn't generated before. Not ideal, but can be removed later when a proper fix becomes available.

Changed in snapd:
importance: Undecided → High
tags: added: core20
Changed in snapd:
assignee: nobody → Ian Johnson (anonymouse67)
Changed in snapd:
status: Confirmed → Triaged
Zygmunt Krynicki (zyga) wrote :

I'm working on parts of the snapd side problem:

https://github.com/snapcore/snapd/pull/8230

This PR contains a test illustrating what is broken.

Zygmunt Krynicki (zyga) wrote :

Fixing this bug will have impact on the perceived environment variables.

Currently the environment definitions from application-level declarations are appended to the snap-level declarations and inherited environment.

Once this bug is fixed the result will be deterministic across all applications.
Currently the result differs depending on the type of application used.

I've prepared a program naughty-env that can be used to craft environment with duplicate keys and run an arbitrary program with it. I tested three cases: C program, Python 3 program, Bash and Dash (sh).

Programs that use libc for accessing environment key-by-key see the first definition ("old" in the snippet below). Shells seem to build a mapping, overwriting old values, see the last definition ("new in the snippet below).

$ ./naughty-env bash KEY=old KEY=new -c 'echo $KEY'
new
$ ./naughty-env dash KEY=old KEY=new -c 'echo $KEY'
new
$ ./naughty-env python3 KEY=old KEY=new -c "import os; k='KEY'; print(os.environ[k], os.getenv(k))"
old old
$ ./naughty-env $(pwd)/getenv KEY=old KEY=new KEY
KEY=old

Once we fix this bug all applications will behave like shell does. Specifically snap-exec will provide an environment block with just one definition of PATH (or any clashing variable). The value will depend on what is defined at each level in snap.yaml. If the value refers to the old value (e.g. PATH: $SNAP/bin:$PATH) then the result will contain the expanded value. If it doesn't it will just contain the new value.

Changed in snapd:
assignee: Ian Johnson (anonymouse67) → Zygmunt Krynicki (zyga)
status: Triaged → In Progress
Zygmunt Krynicki (zyga) wrote :

I'm re-assigning this to myself since I have a local fix and will be opening a branch with the changes soon.

Zygmunt Krynicki (zyga) wrote :

The branch has been heavily modified. It will not make it to 2.44, re-assigning to 2.45

Changed in snapd:
milestone: none → 2.45
Zygmunt Krynicki (zyga) wrote :

This has been merged and should be available in edge in a few hours. We would appreciate some cross-check form your side.

Changed in snapd:
status: In Progress → Fix Committed
Zygmunt Krynicki (zyga) wrote :

Snapd 2.45 is now in the stable channel. Marking as released.

Changed in snapd:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers