apt feature broken on >=Yakkety due to new gpg agent

Bug #1645680 reported by ChristianEhrhardt on 2016-11-29
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
GnuPG
Incomplete
Unknown
curtin
Undecided
Unassigned
curtin (Ubuntu)
Medium
Unassigned
Xenial
Medium
Unassigned
Yakkety
High
Unassigned
Zesty
Medium
Unassigned

Bug Description

----- Begin SRU Template -----
[Impact]
The mechanism for adding PPAs during was prone to failure when installing
16.10 (yakkety) or newer systems.

A curtin install of yakkety with the following configuration would
fail:
  apt:
  sources:
    ignored1:
      source: "ppa:paelzer/yourppa"

[Test Case]
Install a yakkety or zesty system with the above configuration.
This can be accomplished by running the vmtest YakketyTestAptConfigCMDCMD
with the installed version of curtin.

It has configuration of
    apt:
      sources:
        ignored:
           source: "ppa:curtin-dev/test-archive"
        curtin-test1.list:
           source: "deb $MIRROR $RELEASE-proposed main"

[Regression Potential]

[Other Info]
This failure came as a result of change in behavior of gpg. Curtin
(indirectly through add-apt-repository) uses GPG to add PPAs into a
chroot. GPG2 began daemonizing itself, which meant that unmounts of the
filesystem would fail due to open filehandles of the daemonized gpg
process.

There is further discussion both on the bug and in the upstream
merge proposal [1] on other ways to do this. The solution taken was
a killall of processes named 'dirmgr' or 'gpg-agent' that were spawned
after the chroot.

[1] https://code.launchpad.net/~paelzer/curtin/curtin-bug-1645680-gpgagent/+merge/312143
----- End SRU Template -----

Hi,
while testing I found that when running apt feature related to add-apt-repository like:

apt:
  sources:
    ignored1:
      source: "ppa:paelzer/yourppa"

Or in fact any sort of add-apt-repository (also unrelated to the apt feature itself) like:
late_commands:
 01_install_ppa: ['curtin', 'in-target --', 'add-apt-repository --yes ppa:paelzer/bug-1645274-multipath-merge']

Then the installation fails.

Both use the chroot to execute in target, but recent add-apt-repository seems so cause daemons to spawn which then let the umount fail.

Failure is usually around something like:
"umount: /tmp/tmptmucmfm0/target/dev: target is busy"

Here an excerpt from a lsof +fg afterwards.
dirmngr 6771 root 1r CHR LG,0x80000 1,9 0t0 11 /tmp/tmptmucmfm0/target/dev/urandom
dirmngr 6771 root 2w CHR W,LG 1,3 0t0 6 /tmp/tmptmucmfm0/target/dev/null
gpg-agent 6776 root 0r CHR LG 1,3 0t0 6 /tmp/tmptmucmfm0/target/dev/null
gpg-agent 6776 root 1w CHR W,LG 1,3 0t0 6 /tmp/tmptmucmfm0/target/dev/null
gpg-agent 6776 root 2w CHR W,LG 1,3 0t0 6 /tmp/tmptmucmfm0/target/dev/null

One of them could be shut down by:
gpg-connect-agent --verbose KILLAGENT
But not dirmngr, that has to be killed.
Actually killing them seems ok (does not seem to create and later fallout).

Related branches

ChristianEhrhardt (paelzer) wrote :

FYI - at least for gpg-agent there seems to be a softer way to remove it which is

$ gpg-connect-agent --verbose KILLAGENT bye

ChristianEhrhardt (paelzer) wrote :

This can of course also be --quiet instead of --verbose, but one messages will stay.

I kind of got what I actually wanted to do (test against a ppa).
But the commands I used can be used to iterate on this issue with different solutions.

late_commands:
 01_1_install_ppa: ['curtin', 'in-target', '--', 'sh', '-c', "/bin/echo '#!/bin/sh' > /tmp/addppa"]
 01_2_install_ppa: ['curtin', 'in-target', '--', 'sh', '-c', "/bin/echo 'set -uxe' >> /tmp/addppa"]
 01_3_install_ppa: ['curtin', 'in-target', '--', 'sh', '-c', "/bin/echo 'DEBIAN_FRONTEND=noninteractive add-apt-repository --yes ppa:paelzer/bug-1645274-multipath-merge' >> /tmp/addppa"]

<Testing different cleanups here>

 01_6_install_ppa: ['curtin', 'in-target', '--', 'sh', '-c', "/bin/echo 'lsof +fg /dev' >> /tmp/addppa"]
 01_7_install_ppa: ['curtin', 'in-target', '--', 'sh', '-c', "/bin/chmod +x /tmp/addppa"]
 01_8_install_ppa: ['curtin', 'in-target', '--', 'sh', '-c', "/bin/cat /tmp/addppa"]
 01_9_install_ppa: ['curtin', 'in-target', '--', 'sh', '-c', "/tmp/addppa"]

I tried:
#1 Hard clean with pkill (killall would equally work)
 01_4_install_ppa: ['curtin', 'in-target', '--', 'sh', '-c', "/bin/echo 'pkill gpg-agent' >> /tmp/addppa"]
 01_5_install_ppa: ['curtin', 'in-target', '--', 'sh', '-c', "/bin/echo 'pkill dirmngr' >> /tmp/addppa"]

#2 soft kill with the actual agent closing itself gracefully
 01_5_install_ppa: ['curtin', 'in-target', '--', 'sh', '-c', "/bin/echo 'gpg-connect-agent --quiet KILLAGENT bye' >> /tmp/addppa"]

The first one works, while the latter fails in a chroot (works locally):
gpg-connect-agent: failed to create temporary file '/root/.gnupg/.#lk0x00005595a8a39c80.ubuntu.6958': No such file or directory
gpg-connect-agent: can't connect to the agent: No such file or directory
gpg-connect-agent: error sending standard options: No agent running

The issue of #2 is the same when calling it e.g. in a lxd guest.

So for now back to old style killing things :-/

ChristianEhrhardt (paelzer) wrote :

Maybe that could serve as a soft shutdown (untested)

gpgconf --kill gpg-agent

ChristianEhrhardt (paelzer) wrote :

I guess in general finding and tuning it to the point where it doesn't even get started might be the better way, but I have no idea where/how to do that.

ChristianEhrhardt (paelzer) wrote :

We discussed alternatives and options and agreed that we first have to file an upstream bug.

We should file a bug upstream and link it here.
Target is to understand if there is - or ever will be - something like a --oneshot option to exit it immediately. Preferably also controlled by an environment variable that can be inherited.

Another option - but probably over-engineering for that - could be a comeback of lxc-nschroot.
And clean up all in it when leaving, but that would be a major change and still error prone.
Some more background on alternatives: http://unix.stackexchange.com/questions/124162/reliable-way-to-jail-child-processes-using-nsenter

Fallback for now could be to add on the path out of the chroot (where we already clean the changed we made to rc.d policy) in curtin that does the cleanup "killall dirmngr |:" and not needed but maybe also the same for gpg-agent maybe.
A step further to be protected could be to add --younger-than and --user to the killall.
Add a comment that once gnupg came up with something official we should migrate to use that (long term).

Actions:
- gnupg upstream discussion
- The workaround into yakkety as fixup for now
- mid Term discussion on team sprint about using namespaces to solve it more generally

Changed in curtin:
status: New → Confirmed
Changed in curtin (Ubuntu):
status: New → Confirmed
Changed in curtin (Ubuntu Yakkety):
status: New → Triaged
importance: Undecided → High
ChristianEhrhardt (paelzer) wrote :

I was able to track the spawning down to calls to gpg done from add-apt-repository.

Simplified - that spawns one:
gpg --keyserver hkp://keyserver.ubuntu.com:80/ --recv 04450970750A7228C042CD4442C3EB37B6832E30

There already is our "oneshot" in the form of "--no-use-agent"
So that does not spawn, but still work as intended.

gpg --no-use-agent --keyserver hkp://keyserver.ubuntu.com:80/ --recv 04450970750A7228C042CD4442C3EB37B6832E30

Need to make the mode and check if other calls later on still spawn it.

Unfortunately that option is a dummy in newer gpg, so I opened a discussion upstream as https://bugs.gnupg.org/gnupg/issue2858.

Changed in gnupg:
status: Unknown → New
Changed in gnupg:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package curtin - 0.1.0~bzr435-0ubuntu1

---------------
curtin (0.1.0~bzr435-0ubuntu1) zesty; urgency=medium

  * New upstream snapshot.
    - pep8: fix pep8 errors found with 'make pep8' on zesty.
    - Workaround failures caused by gpg2 daemons left running in chroot.
      (LP: #1645680)
    - Install u-boot-tools when running on a system with u-boot.
      (LP: #1640519)
    - block: fix partition kname for raid devices (LP: #1641661)
    - Fix up tox errors that slipped in and new pycodestyle 2.1.0 complaints.
    - vmtests: adjust vmtest image sync metadata filenames
    - vmtests: Add centos support
    - Disable WilyTestRaid5Bcache vmtest
    - tools/xkvm: fix --netdev=<bridge>
    - bytes2human: fix for values larger than 32 bit int on 32 bit python2.

 -- Scott Moser <email address hidden> Thu, 01 Dec 2016 21:05:30 -0500

Changed in curtin (Ubuntu):
status: Confirmed → Fix Released
Scott Moser (smoser) on 2017-01-18
Changed in curtin (Ubuntu Xenial):
status: New → Fix Released
status: Fix Released → Confirmed
importance: Undecided → Medium
status: Confirmed → Triaged
Changed in curtin (Ubuntu Zesty):
importance: Undecided → Medium
Scott Moser (smoser) on 2017-01-18
description: updated
description: updated

Hello ChristianEhrhardt, or anyone else affected,

Accepted curtin into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/curtin/0.1.0~bzr437-0ubuntu1~16.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in curtin (Ubuntu Yakkety):
status: Triaged → Fix Committed
tags: added: verification-needed
Scott Moser (smoser) on 2017-01-31
Changed in curtin:
status: Confirmed → Fix Committed
Changed in curtin (Ubuntu Xenial):
status: Triaged → Fix Committed
tags: added: verification-needed-xenial verification-needed-yakkety
removed: verification-needed
Scott Moser (smoser) wrote :

Hi,
I've verified this in yakkety using vmtest from curtin and some tools at
https://gist.github.com/smoser/4e25b5a9e6245692be90098464990d00

$ git clone https://gist.github.com/4e25b5a9e6245692be90098464990d00.git vtools
$ export PATH=$PWD/vtools:$PATH

$ bzr branch curtin trunk
$ cd trunk
$ bzr revno
443

$ rel=yakkety
$ cname=test-$rel;
$ setup-curtin-container --proposed ubuntu-daily:$rel $cname
Creating test-yakkety

$ lxc exec $cname -- apt-cache policy curtin
curtin:
  Installed: 0.1.0~bzr437-0ubuntu1~16.10.1
  Candidate: 0.1.0~bzr437-0ubuntu1~16.10.1
  Version table:
 *** 0.1.0~bzr437-0ubuntu1~16.10.1 500
        500 http://archive.ubuntu.com/ubuntu yakkety-proposed/universe amd64 Packages
        100 /var/lib/dpkg/status
     0.1.0~bzr425-0ubuntu1 500
        500 http://archive.ubuntu.com/ubuntu yakkety/universe amd64 Packages

$ PATH=$PWD/../vtools:$PATH \
   CURTIN_VMTEST_CURTIN_EXE="curtin-from-container $cname curtin" \
   CURTIN_VMTEST_TOPDIR=$PWD/$rel-1645680 \
   ./tools/jenkins-runner \
      tests/vmtests/test_apt_config_cmd.py:YakketyTestAptConfigCMDCMD

I'm attaching the output directory yakkety-1645680.

Scott Moser (smoser) on 2017-01-31
tags: added: verification-done-yakkety
removed: verification-needed-yakkety
Scott Moser (smoser) wrote :

Hi,
I've verified this in yakkety using vmtest from curtin and some tools at
https://gist.github.com/smoser/4e25b5a9e6245692be90098464990d00

$ git clone https://gist.github.com/4e25b5a9e6245692be90098464990d00.git vtools
$ export PATH=$PWD/vtools:$PATH

$ bzr branch curtin trunk
$ cd trunk
$ bzr revno
443

$ rel=xenial
$ cname=test-$rel;
$ setup-curtin-container --proposed ubuntu-daily:$rel $cname
Creating test-xenial

$ lxc exec $cname apt-cache policy curtin
curtin:
  Installed: 0.1.0~bzr437-0ubuntu1~16.04.1
  Candidate: 0.1.0~bzr437-0ubuntu1~16.04.1
  Version table:
 *** 0.1.0~bzr437-0ubuntu1~16.04.1 500
        500 http://archive.ubuntu.com/ubuntu xenial-proposed/universe amd64 Packages
        100 /var/lib/dpkg/status
     0.1.0~bzr425-0ubuntu1~16.04.1 500
        500 http://archive.ubuntu.com/ubuntu xenial-updates/universe amd64 Packages
     0.1.0~bzr365-0ubuntu1 500
        500 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages

$ PATH=$PWD/../vtools:$PATH \
   CURTIN_VMTEST_CURTIN_EXE="curtin-from-container $cname curtin" \
   CURTIN_VMTEST_TOPDIR=$PWD/$rel-1645680 \
   ./tools/jenkins-runner \
      tests/vmtests/test_apt_config_cmd.py:YakketyTestAptConfigCMDCMD

I'm attaching the output directory xenial-1645680.

Scott Moser (smoser) wrote :

previous comment should have said :
  I've verified this in XENIAL using vmtest

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package curtin - 0.1.0~bzr437-0ubuntu1~16.04.1

---------------
curtin (0.1.0~bzr437-0ubuntu1~16.04.1) xenial-proposed; urgency=medium

  * debian/new-upstream-snapshot: change to not use bzr merge-upstream.
  * New upstream snapshot.
    - pep8: fix pep8 errors found with 'make pep8' on zesty.
    - Workaround failures caused by gpg2 daemons left running in chroot.
      (LP: #1645680)
    - Install u-boot-tools when running on a system with u-boot. (LP: #1640519)
    - block: fix partition kname for raid devices (LP: #1641661)
    - Fix up tox errors that slipped in and new pycodestyle 2.1.0 complaints.
    - vmtests: adjust vmtest image sync metadata filenames
    - vmtests: Add centos support
    - Disable WilyTestRaid5Bcache vmtest
    - tools/xkvm: fix --netdev=<bridge>
    - bytes2human: fix for values larger than 32 bit int on 32 bit python2.

 -- Scott Moser <email address hidden> Wed, 18 Jan 2017 12:39:01 -0500

Changed in curtin (Ubuntu Xenial):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for curtin has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package curtin - 0.1.0~bzr437-0ubuntu1~16.10.1

---------------
curtin (0.1.0~bzr437-0ubuntu1~16.10.1) yakkety-proposed; urgency=medium

  * debian/new-upstream-snapshot: change to not use bzr merge-upstream.
  * New upstream snapshot.
    - revert: Test Workaround: skip XenialTestNvme for a short time.
    - Test Workaround: skip XenialTestNvme for a short time.
    - pep8: fix pep8 errors found with 'make pep8' on zesty.
    - Workaround failures caused by gpg2 daemons left running in chroot.
      (LP: #1645680)
    - Install u-boot-tools when running on a system with u-boot. (LP: #1640519)
    - block: fix partition kname for raid devices (LP: #1641661)
    - Fix up tox errors that slipped in and new pycodestyle 2.1.0 complaints.
    - vmtests: adjust vmtest image sync metadata filenames
    - vmtests: Add centos support
    - Disable WilyTestRaid5Bcache vmtest
    - tools/xkvm: fix --netdev=<bridge>
    - bytes2human: fix for values larger than 32 bit int on 32 bit python2.

 -- Scott Moser <email address hidden> Mon, 23 Jan 2017 14:30:25 -0500

Changed in curtin (Ubuntu Yakkety):
status: Fix Committed → Fix Released
Scott Moser (smoser) wrote :

As an 'fyi', newer versions of curtin now have dropped the specific hack for dirmngr and gpg.
Instead they now use a pid namespace for all processes in the chroot.
This allows us to let the kernel kill off the delinquent processes when their parent exits.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.