vaultlocker spins indefinitely if it starts before dns configured

Bug #1868557 reported by Edward Hope-Morley
This bug affects 3 people
Affects               Status        Importance  Assigned to         Milestone
Bionic Backports      Fix Released  Undecided   Unassigned
vaultlocker           Fix Released  Undecided   Edward Hope-Morley
vaultlocker (Ubuntu)  Fix Released  High        Edward Hope-Morley
Eoan                  Fix Released  High        Unassigned
Focal                 Fix Released  High        Edward Hope-Morley

Bug Description

[Impact]
The vaultlocker-decrypt systemd units start too early in the boot process and as a result can't determine the local hostname of the machine they are running on, resulting in failure to retrieve keys from vault.

[Test Case]
This is somewhat tricky to reproduce as it's a bit of a race condition - the original bug reporter will help with testing as it was fairly reliably reproduced in the impacted deployment.

[Regression Potential]
Low - the fix (released as the only change in 1.0.6) simply ensures that nss-lookup.target has completed before running the vaultlocker-decrypt units and has been tested using overrides in the impacted deployment.

[Original Bug Report]
On a node that has multiple networks configured and uses vaultlocker to decrypt ceph osds, if vaultlocker (specifically the vaultlocker-decrypt systemd units) starts before dns is configured, it appears to spin forever when the vault url contains hostnames (i.e. not IP addresses). What we see is that there are no crypt- devices, and there are per-osd vaultlocker processes running which, when straced, are spinning in select(NULL, NULL, ...), which is socket.gethostname() at [1]. The only way to fix this currently is to manually restart the vaultlocker process so that the current dns settings are picked up. It appears that this behavior was introduced by the fix for bug 1838607 [2], which means that vaultlocker no longer waits for all networking to be up and ready and therefore does not wait for dns to be set up.
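
To check on an affected node whether the vault hostname resolves at a given moment, a bounded poll along these lines could be used. This is only a diagnostic sketch and not how vaultlocker itself behaves. The names vault_host and wait_for_dns, the timeout and interval values, and the example URL are assumptions made for illustration.

```python
import socket
import time
from urllib.parse import urlparse


def vault_host(url):
    """Return the host part of a vault URL (hypothetical helper)."""
    return urlparse(url).hostname


def wait_for_dns(host, timeout=30.0, interval=1.0,
                 resolve=socket.getaddrinfo, sleep=time.sleep,
                 clock=time.monotonic):
    """Poll until `host` resolves.

    Returns True as soon as resolution succeeds, or False once
    `timeout` seconds have passed without a successful lookup.
    The resolve/sleep/clock parameters exist only so the function
    can be exercised without real DNS.
    """
    deadline = clock() + timeout
    while True:
        try:
            resolve(host, None)
            return True
        except socket.gaierror:
            # Name resolution is not available (yet).
            if clock() >= deadline:
                return False
            sleep(interval)


if __name__ == "__main__":
    # Assumed example URL; substitute the url from the vaultlocker config.
    host = vault_host("https://vault.example.com:8200")
    print(host, "resolves" if wait_for_dns(host, timeout=10) else "does not resolve")
```

Unlike the hang described above, this gives up after the timeout instead of waiting forever, which makes it usable from a console during boot debugging.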

We tried adding After=nss-lookup.target to the vaultlocker-decrypt unit configs and rebooted the node and that resolved the problem.
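
The override described above can be sketched as a small script that writes a systemd drop-in. This is a minimal sketch under assumptions: the unit name vaultlocker-decrypt@.service is a guess (the report only says "vaultlocker-decrypt unit configs"), the file name override.conf is an arbitrary choice, and the helper names are hypothetical. The drop-in directory layout (/etc/systemd/system/<unit>.d/) is the standard systemd one.

```python
from pathlib import Path

# Assumption: name of the template unit; adjust to the units on the node.
UNIT = "vaultlocker-decrypt@.service"


def dropin_text(target="nss-lookup.target"):
    """Return drop-in content that orders the unit after `target`."""
    return "[Unit]\nAfter={}\n".format(target)


def dropin_path(unit=UNIT, root="/etc/systemd/system", name="override.conf"):
    """Return the path of the drop-in file for `unit`."""
    return Path(root) / (unit + ".d") / name


def write_dropin(unit=UNIT, root="/etc/systemd/system"):
    """Create the drop-in directory if needed and write the override."""
    path = dropin_path(unit, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(dropin_text())
    return path


if __name__ == "__main__":
    # Requires root when writing under /etc/systemd/system.
    print("wrote", write_dropin())
```

After writing the file, systemd has to reload its unit files (systemctl daemon-reload); in the report the node was rebooted to confirm the ordering change took effect.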

[1] https://github.com/openstack-charmers/vaultlocker/blob/master/vaultlocker/shell.py#L54
[2] https://github.com/openstack-charmers/vaultlocker/pull/7/files

Edward Hope-Morley (hopem) wrote :
Changed in vaultlocker:
assignee: nobody → Edward Hope-Morley (hopem)
status: New → In Progress
Changed in vaultlocker (Ubuntu):
status: New → In Progress
assignee: nobody → Edward Hope-Morley (hopem)
James Page (james-page)
Changed in vaultlocker:
status: In Progress → Fix Released
James Page (james-page)
Changed in vaultlocker (Ubuntu Focal):
status: In Progress → Fix Released
status: Fix Released → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package vaultlocker - 1.0.6-0ubuntu1

---------------
vaultlocker (1.0.6-0ubuntu1) focal; urgency=medium

  * New upstream point release including fix to ensure vaultlocker
    waits for hostname resolution to be up before starting (LP: #1868557).

 -- James Page <email address hidden> Mon, 23 Mar 2020 15:50:33 +0000

Changed in vaultlocker (Ubuntu Focal):
status: Fix Committed → Fix Released
James Page (james-page)
no longer affects: vaultlocker (Ubuntu Bionic)
Changed in vaultlocker (Ubuntu Eoan):
status: New → Triaged
importance: Undecided → High
Changed in vaultlocker (Ubuntu Focal):
importance: Undecided → High
James Page (james-page) wrote :
description: updated
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Edward, or anyone else affected,

Accepted vaultlocker into eoan-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/vaultlocker/1.0.6-0ubuntu0.19.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-eoan to verification-done-eoan. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-eoan. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in vaultlocker (Ubuntu Eoan):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-eoan
Edward Hope-Morley (hopem) wrote :

eoan-proposed verified using [Test Case] with output:

root@juju-773c15-lp1863014-eoan-10:/home/ubuntu# apt-cache policy vaultlocker
vaultlocker:
  Installed: 1.0.6-0ubuntu0.19.10.1
  Candidate: 1.0.6-0ubuntu0.19.10.1
  Version table:
 *** 1.0.6-0ubuntu0.19.10.1 500
        500 http://archive.ubuntu.com/ubuntu eoan-proposed/universe amd64 Packages
        100 /var/lib/dpkg/status
     1.0.4-0ubuntu0.19.10.1 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu eoan-updates/universe amd64 Packages
     1.0.3-0ubuntu2 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu eoan/universe amd64 Packages
root@juju-773c15-lp1863014-eoan-10:/home/ubuntu# journalctl -u system-vaultlocker\x2ddecrypt.slice
-- Logs begin at Wed 2020-04-15 17:26:55 UTC, end at Thu 2020-04-16 12:51:32 UTC. --
-- No entries --
root@juju-773c15-lp1863014-eoan-10:/home/ubuntu# mount| grep crypt
/dev/mapper/crypt-76a2e3b7-0977-4dcd-a0c9-ec036259bac1 on /var/lib/nova/instances type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)

tags: added: verification-done verification-done-eoan
removed: verification-needed verification-needed-eoan
tags: added: sts-sru-needed
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for vaultlocker has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package vaultlocker - 1.0.6-0ubuntu0.19.10.1

---------------
vaultlocker (1.0.6-0ubuntu0.19.10.1) eoan; urgency=medium

  * New upstream point release including fixes for:
    - 1.0.5 - Skip trying to decrypt device if it already exists
      (LP: #1863014).
    - 1.0.6 - vaultlocker spins indefinitely if it starts before dns
      is configured (LP: #1868557).

 -- James Page <email address hidden> Thu, 09 Apr 2020 09:41:41 +0100

Changed in vaultlocker (Ubuntu Eoan):
status: Fix Committed → Fix Released
James Page (james-page)
Changed in bionic-backports:
status: New → Fix Released