vaultlocker service fails when some interfaces are DOWN with NO-CARRIER
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Bionic Backports | Fix Released | Undecided | James Page |
vaultlocker | Fix Released | High | Unassigned |
vaultlocker (Ubuntu) | Fix Released | High | James Page |
Disco | Fix Released | High | James Page |
Eoan | Fix Released | High | James Page |
Bug Description
[Impact]
Systems with block device encryption managed using vaultlocker will not boot if any interfaces are in a DOWN or NO-CARRIER state
[Test Case]
Deploy OpenStack with block device encryption using vaultlocker (charms)
Unplug or disable a network interface which is configured on the system.
Reboot - server will timeout on unlocking block devices on boot.
[Regression Potential]
Low - change simply removes the dependency on systemd-
[Original Bug Report]
On some hosts, it might be possible to have interfaces that are DOWN
with NO-CARRIER. In this case, systemd-
and fail. Therefore vaultlocker will also fail.
If vaultlocker fails it might impact the mount of the encrypted
partitions.
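A quick way to check whether a host has interfaces in this state is to look for the NO-CARRIER flag in `ip -o link` output. The following is only a minimal sketch: the function name and the sample text are illustrative assumptions and are not taken from the affected hosts.

```python
import re

# Matches one record of `ip -o link` output: index, name, flag list.
LINK_RE = re.compile(
    r"^\d+:\s+(?P<name>[^:@\s]+)(?:@\S+)?:\s+<(?P<flags>[^>]*)>"
)


def no_carrier_links(ip_output):
    """Return the names of interfaces whose flag list contains NO-CARRIER."""
    names = []
    for line in ip_output.splitlines():
        match = LINK_RE.match(line.strip())
        if match and "NO-CARRIER" in match.group("flags").split(","):
            names.append(match.group("name"))
    return names


# Illustrative sample in the usual `ip -o link` layout (not from the report).
SAMPLE = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
3: ens8: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
"""

if __name__ == "__main__":
    print(no_carrier_links(SAMPLE))
```

A NO-CARRIER interface is not sufficient on its own to trigger the failure; as discussed later in this report, the interface also has to be managed by systemd-networkd.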
Nicolas Pochet (npochet) wrote : | #1 |
Ryan Beisner (1chb1n) wrote : | #2 |
FYI py34 tests failed on the PR, investigating.
Ryan Beisner (1chb1n) wrote : | #3 |
FYI, I have proposed test updates, which will need to be rebased into your change once they merge. https:/
Nicolas Pochet (npochet) wrote : | #4 |
The change was rebased and the tests are passing.
Could someone review https:/
Changed in vaultlocker: | |
status: | New → Fix Released |
importance: | Undecided → High |
Nicolas Pochet (npochet) wrote : | #5 |
Could we please backport it and make it available to Bionic?
I faced that again for another customer deployment.
Changed in vaultlocker (Ubuntu Eoan): | |
assignee: | nobody → James Page (james-page) |
Changed in vaultlocker (Ubuntu Disco): | |
assignee: | nobody → James Page (james-page) |
Changed in vaultlocker (Ubuntu): | |
assignee: | nobody → James Page (james-page) |
status: | New → Triaged |
Changed in vaultlocker (Ubuntu Disco): | |
status: | New → Triaged |
Changed in vaultlocker (Ubuntu Eoan): | |
status: | New → Triaged |
Changed in vaultlocker (Ubuntu): | |
importance: | Undecided → High |
Changed in vaultlocker (Ubuntu Disco): | |
importance: | Undecided → High |
Changed in vaultlocker (Ubuntu Eoan): | |
importance: | Undecided → High |
Changed in vaultlocker (Ubuntu): | |
status: | Triaged → In Progress |
Changed in bionic-backports: | |
status: | New → In Progress |
assignee: | nobody → James Page (james-page) |
James Page (james-page) wrote : | #6 |
I've uploaded a new point release of vaultlocker for consideration for SRU; the point release includes only two changes - a previous fix that was already included as a patch (1.0.3-0ubuntu2) and the fix for this issue.
description: | updated |
Launchpad Janitor (janitor) wrote : | #7 |
This bug was fixed in the package vaultlocker - 1.0.4-0ubuntu1
---------------
vaultlocker (1.0.4-0ubuntu1) focal; urgency=medium
* New upstream release (LP: #1838607):
- d/p/*: Drop all patches, included in release.
-- James Page <email address hidden> Thu, 05 Dec 2019 14:49:04 +0000
Changed in vaultlocker (Ubuntu): | |
status: | In Progress → Fix Released |
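Since the fix ships in the 1.0.4 upstream release, a rough check of whether an installed package contains it is to compare the upstream part of the version string. This is a simplified sketch, not full Debian version ordering: it assumes a purely numeric, dot-separated upstream version, and the helper names are hypothetical. A distro patch could also carry the fix on an older upstream version, which this check would miss.

```python
# First upstream release that contains the fix, per this bug report.
FIXED_UPSTREAM = (1, 0, 4)


def upstream_version(package_version):
    """Return the upstream part of a package version as a tuple of ints.

    Simplification: drops an optional epoch and the Debian revision, and
    assumes the remaining upstream version is numeric and dot-separated.
    """
    without_epoch = package_version.split(":", 1)[-1]
    upstream = without_epoch.rsplit("-", 1)[0]
    return tuple(int(part) for part in upstream.split("."))


def includes_fix(package_version):
    """Heuristic: True if the upstream version is at least 1.0.4."""
    return upstream_version(package_version) >= FIXED_UPSTREAM


if __name__ == "__main__":
    for version in ("1.0.3-0ubuntu2", "1.0.4-0ubuntu1"):
        print(version, includes_fix(version))
```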
Hello Nicolas, or anyone else affected,
Accepted vaultlocker into eoan-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-
Further information regarding the verification process can be found at https:/
N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.
Changed in vaultlocker (Ubuntu Eoan): | |
status: | Triaged → Fix Committed |
tags: | added: verification-needed verification-needed-eoan |
Changed in vaultlocker (Ubuntu Disco): | |
status: | Triaged → Fix Committed |
tags: | added: verification-needed-disco |
Brian Murray (brian-murray) wrote : | #9 |
Hello Nicolas, or anyone else affected,
Accepted vaultlocker into disco-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-
Further information regarding the verification process can be found at https:/
N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.
Nicolas Pochet (npochet) wrote : | #10 |
Validation for disco.
I created a disco VM, configured vault on another machine and installed vaultlocker from the repo.
Used vaultlocker to encrypt a partition:
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 149.7M 1 loop /snap/vault/1822
loop1 7:1 0 54.9M 1 loop /snap/lxd/12631
loop2 7:2 0 89.1M 1 loop /snap/core/8268
sda 8:0 0 20G 0 disk
└─sda1 8:1 0 20G 0 part /
sdb 8:16 0 5G 0 disk
└─sdb1 8:17 0 5G 0 part
└─crypt-
As described in the original bug, there's an interface that is DOWN with NO-CARRIER:
ip l
1: lo: <LOOPBACK,
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens3: <BROADCAST,
link/ether 52:54:00:ad:4f:a6 brd ff:ff:ff:ff:ff:ff
3: ens8: <NO-CARRIER,
link/ether 52:54:00:2b:c5:56 brd ff:ff:ff:ff:ff:ff
When rebooting, we can see the following in the logs:
grep mnt /var/log/syslog
Jan 14 14:17:37 vm1 systemd[1]: Dependency failed for /mnt/test.
Jan 14 14:17:37 vm1 systemd[1]: mnt-test.mount: Job mnt-test.
Jan 14 14:17:42 vm1 systemd[1]: Mounting /mnt/test...
Jan 14 14:17:42 vm1 systemd[1]: Mounted /mnt/test.
The version of vaultlocker is:
dpkg -l | grep vault
ii vaultlocker 1.0.3-0ubuntu2 all Secure storage of dm-crypt keys in Hashicorp Vault
After an upgrade to disco-proposed for the vaultlocker package:
dpkg -l | grep vault
ii vaultlocker 1.0.4-0ubuntu0.
Rebooting the machine does not show the same errors in the logs:
grep mnt /var/log/syslog
Jan 14 14:24:30 vm1 systemd[983]: mnt-test.mount: Succeeded.
Jan 14 14:27:14 vm1 systemd[1]: Mounting /mnt/test...
Jan 14 14:27:14 vm1 systemd[1]: Mounted /mnt/test.
From the point of view of the original bug, this patch fixes the issue in disco-proposed.
tags: |
added: verification-done-disco removed: verification-needed-disco |
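The before/after comparison in the validation above comes down to looking for "Dependency failed" messages from systemd in syslog. A minimal sketch of that check follows; the function name is hypothetical, and the sample lines are the complete syslog lines quoted in the disco validation (the truncated line is left out).

```python
import re

# systemd logs "Dependency failed for <unit>." when a required unit fails.
DEP_FAILED_RE = re.compile(
    r"systemd\[\d+\]: Dependency failed for (?P<unit>.+?)\.$"
)


def failed_dependencies(lines):
    """Return the units systemd reported as 'Dependency failed for'."""
    found = []
    for line in lines:
        match = DEP_FAILED_RE.search(line.rstrip())
        if match:
            found.append(match.group("unit"))
    return found


# Syslog lines before the upgrade (1.0.3-0ubuntu2).
BEFORE = [
    "Jan 14 14:17:37 vm1 systemd[1]: Dependency failed for /mnt/test.",
    "Jan 14 14:17:42 vm1 systemd[1]: Mounting /mnt/test...",
    "Jan 14 14:17:42 vm1 systemd[1]: Mounted /mnt/test.",
]

# Syslog lines after upgrading from disco-proposed.
AFTER = [
    "Jan 14 14:24:30 vm1 systemd[983]: mnt-test.mount: Succeeded.",
    "Jan 14 14:27:14 vm1 systemd[1]: Mounting /mnt/test...",
    "Jan 14 14:27:14 vm1 systemd[1]: Mounted /mnt/test.",
]

if __name__ == "__main__":
    print(failed_dependencies(BEFORE))
    print(failed_dependencies(AFTER))
```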
Nicolas Pochet (npochet) wrote : | #11 |
Validation for eoan.
I created an eoan VM, configured vault on another machine and installed vaultlocker from the repo.
Used vaultlocker to encrypt a partition:
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 89.1M 1 loop /snap/core/8268
loop1 7:1 0 54.9M 1 loop /snap/lxd/12631
sda 8:0 0 20G 0 disk
└─sda1 8:1 0 20G 0 part /
sdb 8:16 0 5G 0 disk
└─sdb1 8:17 0 5G 0 part
└─crypt-
As described in the original bug, there's an interface that is DOWN with NO-CARRIER:
ip l
1: lo: <LOOPBACK,
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens3: <BROADCAST,
link/ether 52:54:00:ad:4f:a6 brd ff:ff:ff:ff:ff:ff
3: ens8: <NO-CARRIER,
link/ether 52:54:00:06:3e:6c brd ff:ff:ff:ff:ff:ff
When rebooting, we can see the following in the logs:
grep mnt /var/log/syslog
Jan 14 15:30:32 vm1 systemd[1]: Dependency failed for /mnt/test.
Jan 14 15:30:32 vm1 systemd[1]: mnt-test.mount: Job mnt-test.
The version of vaultlocker is:
dpkg -l | grep vault
ii vaultlocker 1.0.3-0ubuntu2 all Secure storage of dm-crypt keys in Hashicorp Vault
After an upgrade to eoan-proposed for the vaultlocker package:
dpkg -l | grep vault
ii vaultlocker 1.0.4-0ubuntu0.
Rebooting the machine does not show the same errors in the logs:
grep mnt /var/log/syslog
Jan 14 15:44:04 vm1 systemd[926]: mnt-test.mount: Succeeded.
Jan 14 15:46:46 vm1 systemd[1]: Mounting /mnt/test...
Jan 14 15:46:46 vm1 systemd[1]: Mounted /mnt/test.
From the point of view of the original bug, this patch fixes the issue in eoan-proposed.
tags: |
added: verification-done-eoan removed: verification-needed-eoan |
Nicolas Pochet (npochet) wrote : | #12 |
James, Brian,
Now that it has been tested and validated that the change works for Disco and Eoan, I guess that we'll have to wait 7 more days to have it in {disco,
What about Bionic? Is it possible to backport it?
The verification of the Stable Release Update for vaultlocker has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
Launchpad Janitor (janitor) wrote : | #14 |
This bug was fixed in the package vaultlocker - 1.0.4-0ubuntu0.
---------------
vaultlocker (1.0.4-
* New upstream point release including fix for an issue when
vaultlocker blocks boot if interfaces are in a down or no-carrier
state (LP: #1838607):
- d/p/*: Drop, all included in new point release.
-- James Page <email address hidden> Thu, 05 Dec 2019 16:22:36 +0000
Changed in vaultlocker (Ubuntu Eoan): | |
status: | Fix Committed → Fix Released |
Launchpad Janitor (janitor) wrote : | #15 |
This bug was fixed in the package vaultlocker - 1.0.4-0ubuntu0.
---------------
vaultlocker (1.0.4-
* New upstream point release including fix for an issue when
vaultlocker blocks boot if interfaces are in a down or no-carrier
state (LP: #1838607):
- d/p/*: Drop, all included in new point release.
-- James Page <email address hidden> Thu, 05 Dec 2019 16:22:58 +0000
Changed in vaultlocker (Ubuntu Disco): | |
status: | Fix Committed → Fix Released |
James Page (james-page) wrote : | #16 |
I've uploaded a backport of 1.0.4-0ubuntu0.
Nicolas Pochet (npochet) wrote : | #17 |
Validation for Bionic from James Page's PPA.
I created a bionic VM, configured vault on another machine and installed vaultlocker from the repo.
Used vaultlocker to encrypt a partition:
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 20G 0 disk
└─sda1 8:1 0 20G 0 part /
sdb 8:16 0 5G 0 disk
└─sdb1 8:17 0 5G 0 part
└─crypt-
As described in the original bug, there's an interface that is DOWN with NO-CARRIER:
ip l
1: lo: <LOOPBACK,
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens3: <BROADCAST,
link/ether 52:54:00:ad:4f:a6 brd ff:ff:ff:ff:ff:ff
3: ens8: <NO-CARRIER,
link/ether 52:54:00:06:3e:6c brd ff:ff:ff:ff:ff:ff
When rebooting, we can see the following in the logs:
grep mnt /var/log/syslog
Jan 15 11:07:33 vm1 systemd[1]: Dependency failed for /mnt/test.
Jan 15 11:07:33 vm1 systemd[1]: mnt-test.mount: Job mnt-test.
The version of vaultlocker is:
dpkg -l | grep vaultlocker
ii vaultlocker 1.0.3-0ubuntu1.
I upgraded the vaultlocker package from James Page's PPA:
sudo apt-add-repository ppa:james-
sudo apt update
sudo apt upgrade
dpkg -l | grep vaultlocker
ii vaultlocker 1.0.4-0ubuntu0.
Rebooting the machine does not show the same errors in the logs:
grep mnt /var/log/syslog
Jan 15 11:34:09 vm1 systemd[1]: Mounting /mnt/test...
Jan 15 11:34:09 vm1 systemd[1]: Mounted /mnt/test.
From the point of view of the original bug, this patch fixes the issue in the version proposed by James Page in his PPA. This package is in the queue to be backported to bionic-backports.
Edward Hope-Morley (hopem) wrote : | #18 |
I'm trying to understand why I do not see this issue. I have several interfaces DOWN and vaultlocker does not have this issue on boot:
root@chespin:~# ip a s| grep ": eno"
2: eno1: <NO-CARRIER,
3: eno2: <NO-CARRIER,
4: eno3: <NO-CARRIER,
5: eno4: <NO-CARRIER,
6: eno49: <BROADCAST,
7: eno50: <BROADCAST,
(reverse-
root@chespin:~# dpkg -l| grep vaultlocker
ii vaultlocker 1.0.3-0ubuntu1.
root@chespin:~# grep "Dependency failed" /var/log/syslog*
root@chespin:~#
It also appears you are using a VM, so I wonder if that somehow affects your issue. The only other vaultlocker boot issue I am aware of is bug 1804261, where it can time out reaching the Vault API, but that is a different problem.
Mauricio Faria de Oliveira (mfo) wrote : | #19 |
Hi Ed,
I looked into this, and the issue only happens if such interfaces (with state DOWN and NO-CARRIER) are managed by systemd-networkd (check with 'networkctl list').
Per systemd-
'By default, it will wait for all links it is aware of and which are managed by systemd-
[1] https:/
The issue can be reproduced in a VM with a NIC configured in libvirt XML as "<interface type='ethernet'
Hope this helps,
Mauricio
---
Create a VM (bionic):
---
$ uvt-simplestrea
$ uvt-kvm create --cpu 2 --memory 2048 --disk 4 --password password bionic release=bionic arch=amd64
$ uvt-kvm wait bionic
Give it an ethernet interface with link down:
---
$ virsh edit bionic
...
<interface type='ethernet'>
<model type='virtio'/>
<link state='down'/>
</interface>
...
Re-start the VM:
---
$ virsh shutdown bionic
$ virsh start bionic
$ uvt-kvm wait bionic
$ uvt-kvm ssh bionic
Check that systemd-
---
By default, only the 'ens3' interface is configured in netplan
(thus managed by systemd-networkd, default renderer in netplan).
$ ip -o l
1: lo: <LOOPBACK,
2: ens3: <BROADCAST,
3: ens7: <BROADCAST,
$ grep -r ens[0-9]: /etc/netplan
/etc/netplan/
$ ls -1 /run/systemd/
/run/systemd/
Notice that the 'ens7' interface (other/new) SETUP status is 'unmanaged':
$ sudo networkctl list
IDX LINK TYPE OPERATIONAL SETUP
1 lo loopback carrier unmanaged
2 ens3 ether routable configured
3 ens7 ether off unmanaged
3 links listed.
Thus, despite being 'state DOWN', systemd-
$ systemctl status systemd-
Process: 611 ExecStart=
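The SETUP column check above can be sketched as a small parser over `networkctl list` output. This is only an illustration: the function name is hypothetical, and treating the OPERATIONAL values "off" and "no-carrier" as "link down" is an assumption. The first sample is the output shown above; the second is a hypothetical variant in which ens7 is managed by systemd-networkd.

```python
def blocking_candidates(networkctl_output, down_states=("off", "no-carrier")):
    """Return links that networkd manages but that have no usable carrier.

    Assumption: a link counts as down if its OPERATIONAL column is one of
    `down_states`; a link is managed if its SETUP column is not 'unmanaged'.
    """
    candidates = []
    for line in networkctl_output.splitlines():
        fields = line.split()
        # Data rows have exactly five columns and start with a numeric index.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        _idx, link, _type, operational, setup = fields
        if setup != "unmanaged" and operational in down_states:
            candidates.append(link)
    return candidates


# Output as reported above: ens7 is down but unmanaged.
REPORTED = """\
IDX LINK TYPE OPERATIONAL SETUP
1 lo loopback carrier unmanaged
2 ens3 ether routable configured
3 ens7 ether off unmanaged
3 links listed.
"""

# Hypothetical variant: ens7 is configured in netplan and has no carrier.
HYPOTHETICAL = REPORTED.replace("off unmanaged", "no-carrier configuring")

if __name__ == "__main__":
    print(blocking_candidates(REPORTED))
    print(blocking_candidates(HYPOTHETICAL))
```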
Check that systemd-
---
If you just configure 'ens7' in netplan, even without setting/getting IP:
$ cat <<EOF | sudo tee /etc/netplan/
network:
ethernets:
ens7:
match:
macaddress: 52:54:00:fb:c6:b6
set-name: ens7
version: 2
EOF
$ sudo netplan apply
$ grep -r ens[0-9]: /etc/netplan
/etc/netplan/
/etc/netplan/
$ ls -1 /run/systemd/
Changed in bionic-backports: | |
status: | In Progress → Fix Released |
Edward Hope-Morley (hopem) wrote : | #20 |
@mfo thanks, yeah. The piece of info I was missing is that the interfaces need to be down AND have a netplan configuration for the issue to trigger, which was not the case in my repro.
https://github.com/openstack-charmers/vaultlocker/pull/7 tries to address the issue by removing the dependency on systemd-networkd-wait-online.
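To confirm that a given unit file no longer depends on the wait-online service, the [Unit] section can be scanned for directives that name it. The sketch below is an illustration only: both unit texts are hypothetical and do not reproduce the actual vaultlocker unit or the exact change in the pull request, and the function name and the list of directives checked are assumptions.

```python
WAIT_ONLINE = "systemd-networkd-wait-online.service"
# Directives that can order a unit after, or tie it to, another unit.
DEPENDENCY_KEYS = ("After", "Requires", "Wants", "BindsTo")


def wait_online_references(unit_text):
    """Return the [Unit] directives that name the wait-online service."""
    hits = []
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section != "Unit" or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() in DEPENDENCY_KEYS and WAIT_ONLINE in value.split():
            hits.append(key.strip())
    return hits


# Hypothetical unit text with the dependency present.
OLD_UNIT = """\
[Unit]
Description=example decrypt unit (hypothetical)
After=network-online.target systemd-networkd-wait-online.service
Requires=systemd-networkd-wait-online.service

[Service]
Type=oneshot
ExecStart=/bin/true
"""

# Hypothetical unit text with the dependency removed.
NEW_UNIT = """\
[Unit]
Description=example decrypt unit (hypothetical)
After=network-online.target

[Service]
Type=oneshot
ExecStart=/bin/true
"""

if __name__ == "__main__":
    print(wait_online_references(OLD_UNIT))
    print(wait_online_references(NEW_UNIT))
```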