Boot and package install/update stalled due to linux-cloud-tools-common (hv-kvp-daemon.service)

Bug #1968013 reported by Christian Ehrhardt 
46
This bug affects 7 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Confirmed
High
Unassigned

Bug Description

I see a boot stall on a normal qemu/libvirt based KVM guest using an almost
fresh cloud image a stall on boot:

While live on the console I see:
[ OK ] Reached target System Time Set.
[ OK ] Finished Load AppArmor pro…s managed internally by snapd.
[* ] A start job is running for /sys/dev…misc/vmbus!hv_kvp (41s / 1min 30s)

Afterwards in the console I see:
[ TIME ] Timed out waiting for device es/virtual/misc/vmbus!hv_kvp.
[DEPEND] Dependency failed for Hyper-V KVP Protocol Daemon.

Note: Found after wondering why my system isn't available for ssh after start

In the journal this bonus minute can be seen for example here:
Apr 06 05:48:42 login-jammy systemd[1]: systemd-fsckd.service: Deactivated successfully.
Apr 06 05:49:41 login-jammy systemd[1]: sys-devices-virtual-misc-vmbus\x21hv_kvp.device: Job sys-devices-virtual-misc-vmbus\x21hv_kvp.device/start timed out.
Apr 06 05:49:41 login-jammy systemd[1]: Timed out waiting for device /sys/devices/virtual/misc/vmbus!hv_kvp.
Apr 06 05:49:41 login-jammy systemd[1]: Dependency failed for Hyper-V KVP Protocol Daemon.
Apr 06 05:49:41 login-jammy systemd[1]: hv-kvp-daemon.service: Job hv-kvp-daemon.service/start failed with result 'dependency'.
Apr 06 05:49:41 login-jammy systemd[1]: sys-devices-virtual-misc-vmbus\x21hv_kvp.device: Job sys-devices-virtual-misc-vmbus\x21hv_kvp.device/start failed with result 'timeout'.
Apr 06 05:49:41 login-jammy systemd[1]: Starting Initial cloud-init job (pre-networking)...

This behavior is from:
  /usr/lib/systemd/system/hv-kvp-daemon.service

$ dpkg -S hv-kvp-daemon.service
linux-cloud-tools-common: /lib/systemd/system/hv-kvp-daemon.service

It has the expected "should not matter elsewhere":
  ConditionVirtualization=microsoft
  ConditionKernelCommandLine=!snapd_recovery_mode

But it also has:
  BindsTo=sys-devices-virtual-misc-vmbus\x21hv_kvp.device

Right now it seems the latter overrules the former (in an obvious non microsoft
environment it does run and wait) and thereby makes it stall boot until it
gives up on it.

This is on my system since I wanted to do performance checks for
something completely else and is part of linux-cloud-tools-common which I had
installed for perf.

Repro:
1. get jammy system
2. install linux-cloud-tools-common
3. reboot and see the delay

I'd expect this stalls boot everywhere except when on azure, but I still need to try that to confirm it.

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-cloud-tools-common 5.15.0-25.25
ProcVersionSignature: Ubuntu 5.15.0-25.25-generic 5.15.30
Uname: Linux 5.15.0-25-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Apr 6 05:48 seq
 crw-rw---- 1 root audio 116, 33 Apr 6 05:48 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu80
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
CasperMD5CheckResult: unknown
Date: Wed Apr 6 09:04:02 2022
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Lsusb-t:
 /: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
 /: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
MachineType: QEMU Standard PC (Q35 + ICH9, 2009)
PackageArchitecture: all
PciMultimedia:

ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-25-generic root=UUID=16484d8a-587c-4c41-9201-e6079b9c6938 ro console=tty1 console=ttyS0
RelatedPackageVersions:
 linux-restricted-modules-5.15.0-25-generic N/A
 linux-backports-modules-5.15.0-25-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/01/2014
dmi.bios.release: 0.0
dmi.bios.vendor: SeaBIOS
dmi.bios.version: 1.14.0-2
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-q35-impish
dmi.modalias: dmi:bvnSeaBIOS:bvr1.14.0-2:bd04/01/2014:br0.0:svnQEMU:pnStandardPC(Q35+ICH9,2009):pvrpc-q35-impish:cvnQEMU:ct1:cvrpc-q35-impish:sku:
dmi.product.name: Standard PC (Q35 + ICH9, 2009)
dmi.product.version: pc-q35-impish
dmi.sys.vendor: QEMU

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

On a fresh AWS instance I already see it stalling on package install, I do not even need to reboot :-/

ubuntu@ip-172-31-16-15:~$ sudo apt install linux-cloud-tools-common
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  linux-cloud-tools-common
0 upgraded, 1 newly installed, 0 to remove and 1 not upgraded.
Need to get 72.4 kB of archives.
After this operation, 342 kB of additional disk space will be used.
Get:1 http://eu-south-1.ec2.archive.ubuntu.com/ubuntu jammy/main amd64 linux-cloud-tools-common all 5.15.0-25.25 [72.4 kB]
Fetched 72.4 kB in 0s (632 kB/s)
Selecting previously unselected package linux-cloud-tools-common.
(Reading database ... 63666 files and directories currently installed.)
Preparing to unpack .../linux-cloud-tools-common_5.15.0-25.25_all.deb ...
Unpacking linux-cloud-tools-common (5.15.0-25.25) ...
Setting up linux-cloud-tools-common (5.15.0-25.25) ...
Created symlink /etc/systemd/system/multi-user.target.wants/hv-fcopy-daemon.service → /lib/systemd/system/hv-fcopy-daemon.service.
Created symlink /etc/systemd/system/multi-user.target.wants/hv-kvp-daemon.service → /lib/systemd/system/hv-kvp-daemon.service.
Created symlink /etc/systemd/system/multi-user.target.wants/hv-vss-daemon.service → /lib/systemd/system/hv-vss-daemon.service.

Progress: [ 60%] [###################################################################################################..................................................................]

< here it is hanging ~1 minute >

Could not execute systemctl: at /usr/bin/deb-systemd-invoke line 142.
Processing triggers for man-db (2.10.2-1) ...
Scanning processes...
Scanning linux images...

It is interesting that it even hits:
Could not execute systemctl: at /usr/bin/deb-systemd-invoke line 142.

That is the try to start it from:
/var/lib/dpkg/info/linux-cloud-tools-common.postinst :63

Which has:
deb-systemd-invoke $_dh_action 'hv-fcopy-daemon.service' 'hv-kvp-daemon.service' 'hv-vss-daemon.service' >/dev/null || true

In Journal we can see the exact same time-out
Apr 06 09:12:36 ip-172-31-16-15 systemd[1]: sys-devices-virtual-misc-vmbus\x21hv_kvp.device: Job sys-devices-virtual-misc-vmbus\x21hv_kvp.device/start timed out.
Apr 06 09:12:36 ip-172-31-16-15 systemd[1]: Timed out waiting for device /sys/devices/virtual/misc/vmbus!hv_kvp.
Apr 06 09:12:36 ip-172-31-16-15 systemd[1]: Dependency failed for Hyper-V KVP Protocol Daemon.
Apr 06 09:12:36 ip-172-31-16-15 systemd[1]: hv-kvp-daemon.service: Job hv-kvp-daemon.service/start failed with result 'dependency'.
Apr 06 09:12:36 ip-172-31-16-15 systemd[1]: sys-devices-virtual-misc-vmbus\x21hv_kvp.device: Job sys-devices-virtual-misc-vmbus\x21hv_kvp.device/start failed with result 'timeout'.

summary: - Boot stall on non-azure systems due to linux-cloud-tools-common
+ Boot and package install/update stalled due to linux-cloud-tools-common
+ (hv-kvp-daemon.service)
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

For easier debugging this can be reproduced on any affected system (Tried KVM and EC2) by just running:

$ sudo systemctl start hv-kvp-daemon.service

^^ This will stall while one would assume it should quick-exit due to the ConditionVirtualization not being met.

A full cycle (90 second stall) of this start-try in the journal will look like:
Apr 06 09:17:01 ip-172-31-16-15 sudo[2290]: ubuntu : TTY=pts/1 ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/systemctl start hv-kvp-daemon.service
Apr 06 09:17:01 ip-172-31-16-15 sudo[2290]: pam_unix(sudo:session): session opened for user root(uid=0) by ubuntu(uid=1000)
Apr 06 09:17:01 ip-172-31-16-15 CRON[2294]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 06 09:17:01 ip-172-31-16-15 CRON[2295]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Apr 06 09:17:01 ip-172-31-16-15 CRON[2294]: pam_unix(cron:session): session closed for user root

Apr 06 09:18:31 ip-172-31-16-15 systemd[1]: sys-devices-virtual-misc-vmbus\x21hv_kvp.device: Job sys-devices-virtual-misc-vmbus\x21hv_kvp.device/start timed out.
Apr 06 09:18:31 ip-172-31-16-15 systemd[1]: Timed out waiting for device /sys/devices/virtual/misc/vmbus!hv_kvp.
Apr 06 09:18:31 ip-172-31-16-15 sudo[2290]: pam_unix(sudo:session): session closed for user root
Apr 06 09:18:31 ip-172-31-16-15 systemd[1]: Dependency failed for Hyper-V KVP Protocol Daemon.
Apr 06 09:18:31 ip-172-31-16-15 systemd[1]: hv-kvp-daemon.service: Job hv-kvp-daemon.service/start failed with result 'dependency'.
Apr 06 09:18:31 ip-172-31-16-15 systemd[1]: sys-devices-virtual-misc-vmbus\x21hv_kvp.device: Job sys-devices-virtual-misc-vmbus\x21hv_kvp.device/start failed with result 'timeout'.

Changed in linux (Ubuntu):
importance: Undecided → High
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.