Comment 4 for bug 2039441

Robie Basak (racb) wrote :

I tried uvt-kvm wait three times against the following image and it consistently works:

http://cloud-images.ubuntu.com/daily/server/focal/20231003/focal-server-cloudimg-amd64.img

I tried uvt-kvm wait three times against the following image and it consistently fails:

http://cloud-images.ubuntu.com/releases/focal/release-20231011/ubuntu-20.04-server-cloudimg-amd64.img

Differences in package versions between these images are as follows:

$ diff -u0 good bad
--- good 2023-10-16 16:13:54.245903813 +0000
+++ bad 2023-10-16 16:14:04.801959310 +0000
@@ -30 +30 @@
-cloud-init 23.2.2-0ubuntu0~20.04.1
+cloud-init 23.3.1-0ubuntu1~20.04.1
@@ -43 +43 @@
-curl 7.68.0-1ubuntu2.19
+curl 7.68.0-1ubuntu2.20
@@ -101,2 +101,2 @@
-grub-efi-amd64-bin 2.06-2ubuntu14.2
-grub-efi-amd64-signed 1.187.4~20.04.1+2.06-2ubuntu14.2
+grub-efi-amd64-bin 2.06-2ubuntu14.4
+grub-efi-amd64-signed 1.187.6~20.04.1+2.06-2ubuntu14.4
@@ -175,2 +175,2 @@
-libcurl3-gnutls:amd64 7.68.0-1ubuntu2.19
-libcurl4:amd64 7.68.0-1ubuntu2.19
+libcurl3-gnutls:amd64 7.68.0-1ubuntu2.20
+libcurl4:amd64 7.68.0-1ubuntu2.20
@@ -380,8 +380,8 @@
-linux-headers-5.4.0-163 5.4.0-163.180
-linux-headers-5.4.0-163-generic 5.4.0-163.180
-linux-headers-generic 5.4.0.163.160
-linux-headers-virtual 5.4.0.163.160
-linux-image-5.4.0-163-generic 5.4.0-163.180
-linux-image-virtual 5.4.0.163.160
-linux-modules-5.4.0-163-generic 5.4.0-163.180
-linux-virtual 5.4.0.163.160
+linux-headers-5.4.0-164 5.4.0-164.181
+linux-headers-5.4.0-164-generic 5.4.0-164.181
+linux-headers-generic 5.4.0.164.161
+linux-headers-virtual 5.4.0.164.161
+linux-image-5.4.0-164-generic 5.4.0-164.181
+linux-image-virtual 5.4.0.164.161
+linux-modules-5.4.0-164-generic 5.4.0-164.181
+linux-virtual 5.4.0.164.161
@@ -579,4 +579,4 @@
-vim 2:8.1.2269-1ubuntu5.17
-vim-common 2:8.1.2269-1ubuntu5.17
-vim-runtime 2:8.1.2269-1ubuntu5.17
-vim-tiny 2:8.1.2269-1ubuntu5.17
+vim 2:8.1.2269-1ubuntu5.18
+vim-common 2:8.1.2269-1ubuntu5.18
+vim-runtime 2:8.1.2269-1ubuntu5.18
+vim-tiny 2:8.1.2269-1ubuntu5.18
@@ -589 +589 @@
-xxd 2:8.1.2269-1ubuntu5.17
+xxd 2:8.1.2269-1ubuntu5.18

cloud-init seems like the most likely change to be affecting this issue.
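
For anyone wanting to repeat this comparison against other image pairs, here is a minimal Python sketch of the same idea as the diff above. It assumes each manifest is plain text with one "package version" pair per line, as in the good and bad files here; the function names parse_manifest and changed_packages are made up for illustration.

```python
def parse_manifest(text):
    """Map package name -> version from 'name version' lines."""
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank or malformed lines rather than guessing at them.
        if len(fields) >= 2:
            packages[fields[0]] = fields[1]
    return packages


def changed_packages(good_text, bad_text):
    """Return {name: (good_version, bad_version)} for packages that differ.

    A package present in only one manifest shows up with None on the
    other side.
    """
    good = parse_manifest(good_text)
    bad = parse_manifest(bad_text)
    changes = {}
    for name in sorted(good.keys() | bad.keys()):
        old, new = good.get(name), bad.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes
```

Feeding it the contents of the two manifests would list cloud-init alongside the curl, grub, kernel and vim changes shown above; it does nothing to rank which change is the likely culprit.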

So then I tried downgrading cloud-init to 23.2.2-0ubuntu0~20.04.1 in 20231011 using mount-image-callback. I tried uvt-kvm wait against this three times, and it consistently works.

To rule out the possibility that my method itself had accidentally fixed the issue, I tried reinstalling cloud-init 23.3.1-0ubuntu1~20.04.1 using mount-image-callback. I tried uvt-kvm wait against this three times, and it consistently fails.

Conclusion: something changed in cloud-init 23.3.1-0ubuntu1~20.04.1 that consistently causes early ssh to fail, where it did not before. This could be because the ssh listening socket is now open before pam_nologin is deactivated, which didn't happen before, or because the window during which pam_nologin is active has grown. The former suggests a behavioural change; the latter suggests a "time to ssh" boot speed regression. So it seems appropriate to add a cloud-init regression task to this bug.

Nevertheless we should also fix this in uvt-kvm wait.
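
One possible shape for such a fix is to retry a full ssh login until it succeeds, rather than treating an open listening socket as "ready". The following is a minimal sketch under that assumption, not taken from uvtool's source; wait_for_login and ssh_probe are hypothetical names, and the timeout and interval values are arbitrary.

```python
import subprocess
import time


def ssh_probe(host):
    """Return True only if a full ssh login can run a trivial command.

    A login rejected while pam_nologin is still active makes ssh exit
    non-zero, so it counts as "not ready yet" rather than as a failure.
    """
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_login(probe, timeout=120.0, interval=2.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it returns True or timeout seconds have elapsed.

    clock and sleep are injectable so the loop can be tested without
    real delays.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would use something like wait_for_login(lambda: ssh_probe("ubuntu@192.0.2.10")), where the address is a placeholder. This only masks the symptom on the uvt-kvm side; it does not address whatever changed in cloud-init.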