Comment 10 for bug 1919177

Guilherme G. Piccoli (gpiccoli) wrote :

Thanks Paride. I still think we should understand precisely that difference in the logs, since in the BAD case we always see the "azure.py" messages, not the other one. This could be related, or at least a clue to the root cause.

Regarding the kernel side, I've built a 5.11 kernel with a debug patch [0]. I'm attaching the patch here; it's very simple, just a parameter-controlled delay in the carrier notification. Unfortunately, gjolly tried it in a custom image and the issue didn't reproduce. My theory is that just delaying the notification is not enough: due to the complex SR-IOV multi-interface nature of Hyper-V, there may be network connectivity even before the carrier is fully set UP. So the debug patch could perhaps be extended to block packet transmission in mlx5 for N seconds.

I have a feeling that Groovy should reproduce this, as discussed with gjolly: in our first reproducer we had a Hirsute image with the Groovy 5.8 kernel, and the cloud-init versions in Groovy and Hirsute are very similar. So, if it reproduces in Groovy, it definitely shouldn't be a release blocker.

Thanks!

[0] https://launchpad.net/~gpiccoli/+archive/ubuntu/test1919177/