cloud-net fails when DHCP slow

Bug #1990585 reported by Greg Lee
This bug affects 1 person
Affects              Status      Importance  Assigned to  Milestone
cloud-init (Ubuntu)  Incomplete  Undecided   Unassigned
subiquity (Ubuntu)   Invalid     Undecided   Unassigned

Bug Description

During a cloud-init autoinstall of Ubuntu Server 20.04 (Focal), attempts to retrieve the meta-data file fail immediately, with no timeout, as shown in cloud-init.log. According to the system logs, the network interface does not receive an IP address until after the attempt to retrieve the meta-data file. The DHCP server in the facility is known to be a little slow.

The file can be retrieved by switching to a second virtual terminal and downloading it with wget after cloud-init has failed. It appears that the url_helper.py function that reports the failure to retrieve the files does not wait for any timeout period when the network is not yet up.

Possible solution: before downloading the meta-data and user-data files, check that the network is available, with a timeout of a few seconds.
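
A minimal Python sketch of the kind of check proposed above. This is not cloud-init code: `can_connect` and `wait_until` are hypothetical helper names, and the port, timeout, and interval values are illustrative assumptions. The idea is to poll for connectivity up to a deadline instead of giving up on the first failure.

```python
import socket
import time


def can_connect(host, port=443, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, "network is unreachable", refusals, and timeouts.
        return False


def wait_until(check, timeout=30.0, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call check() repeatedly until it returns True or timeout seconds elapse."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

For the seed host used in this report, a caller might run `wait_until(lambda: can_connect("raw.githubusercontent.com"))` before requesting meta-data. The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting.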

Revision history for this message
Dan Bungert (dbungert) wrote :

This looks like cloud-init handling to me; cloud-init folks, please send it right back if you believe otherwise.

Changed in subiquity (Ubuntu):
status: New → Invalid
Revision history for this message
Chad Smith (chad.smith) wrote :

Thanks for filing this bug, Greg, and making cloud-init and subiquity better. It looks reminiscent of LP: #1990149, in which the nocloud-net datasource neither waits for the configured nocloud datasource to be visible nor retries if it is not yet there.

Greg, can you confirm that the autoinstall path you are providing to your machine is nocloud-based?

It would be helpful to either attach the tar gz from running `cloud-init collect-logs` or minimally provide /var/log/cloud-init.log from the system on which things failed just so we can confirm it's the same failure mode as the other bug linked.

I'm setting the bug status to 'Incomplete' and agree it's probably a cloud-init problem, not subiquity, but please set this bug status back to 'New' for cloud-init when you'd like us to take another look at this issue.

Changed in cloud-init (Ubuntu):
status: New → Incomplete
Revision history for this message
Greg Lee (mrleegs) wrote : Re: [Bug 1990585] Re: cloud-net fails when DHCP slow

Hi Chad,

I reviewed the bug report that you cited. It appears to be the same or
related.

The bug I encountered began with appending the following command to the
kernel in Grub:

autoinstall ds=nocloud-net\;s=https://raw.githubusercontent.com/cwru-robotics/cwru_robotics_autoinstall_scripts/focal_install/ros-noetic-desktop-full/

So I believe it is nocloud-based. (I am not particularly well versed in
which package does what at that level of granularity.)
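
For readers unfamiliar with the parameter, the Python sketch below illustrates how a `ds=nocloud-net;s=<url>` kernel argument relates to the files that are requested. It is a simplified illustration, not cloud-init's actual parser; `parse_nocloud_seed` and `seed_file_urls` are hypothetical names. It assumes the meta-data and user-data file names are appended to the seed URL, which matches the files named in this report.

```python
def parse_nocloud_seed(cmdline):
    """Extract the seed URL from a 'ds=nocloud-net;s=<url>' argument, or None."""
    for token in cmdline.split():
        if not token.startswith("ds=nocloud"):
            continue
        # GRUB needs the ';' escaped as '\;', so tolerate a leftover backslash.
        for field in token.replace("\\;", ";").split(";")[1:]:
            key, _, value = field.partition("=")
            if key == "s":
                return value
    return None


def seed_file_urls(seed_url):
    """Build the per-file URLs by appending the file names to the seed URL."""
    # Assumption of this sketch: add the trailing slash if it is missing.
    base = seed_url if seed_url.endswith("/") else seed_url + "/"
    return {name: base + name for name in ("meta-data", "user-data")}
```

With a seed URL ending in `/`, the sketch yields `<seed>/meta-data` and `<seed>/user-data`, the two downloads discussed in this thread.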

The cloud-init.log showed that the attempt to download the meta-data file
failed ten times in rapid succession (the timestamps did not change between
attempts). The error messages for the failed meta-data retrieval in
cloud-init.log came from url_helper.py. The syslog.log file showed that
the network interface received its IP address at a timestamp after the
download failures. I was able to retrieve the meta-data file manually
with wget, which verified that the files were accessible from the computer.
The script works fine on a virtual machine that gets its IP address
immediately.
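
To illustrate the difference between ten back-to-back attempts and attempts that are spaced out, here is a hedged Python sketch of a fetch loop that pauses between failures. It is not the url_helper.py implementation; `fetch_with_retries` is a hypothetical name and the retry count, delay, and timeout are assumed values.

```python
import time
import urllib.request


def fetch_with_retries(url, retries=10, delay=2.0, timeout=5.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch url, pausing delay seconds between failed attempts (retries >= 1)."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError as exc:
            # urllib.error.URLError subclasses OSError, so this also covers
            # "network is unreachable" while DHCP is still in progress.
            last_error = exc
            if attempt < retries:
                sleep(delay)
    raise last_error
```

With these assumed defaults, ten attempts span at least 18 seconds of waiting rather than completing within a single log timestamp, which would give a slow DHCP lease time to arrive.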

It will likely take me a while to get the logs since the hardware is with
students right now. (This is part of a senior project.) I hope this is
adequate, if imperfect, information to proceed fixing the bug.

Thanks,

Greg


Revision history for this message
James Falcon (falcojr) wrote :

Thanks for the additional info, Greg. Based on what you've said, yes, it sounds like this is a duplicate of https://bugs.launchpad.net/cloud-init/+bug/1990149 . Since we don't have logs, though, I'm going to leave this incomplete. #1990149 is on the docket to be fixed soon, so hopefully at that point this will resolve itself without the need for additional logs.

Revision history for this message
Greg Lee (mrleegs) wrote :
  • logs.tgz (135.2 KiB, application/x-compressed-tar; name="logs.tgz")

Hi James,

I was able to get the logs from an attempt today. Here they are.

Greg

Revision history for this message
James Falcon (falcojr) wrote :

Thanks Greg. Based on what I see in the logs, I'm going to mark this bug as a duplicate.
