19.04 minimal images on GCE intermittently fail to set up networking

Bug #1822353 reported by Philip Roche
This bug affects 1 person
Affects: cloud-init
Status: Expired
Importance: High
Assigned to: Unassigned
Milestone: none listed

Bug Description

Related to https://bugs.launchpad.net/cloud-init/+bug/1766287: I'd like to open a new bug because Disco minimal images are failing to set up networking for reasons similar to lp:1766287, the only difference being that no NIC was found.

A workaround was found by adding the following cloud-init service config:

/etc/systemd/system/cloud-init-local.service.d/gcp.conf
```
[Unit]
After=systemd-udev-trigger.service

[Service]
ExecStartPre=/bin/udevadm settle
```

The goal of this workaround is to:

1) Ensure that cloud-init-local.service runs after
   systemd-udev-trigger.service starts (this is what triggers
   udev coldplug events, like plugging in the NIC).
2) Run udevadm settle before cloud-init-local starts, so that any
   NIC processing is completed before cloud-init starts looking for
   a NIC.
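
For anyone wanting to try the workaround, the following is a minimal sketch of how the drop-in above could be written to disk. The function name `install_gcp_dropin` and its root-directory argument are hypothetical, introduced only so the file can be staged somewhere other than the live system; the file contents are exactly those shown above.

```shell
#!/bin/sh
# Sketch only: write the gcp.conf drop-in from this report under a given root.
# install_gcp_dropin and its ROOT argument are illustrative, not part of cloud-init.
install_gcp_dropin() {
    root="${1:-}"
    dir="$root/etc/systemd/system/cloud-init-local.service.d"
    mkdir -p "$dir" || return 1
    # Quoted heredoc delimiter: contents are written verbatim, no expansion.
    cat > "$dir/gcp.conf" <<'EOF'
[Unit]
After=systemd-udev-trigger.service

[Service]
ExecStartPre=/bin/udevadm settle
EOF
}

# On a live system (as root) one would then run something like:
#   install_gcp_dropin ""
#   systemctl daemon-reload
#   systemctl cat cloud-init-local.service   # shows the unit plus its drop-ins
```

Since the ordering only matters during boot, the effect would presumably be visible on the next reboot rather than immediately.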

Currently this is only required on minimal images, but there is a
chance it could occur in base images too should they boot quickly
enough. Unlike base images, minimal Disco does not have snap preseeding;
because snap preseeding runs before cloud-init, the failure is extremely
unlikely to happen in base images.

I understand that cloud-init might not be the place to fix the issue for all images, but I'd like to re-open this bug to start that discussion.

I have attached cloud-init logs, netplan YAML, the image manifest, and sosreports from an instance that failed to set up networking.

Revision history for this message
Philip Roche (philroche) wrote :
Ryan Harper (raharper)
Changed in cloud-init:
importance: Undecided → High
status: New → Confirmed
Philip Roche (philroche) wrote :

To help with debugging, I have attached a console of an instance that failed to set up networking.

Philip Roche (philroche) wrote :

And for completeness, I have attached a console of an instance that _successfully_ set up networking.

tags: added: id-5c7fbddcf1364b63892a22c9
tags: added: id-5c9d168ed6c5704cac05e595
tags: added: id-5d0a33dc7c02f24574ae04aa
Johann Queuniet (jqueuniet) wrote :

I'm also running into this kind of issue with Debian servers on 20.2-2~deb10u1. I'm not trying to rename the interfaces, and the final names are the kernel ones (ethX) instead of the expected predictable names. This is an issue for our inventory process.

+---------+------+------------------------------+---------------+--------+-------------------+
| Device  | Up   | Address                      | Mask          | Scope  | Hw-Address        |
+---------+------+------------------------------+---------------+--------+-------------------+
| bond0   | True | fe80::e8b9:79ff:fe23:7b23/64 | .             | link   | ea:b9:79:23:7b:23 |
| eth0    | True | .                            | .             | .      | ea:b9:79:23:7b:23 |
| eth1    | True | .                            | .             | .      | ea:b9:79:23:7b:23 |
| lo      | True | 127.0.0.1                    | 255.0.0.0     | host   | .                 |
| lo      | True | ::1/128                      | .             | host   | .                 |
| team100 | True | XXX.XX.XX.XX                 | 255.255.252.0 | global | ea:b9:79:23:7b:23 |
| team100 | True | fe80::e8b9:79ff:fe23:7b23/64 | .             | link   | ea:b9:79:23:7b:23 |
+---------+------+------------------------------+---------------+--------+-------------------+

Overriding the cloud-init-local.service unit with the workaround mentioned above solves the issue.

Dan Watkins (oddbloke) wrote :

Hi Johann,

Thanks for the comment! As you're seeing network interfaces with unexpected names (rather than _no_ network interfaces), your issue sounds more like bug 1766287 than this one. As that other bug is already Fix Released, could you file a new bug with the tarball that `cloud-init collect-logs` produces on an affected system, and we can follow up there?

Thanks!

Dan

Changed in cloud-init:
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for cloud-init because there has been no activity for 60 days.]

Changed in cloud-init:
status: Incomplete → Expired
James Falcon (falcojr) wrote :