juju deploy --to kvm:X has interface name changes during deploy and juju-info is not being updated

Bug #1882564 reported by Jeff Hillman
This bug affects 2 people
Affects          Status         Importance   Assigned to       Milestone
Canonical Juju   Fix Released   High         Ian Booth
2.8              Fix Released   High         Joseph Phillips

Bug Description

juju 2.7.6 amd64 on bionic

New customer deploy for charmed kubernetes 1.18

In this environment, some machines are bare-metal and some are KVM as defined in the bundle to: statement.

When KVM machines are deployed via juju, they start out with ethX interfaces, but during startup syslog, dmesg and kern.log show these interfaces being renamed to enps0fX. Juju's info, however, isn't getting this update.

Specifically, running 'juju run --unit canal/9 network-get cni' I see that it has the correct address space, but the interface is listed as eth2, when it has in fact been renamed to enps0f4.

Because network-get returns eth2, the flannel service does not start: it can't find an eth2 interface to bind to (the interface in the canal config comes from the network-get info).
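
The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of the original report: it assumes the JSON form of the output (`network-get cni --format json`) and that each entry under `bind-addresses` carries the interface name under `interfacename` (or `interface-name` in newer Juju versions). The helper name `stale_interfaces` is hypothetical.

```python
import json


def stale_interfaces(network_get_json, present):
    """Return interface names reported by network-get that are not in `present`.

    network_get_json: JSON text, assumed to come from
                      `network-get <binding> --format json`.
    present:          names of interfaces that actually exist on the machine.
    """
    info = json.loads(network_get_json)
    stale = []
    for bind in info.get("bind-addresses") or []:
        # Key name is an assumption; both spellings are tried.
        name = bind.get("interface-name") or bind.get("interfacename")
        if name and name not in present:
            stale.append(name)
    return stale
```

On the unit itself, `present` could be built with `os.listdir("/sys/class/net")`. A non-empty result for the cni binding would correspond to the situation in this report, where eth2 is returned but only the renamed interface exists.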

Here's the bundle, with customer network info redacted:

https://paste.ubuntu.com/p/439pxc73YG/

Here's some output showing the information I mentioned above:

https://paste.ubuntu.com/p/23c5jkgVKy/

Tags: cpe-onsite
George Kraft (cynerva)
Changed in juju:
status: New → Confirmed
Jeff Hillman (jhillman) wrote :

Subscribing field-critical per managerial direction.

Pen Gale (pengale) wrote :

I was able to reproduce this with the following bundle, deployed on top of MAAS:

```
machines:
  "0":
    constraints: tags=micro

applications:
  ubuntu:
    charm: cs:ubuntu
    num_units: 1
    to:
      - kvm:0
```

After deploying, run the following, and note that eth0 has been renamed to enp0s2:

    juju ssh ubuntu/0 -- dmesg | grep renamed
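
To collect the renames rather than read them by eye, the grep output can be parsed. This is a small sketch that assumes the usual kernel message shape, `<new>: renamed from <old>`; the helper name `parse_renames` is hypothetical.

```python
import re

# Matches the tail of a kernel log line such as
# "... enp0s2: renamed from eth0".
RENAME_RE = re.compile(r"(\S+): renamed from (\S+)")


def parse_renames(dmesg_text):
    """Map each original interface name to the name it was renamed to."""
    renames = {}
    for line in dmesg_text.splitlines():
        match = RENAME_RE.search(line)
        if match:
            new_name, old_name = match.group(1), match.group(2)
            renames[old_name] = new_name
    return renames
```

Feeding it the output of the command above should yield a mapping from eth0 to enp0s2 for the reproduction described here.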

Pen Gale (pengale) wrote :

Also note that KVM pods spun up with MAAS don't exhibit the rename behavior. This appears to be specific to Juju.

Ian Booth (wallyworld)
Changed in juju:
assignee: nobody → Ian Booth (wallyworld)
importance: Undecided → High
status: Confirmed → In Progress
Ian Booth (wallyworld) wrote :

So my PR won't be sufficient, since it turns out we also use the same code path to record link layer devices from the instance poller. There's a more significant chunk of work required to record the origin of the devices so they can be properly handled.

The core fix - do not ignore container link layer device updates - I think is still ok. It's just a matter of how we deal with the orphaned link layer device records that result when a rename is done on the machine: the new record is created and the old one is left there as well.
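
To make the orphaned-record problem concrete, here is a purely illustrative sketch. It is not Juju's implementation or the approach taken in any of the PRs: it assumes, for the sake of the example, that a renamed device keeps its MAC address and that MAC addresses are unique per device, which does not hold for bridges or VLAN devices. The function name `reconcile` and the dict-based records are hypothetical.

```python
def reconcile(recorded, observed):
    """Merge observed devices into recorded ones and report orphans.

    recorded: dict of device name -> MAC address already stored.
    observed: dict of device name -> MAC address seen on the machine now.

    A recorded device counts as orphaned when its name is no longer
    observed but its MAC address shows up under a different name.
    Returns (merged, orphaned_names).
    """
    observed_macs = set(observed.values())
    orphaned = sorted(
        name
        for name, mac in recorded.items()
        if name not in observed and mac in observed_macs
    )
    # Keep records that were not superseded (for example, ones that came
    # from another source), then apply the fresh observations.
    merged = {name: mac for name, mac in recorded.items() if name not in orphaned}
    merged.update(observed)
    return merged, orphaned
```

With a stored `eth0` and an observed `enp0s2` sharing one MAC address, `eth0` is reported as orphaned, while stored devices with other MAC addresses are left in place.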

Joseph Phillips (manadart) wrote :

Ian, I was not correct in what I said on IRC this morning. It does not apply to the 2.7 branch. I sent an email to the team with some more detail.

At this time, I have proposed a patch that modifies the one above. It is here:
https://github.com/juju/juju/pull/11685

There is one further consideration, also mentioned in the email: a possible loss of fidelity if we update NICs with data sourced from *inside* the container.

Changed in juju:
milestone: none → 2.7.7
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released