Don't Break On Duplicate Mac Addresses

Bug #1996789 reported by Brett Holman
This bug affects 1 person
Affects: cloud-init
Status: Expired
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

Currently, when duplicate MAC addresses are detected, cloud-init dies.

Duplicate MACs are usually a corner case, but there are situations where they are valid[1].

Consider this example[2]. After bonding two interfaces, the interfaces were left with duplicate MAC addresses. cloud-init fails on this system as soon as these devices are detected.

If no network config is given, or if a config is given configuring a single address, we have the opportunity to do something intelligent to allow cloud-init to boot by using the "fallback interface" (in cloud-init this is the first interface), rather than throwing an exception and dying.

Netplan's MAC matching assumes a 1:1 mapping between MAC addresses and interfaces, so when multiple interfaces are configured with matches, we still can't do anything intelligent.
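A minimal sketch of the behaviour proposed above, not cloud-init's actual code. The function names `find_duplicate_macs` and `select_fallback`, the `(name, mac)` tuple input, and the `num_configured` parameter are all hypothetical and exist only for illustration. It assumes interfaces are listed in detection order, so the "fallback interface" is simply the first entry.

```python
from collections import defaultdict


def find_duplicate_macs(interfaces):
    """Map each MAC shared by more than one interface to the names using it.

    `interfaces` is a list of (name, mac) tuples in detection order
    (a hypothetical input shape, not a cloud-init data structure).
    """
    groups = defaultdict(list)
    for name, mac in interfaces:
        groups[mac.lower()].append(name)
    return {mac: names for mac, names in groups.items() if len(names) > 1}


def select_fallback(interfaces, num_configured=0):
    """Return the name of the interface to fall back to, or None.

    `num_configured` is the number of interfaces the user's network
    config tries to match (0 means no config was given).
    """
    if not find_duplicate_macs(interfaces):
        return None  # no duplicates, nothing to work around
    if num_configured <= 1:
        # No config, or a single address: use the first interface
        # instead of raising.
        return interfaces[0][0]
    # Several MAC matches plus duplicate MACs: still ambiguous.
    raise ValueError("duplicate MACs with multiple configured interfaces")
```

With this shape, a caller would only hit the exception in the multi-interface case that the Netplan 1:1 assumption makes unresolvable.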

[1] Until these have unique addresses, these interfaces will not be usable on the same broadcast domain, but they should still be able to work individually on different networks.
[2] https://stackoverflow.com/questions/74459180/deleted-bond-interface-left-me-with-duplicate-mac-on-two-interfaces

Tags: sts
Brett Holman (holmanb) wrote :
Changed in cloud-init:
importance: Undecided → Medium
Brett Holman (holmanb)
Changed in cloud-init:
status: New → Triaged
Trent Lloyd (lathiat) wrote :

I ran into this issue when doing SR-IOV Bonding on OpenStack. We can assign two VFs with the same MAC. An example of doing that is here:
https://www.redpill-linpro.com/techblog/2021/01/30/bonding-sriov-nics-with-openstack.html

You can instead use unique MACs with fail-over-mac-policy=active, but then metadata and DHCP break when using the slave interface. So it's ideal to have duplicate MACs as an option.

We keep running into this in various scenarios and already have multiple workarounds:
OVS bridge duplicates: https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1912844
Azure advanced networking: https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1844191
Oracle net_failover: https://github.com/canonical/cloud-init/commit/fa47d527a03a00319936323f0a857fbecafceaf7

In most cases the real use case for this is some kind of VF-to-virtio failover for live migration or bonding (as is the case for both Oracle net_failover and Azure). Sometimes it's because a bridge, bond or OVS duplicates/steals a MAC; we also have special-case code for handling that.

Currently when you hit this, cloud-init errors out and attempts no network configuration.

It would be ideal for cloud-init to make an attempt to configure the network with one of the interfaces - perhaps the one that already has the correct name or with some kind of priority that may have specifics for each driver type we already have exceptions for (ignore ovs/bridge/bond, prioritise the correct net_failover device, etc).
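One way the prioritisation idea could look, as a rough sketch only. The `Nic` dataclass, its `kind` and `driver` fields, the `VIRTUAL_KINDS` set, the `DRIVER_PRIORITY` table and `pick_candidate` are all hypothetical; none of them are cloud-init APIs, and the real per-driver rules would need to come from the existing special cases.

```python
from dataclasses import dataclass

# Hypothetical: device kinds that duplicate/steal a MAC and should be skipped.
VIRTUAL_KINDS = {"bridge", "bond", "ovs"}
# Hypothetical: lower number means preferred; unlisted drivers rank last.
DRIVER_PRIORITY = {"net_failover": 0}


@dataclass(frozen=True)
class Nic:
    name: str
    mac: str
    kind: str = "physical"
    driver: str = ""


def pick_candidate(nics, expected_name=None):
    """Pick one interface among several that share a MAC, or None."""
    candidates = [n for n in nics if n.kind not in VIRTUAL_KINDS]
    if not candidates:
        return None

    def rank(nic):
        # 1. an interface that already has the expected name wins,
        # 2. then the driver priority table,
        # 3. then the name, to keep the choice deterministic.
        return (
            nic.name != expected_name,
            DRIVER_PRIORITY.get(nic.driver, 99),
            nic.name,
        )

    return min(candidates, key=rank)
```

Keeping the rules in a single ranking tuple means a new driver exception would be one table entry rather than another special-case branch.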

tags: added: sts
James Falcon (falcojr) wrote :
Changed in cloud-init:
status: Triaged → Expired