CloudStack provider cannot determine correct metadata IP with multiple network interfaces
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
cloud-init | Fix Released | Undecided | Unassigned | | |
Bug Description
[Problem]
When multiple network interfaces are present in a CloudStack VM, cloud-init randomly chooses the gateway address to fetch the metadata from. This is not a problem when all network interfaces offer metadata. However, if a shared network interface is attached to the VM, the gateway on that interface doesn't have the metadata. cloud-init will time out waiting for a response from the gateway and will not apply metadata to the host.
[How to reproduce]
- Create VM with 1x Isolated and 1x Shared Network
- Ensure cloud-init is installed in the VM and CloudStack is configured as a metadata provider
- Boot VM
[Expected result]
- VM should boot and apply metadata from CloudStack
[Observed result]
- cloud-init sometimes chooses the wrong metadata server IP
- cloud-init delays startup waiting for response
- metadata isn't applied
- cloud-init service fails
[Notes]
I noticed that in "cloudinit/
Until recently we used cloud-init 0.7.5, but after the introduction of NetworkManager lease support we started running into this problem. (https:/
Up to this point cloud-init has always used the default_gateway method.
CentOS 7 has only recently updated cloud-init in its repos, so we were stuck on this old version for a long time.
Maybe it would be nice to have a configuration option to choose between the methods manually?
Also, it would be cool if, on a failure, cloud-init would try the next available DHCP lease.
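To illustrate the fallback idea, here is a minimal sketch and not cloud-init's actual code. The function name `first_reachable`, its parameters, and the plain TCP connect check are all assumptions made for illustration. It walks a list of candidate server addresses (for example, one per DHCP lease) and returns the first one that accepts a connection.

```python
import socket


def first_reachable(candidates, port=80, timeout=2.0,
                    connect=socket.create_connection):
    """Return the first candidate that accepts a TCP connection, else None.

    `candidates` is a hypothetical ordered list of addresses, e.g. the
    DHCP server address from each lease, most recent first.
    """
    for address in candidates:
        try:
            conn = connect((address, port), timeout=timeout)
        except OSError:
            # Refused, unreachable, or timed out: this gateway probably
            # does not serve metadata, so try the next lease.
            continue
        conn.close()
        return address
    return None
```

A successful TCP connect only shows that something is listening. A real check would probably also need to request a metadata path and validate the response.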
[Attachment]
We added some files for debugging as a tar.gz.
Related branches
- Ryan Harper: Approve
- Server Team CI bot: Approve (continuous-integration)
- Joshua Hügli (community): Approve
- Scott Moser: Needs Information
- Dan Watkins: Pending requested
Diff: 47 lines (+20/-2), 1 file modified: cloudinit/sources/DataSourceCloudStack.py (+20/-2)
Changed in cloud-init: status: New → In Progress
Changed in cloud-init: status: In Progress → Fix Committed
Changed in cloud-init: status: Fix Committed → Fix Released
Hi Joshua,
Thanks for using cloud-init, and for the detailed bug report!
> When multiple network interfaces are present in a CloudStack VM, cloud-init
> randomly chooses the gateway address to fetch the metadata from.
The "random"ness here isn't contained within cloud-init per se. It will
deterministically select the most recent DHCP lease and use that to access
metadata. The problem is that the order in which the DHCP leases complete is
not stable, so cloud-init's assumption here is mistaken.
> I noticed that in "cloudinit/sources/DataSourceCloudStack.py"
> get_vr_address() the dhcp lease option is preferred over the default gateway.
> Wouldn't it be smarter to just always use "get_default_gateway()"?
Using the DHCP leases is the documented way of finding the metadata server in
the CloudStack docs[0]. I don't know for sure, but I believe the
get_default_gateway() path is there for cases where DHCP has failed (or,
perhaps, for static network configuration?). My intuition is that the first
interface to successfully DHCP would get the default route; do you send some
DHCP route configuration to make the default gateway stable?
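
For readers unfamiliar with the default-gateway approach, here is a rough sketch of how a default gateway can be read from the Linux routing table. It is not a copy of cloud-init's implementation. The function name `default_gateway` is hypothetical, and it takes the contents of `/proc/net/route` as a string so it can be tried without a live system.

```python
import socket
import struct

RTF_GATEWAY = 0x0002  # route flag: the entry points at a gateway


def default_gateway(route_table_text):
    """Return the first default-route gateway as a dotted quad, or None."""
    for line in route_table_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        destination, gateway = fields[1], fields[2]
        flags = int(fields[3], 16)
        if destination == "00000000" and flags & RTF_GATEWAY:
            # The sample row below stores the address as little-endian hex.
            packed = struct.pack("<L", int(gateway, 16))
            return socket.inet_ntoa(packed)
    return None


sample = (
    "Iface\tDestination\tGateway\tFlags\tRefCnt\tUse\tMetric\tMask\tMTU\tWindow\tIRTT\n"
    "eth1\t0000A8C0\t00000000\t0001\t0\t0\t0\t00FFFFFF\t0\t0\t0\n"
    "eth0\t00000000\t0101A8C0\t0003\t0\t0\t0\t00000000\t0\t0\t0\n"
)
print(default_gateway(sample))  # 192.168.1.1
```

With several interfaces there can be more than one default route. This sketch simply returns the first one listed, which is part of why the stability of the default gateway matters here.
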
> Maybe it would be nice to have a configuration option to choose between the
> methods manually?
This sounds feasible, as a datasource configuration option that you could bake
into your images/templates. I'd prefer for us to work out if there's a way we
can get this right without requiring configuration, though, because otherwise
new CloudStack operators (or image/template builders) have to discover this
option themselves when they're running in an environment that requires it. Do
you know if there's a way that we can tell which DHCP lease is the right one to
use?
(Marking this Incomplete for now, please move it back to New when you respond!)
Thanks!
Dan
[0] http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/4.8/virtual_machines/user-data.html