Booting on vSphere fails the cloud-init guest customization due to missing dbus.service dependency

Bug #1921116 reported by Florian Bergmann
This bug affects 1 person
Affects: open-vm-tools (Ubuntu)
Status: Incomplete
Importance: Undecided
Assigned to: Unassigned

Bug Description

When booting a virtual machine on vSphere, an error reports that guest customizations cannot be run.

I was using the Ubuntu 20.04.2 LTS release, deployed to a template from the OVA file.

I hit this problem when attempting to boot a machine using Ansible, and it matches this VMware bug: https://kb.vmware.com/s/article/59687

To summarize: the vmware tools attempt to use dbus before it's available and guest customization fails, which leads to the network adapters of the VM not being attached.

It seems all that is needed to make this work is to add the missing:

After=dbus.service

dependency to the systemd unit open-vm-tools.service.

Once this is added the guest customizations can be used and after booting the network adapters will be attached successfully.
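
As a rough sketch of applying this locally without editing the packaged unit file, the dependency could be placed in a systemd drop-in. The Python helper below only writes that drop-in; the file name after-dbus.conf and the function name write_dropin are hypothetical choices for illustration, and the unit name is assumed to be open-vm-tools.service as above.

```python
from pathlib import Path

# Assumption: the unit is named open-vm-tools.service, as in this report.
UNIT_NAME = "open-vm-tools.service"
# Hypothetical drop-in file name; systemd reads any *.conf file in the .d directory.
DROPIN_NAME = "after-dbus.conf"
# The ordering dependency belongs in the [Unit] section.
DROPIN_TEXT = "[Unit]\nAfter=dbus.service\n"


def write_dropin(systemd_dir="/etc/systemd/system"):
    """Write <systemd_dir>/open-vm-tools.service.d/after-dbus.conf and return its path."""
    dropin_dir = Path(systemd_dir) / (UNIT_NAME + ".d")
    dropin_dir.mkdir(parents=True, exist_ok=True)
    dropin_path = dropin_dir / DROPIN_NAME
    dropin_path.write_text(DROPIN_TEXT)
    return dropin_path
```

Calling write_dropin() with the default directory needs root, and systemd only picks up the change after `systemctl daemon-reload` (or a reboot). For a template workflow this would have to be done inside the image before it is converted to a template.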

Paride Legovini (paride) wrote :

Hello Florian and thanks for this bug report. The vmware page you linked says that:

  This issue is resolved in VMware vSphere 6.7 Update 3,
  available at VMware downloads.

which makes me think that:

- This is actually a vSphere bug
- The bug is fixed in the newer versions of vSphere
- The proposed change to open-vm-tools.service is a workaround.

Could you please confirm the above, especially the fact that this is now fixed?

Workarounds for high-impact issues can be included in Ubuntu if necessary; however, I doubt the impact of this problem is high enough to justify an SRU to Focal (see the process details at [1]), especially given that a fix already exists. This makes me doubt that the proposed change is worth including at all, especially given that open-vm-tools is almost a sync from Debian.

However, I lack familiarity with the VMware virtualization technology and I may be missing the scope of this problem. Is there a reason why this is better fixed with an Ubuntu upgrade rather than with a vSphere update? Thanks!

[1] https://wiki.ubuntu.com/StableReleaseUpdates

Changed in open-vm-tools (Ubuntu):
status: New → Incomplete
Christian Ehrhardt (paelzer) wrote :

Hi,
Paride is right, but didn't know about the background.
There are multiple discussions upstream like:
https://github.com/vmware/open-vm-tools/issues/240
https://github.com/vmware/open-vm-tools/issues/350
https://github.com/vmware/open-vm-tools/issues/356

And the reason it isn't fixed there is that the "new way", which is the one supported and actively developed, isn't affected at all anyway.

That might seem odd, but we've considered adding After=dbus.service in the Ubuntu packages when it came up the first time.
https://bugs.launchpad.net/ubuntu/+source/open-vm-tools/+bug/1793715
But it turned out that this will break more cases/setups than it will fix.
Therefore eventually even VMware asked to NOT set After=dbus.service by default and only published it as a KB article for those affected.

This is a fairly complex case and I tried to summarize a bit here:
https://bugs.launchpad.net/ubuntu/+source/open-vm-tools/+bug/1793715/comments/18

Essentially the one way out without breaking other setups is what I outlined therein as (B), which is for upstream VMware to split the customization code/service into two, so that one can run early and one can run later. But I guess since their new style of customization code doesn't have the issue they never started on that, and with every passing month the benefit of doing so gets smaller.

Nevertheless, AFAIU the case, if we want to get this resolved we need to ring bells and whistles upstream.

... This is one of the cases that makes me sad, I want to help but I can't :-/ ...

Marking as a duplicate of bug 1793715

Florian Bergmann (bergmann-f-h) wrote :

Hi,

Thanks for the background information - that's really interesting.

Yes, I also saw that it is fixed in 6.7U3; however, maybe that means they have a regression, because we are seeing this issue in vSphere 7.

The versions are: VMware ESXi, 7.0.1, 17551050
vSphere Client version 7.0.1.00200

I am not sure - maybe the workflow I am using is also at fault?

Right now I am using the OVA to deploy an initial machine that I never power on but immediately turn into a template for further use (cloning a template is just a lot faster for spawning new VMs and we use the template in our Ansible playbooks).

What I can see concerning the workflows:

- vSphere is generating the netplan configuration with the configuration set in ansible, so the data propagates correctly and the file is created.
- I tried to strip everything marked as used for guest customization (I only set the network name), but it still fails unfortunately.

If there is any more information I can provide just tell me, I'll happily try to get it.

I can also say it *used* to work with vSphere 7, but I don't know the vSphere or image versions we were using at the time (and we can no longer find out). I assume that it maybe didn't use the cloud-init customization path then, but still the Perl one, as I understand the linked bugs.
