cloud-init on Ubuntu 18.04 image does not run in VIO

Bug #1855458 reported by Riccardo Murri
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| cloud-init (Ubuntu) | Incomplete | Undecided | Unassigned | |
| open-vm-tools (Ubuntu) | New | Undecided | Unassigned | |

Bug Description

When running the official Ubuntu 18.04 cloud image in VIO (VMware Integrated OpenStack, see [1]), data source `Ec2` is not detected as working so `cloud-init` does not run.

As a result, it is impossible to log in or connect to the running image via SSH. In other words, VMs started from the official Ubuntu 18.04 images are unusable on VIO.

Note that this issue was already identified as a problem with `ds-identify` (part of the `cloud-init` package) in comment #16 to bug 1760776 (see [2]):

> VMWARE does not always use DatasourceOVF. When using OpenStack VIO
> (Openstack on VMWARE) the ec2 datasource fallback is no longer being
> used as of this more recent update. OpenStack VIO exposes user-data
> on 169.254.169.254 but now we get a platform unknown

However, the reply in comment #17 (see [3]) said that a new bug should be opened, which is what I'm doing now.

This issue applies to the released images (e.g., https://cloud-images.ubuntu.com/releases/bionic/release/ubuntu-18.04-server-cloudimg-amd64.vmdk) but also to the "daily build" images. I have also tried using different formats (VMDK, OVA, QCOW2 converted to VMDK with `qemu-img convert`) but all of them show the same issue. It does *not* apply to previous versions of Ubuntu (e.g., official cloud images of Ubuntu 16.04 boot fine).

[1]: https://www.vmware.com/products/openstack.html
[2]: https://bugs.launchpad.net/ubuntu/+source/open-vm-tools/+bug/1760776/comments/16
[3]: https://bugs.launchpad.net/ubuntu/+source/open-vm-tools/+bug/1760776/comments/17

Revision history for this message
Riccardo Murri (rmurri) wrote :

Here is the additional information that was requested in https://bugs.launchpad.net/ubuntu/+source/open-vm-tools/+bug/1760776/comments/17

I'm attaching the collected cloud-init logs; the output of the requested commands follows.

The Linux kernel correctly determines the hypervisor is VMware (excerpt from `/var/log/syslog`):

   Dec 6 15:28:02 ubuntu kernel: [ 0.000000] Linux version 4.15.0-72-generic (buildd@lcy01-amd64-026) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 (Ubuntu 4.15.0-72.81-generic 4.15.18)
   Dec 6 15:28:02 ubuntu kernel: [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.15.0-72-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0
   ...
   Dec 6 15:28:02 ubuntu kernel: [ 0.000000] DMI: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
   Dec 6 15:28:02 ubuntu kernel: [ 0.000000] Hypervisor detected: VMware
   Dec 6 15:28:02 ubuntu kernel: [ 0.000000] vmware: TSC freq read from hypervisor : 2399.999 MHz
   Dec 6 15:28:02 ubuntu kernel: [ 0.000000] vmware: Host bus clock speed read from hypervisor : 66000000 Hz
   Dec 6 15:28:02 ubuntu kernel: [ 0.000000] vmware: using sched offset of 13058703528 ns

Still, `/sys/hypervisor` is empty:

    root@ubuntu:~# sudo sh -c 'cd /sys/hypervisor && grep -r . *'
    grep: *: No such file or directory

    root@ubuntu:~# ls /sys/hypervisor/ -lA
    total 0

However, DMI info is there:

    root@ubuntu:~# sudo sh -c 'cd /sys/class/dmi/id && grep -r . *'
    bios_date:12/12/2018
    bios_vendor:Phoenix Technologies LTD
    bios_version:6.00
    board_name:440BX Desktop Reference Platform
    board_serial:None
    board_vendor:Intel Corporation
    board_version:None
    chassis_asset_tag:No Asset Tag
    chassis_serial:None
    chassis_type:1
    chassis_vendor:No Enclosure
    chassis_version:N/A
    modalias:dmi:bvnPhoenixTechnologiesLTD:bvr6.00:bd12/12/2018:svnVMware,Inc.:pnVMwareVirtualPlatform:pvrNone:rvnIntelCorporation:rn440BXDesktopReferencePlatform:rvrNone:cvnNoEnclosure:ct1:cvrN/A:
    power/runtime_active_time:0
    power/runtime_active_kids:0
    power/runtime_usage:0
    power/runtime_status:unsupported
    grep: power/autosuspend_delay_ms: Input/output error
    power/async:disabled
    power/runtime_suspended_time:0
    power/runtime_enabled:disabled
    power/control:auto
    product_name:VMware Virtual Platform
    product_serial:VMware-42 17 b1 e7 f5 b4 cf 93-5f 00 68 ae ca 46 eb 65
    product_uuid:E7B11742-B4F5-93CF-5F00-68AECA46EB65
    product_version:None
    sys_vendor:VMware, Inc.
    uevent:MODALIAS=dmi:bvnPhoenixTechnologiesLTD:bvr6.00:bd12/12/2018:svnVMware,Inc.:pnVMwareVirtualPlatform:pvrNone:rvnIntelCorporation:rn440BXDesktopReferencePlatform:rvrNone:cvnNoEnclosure:ct1:cvrN/A:

Revision history for this message
Riccardo Murri (rmurri) wrote :

From the `ds-identify.log` file in the uploaded archive, one can see that `ds-identify` rejects the `Ec2` metadata source and disables `cloud-init`:

    Running on vmware but rpctool query returned 1: No value found
    is_ds_enabled(IBMCloud) = true.
    ec2 platform is 'Unknown'.
    No ds found [mode=search, notfound=disabled]. Disabled cloud-init [1]

Revision history for this message
Ryan Harper (raharper) wrote :

Thanks for filing the bug. It's unfortunate that VIO does not identify that it's running images in OpenStack.

Looking at the output provided, all we can tell is that we're on VMware; there is no indication that it's OpenStack, so cloud-init can't reasonably be expected to know that it should enable the OpenStack datasource.

I'm not familiar with the VIO product, but any of the DMI fields that it is able to set could be an option for carrying an identifier such as OpenStackCompute.

> When using OpenStack VIO (Openstack on VMWARE) the ec2 datasource fallback
> is no longer being used as of this more recent update. OpenStack VIO exposes
> user-data on 169.254.169.254 but now we get a platform unknown

In the above dump of DMI data, is "VMware Virtual Platform" specific to VIO? And does VIO always expose a metadata service?

If so, we could detect/enable OpenStack datasource if we know we're on VIO specifically.

Changed in cloud-init (Ubuntu):
status: New → Incomplete
Revision history for this message
Riccardo Murri (rmurri) wrote :

> Looking at the output provided, all we can tell is that we're on
> VMWare, but no indication that it's OpenStack, so cloud-init can't
> reasonably expect to know that it should enable the OpenStack
> datasource.

Let me mention again that this bug was introduced in Ubuntu 18.04; previous versions do not suffer from the same problem.

Also, if I modify the official cloud image and disable any datasource
*except* Ec2 (i.e., write `datasource_list: [ Ec2 ]` into
`/etc/cloud/cloud.cfg.d/90_dpkg.cfg`), then everything works fine.
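
For anyone who wants to reproduce the workaround, here is a minimal sketch. It assumes the image's root filesystem is already mounted somewhere (how to mount it is left out), and `write_ec2_only_config` is just a name made up for this example:

```shell
# Sketch only: restrict cloud-init to the Ec2 datasource inside a mounted image.
# write_ec2_only_config is a hypothetical helper name; $1 is the directory
# where the image's root filesystem is mounted.
write_ec2_only_config() {
    image_root="$1"
    cfg_dir="$image_root/etc/cloud/cloud.cfg.d"
    mkdir -p "$cfg_dir"
    # Overwrites the packaged datasource list with a single entry.
    printf 'datasource_list: [ Ec2 ]\n' > "$cfg_dir/90_dpkg.cfg"
}
```

With the image mounted at, say, `/mnt/image`, this would be called as `write_ec2_only_config /mnt/image`.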

So the issue is really `ds-identify` being too picky about when the
`Ec2` datasource is valid; from a cursory look, I can see that it just
checks if `/sys/hypervisor` contains some well-known AWS tag -- but
there are many cloud services which provide EC2-compatible metadata
(OpenNebula being another example), which would be skipped by this
logic!

> In the above dump of DMI data, is "VMware Virtual Platform" specific
> to VIO? And does VIO always expose a metadata service?

This I do not know; I'm just a user, not a developer of the VIO
platform. But I don't see why probing for metadata sources should be
done only by looking at DMI information. EC2 metadata is handed out
by 169.254.169.254; if well-known locations for metadata are
available, then the EC2 datasource should be considered valid. (Other,
more specific, datasources can be probed earlier so they can be used
if found.)
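
To make the idea concrete, here is a rough sketch of such a network probe. It is not what `ds-identify` does; it assumes `curl` is installed, and `METADATA_URL`, `FETCH`, and `ec2_metadata_available` are names invented for this example:

```shell
# Sketch only: probe the link-local EC2-style metadata endpoint with a short
# timeout. METADATA_URL and FETCH are illustrative variable names.
METADATA_URL="${METADATA_URL:-http://169.254.169.254/latest/meta-data/}"
FETCH="${FETCH:-curl}"

ec2_metadata_available() {
    # -f: fail on HTTP errors; -s: silent; --max-time: do not hang when
    # no metadata service is reachable. The listing is expected to contain
    # an "instance-id" entry on its own line.
    "$FETCH" -fs --max-time 2 "$METADATA_URL" | grep -q '^instance-id$'
}
```

A caller could then run `ec2_metadata_available && echo "EC2-style metadata found"`. The short timeout matters, since any probe like this costs boot time on machines without a metadata service.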

This is a dump of the EC2 metadata that I can see from an instance in
VIO (the same VM from which the above files were extracted):

    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/
    ami-id
    ami-launch-index
    ami-manifest-path
    block-device-mapping/
    hostname
    instance-action
    instance-id
    instance-type
    local-hostname
    local-ipv4
    placement/
    public-hostname
    public-ipv4
    public-keys/
    reservation-id
    security-groups
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/ami-id
    ami-0000012a
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/ami-launch-index
    0
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/ami-manifest-path
    FIXME
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/hostname
    test.novalocal
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/instance-action
    none
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/instance-id
    i-00000793
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/instance-type
    m1.small
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/local-hostname
    test.novalocal
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/local-ipv4
    172.31.33.70
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/public-hostname
    test.novalocal
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/public-ipv4
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/reservation-id
    r-0asb5l2b
    root@ubuntu:~# w3m -dump http://169.254.169.254/latest/meta-data/security-groups
    default

Thanks,
Riccardo

Revision history for this message
Scott Moser (smoser) wrote :

For what it's worth, see bug 1669875.
Specifically see comment 7 https://bugs.launchpad.net/cloud-init/+bug/1669875/comments/7 .

Revision history for this message
Riccardo Murri (rmurri) wrote :

@Scott, thanks a lot, that is very useful! Do I understand correctly that `smbios.assetTag` is a property of the *flavor*? I cannot seem to find any image property with the same effect.

Revision history for this message
Riccardo Murri (rmurri) wrote :

Another workaround for people experiencing the same issue is to force the use of a config drive in OpenStack. This can be done without involving VIO admins in two ways:

1. Set property `img_config_drive` on the image to `mandatory`.
2. Pass option `--config-drive true` to `openstack server create`.
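
The two options can be sketched as shell helpers like the ones below. The function names and the `OPENSTACK` override are made up for this example, and the image property is spelled `img_config_drive` (single underscore), which is how I understand Nova names it:

```shell
# Sketch only. OPENSTACK lets the client binary be overridden; the function
# names are illustrative, not part of any tool.
OPENSTACK="${OPENSTACK:-openstack}"

# Option 1: mark the image so that servers booted from it get a config drive.
# $1 = image name or ID
require_config_drive_on_image() {
    "$OPENSTACK" image set --property img_config_drive=mandatory "$1"
}

# Option 2: request a config drive for a single server at creation time.
# $1 = image, $2 = flavor, $3 = server name
create_server_with_config_drive() {
    "$OPENSTACK" server create --config-drive true \
        --image "$1" --flavor "$2" "$3"
}
```

Depending on the deployment, `openstack server create` may need further options (network, key pair) that are not shown here.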

Revision history for this message
Ryan Harper (raharper) wrote :

> > Looking at the output provided, all we can tell is that we're on
> > VMWare, but no indication that it's OpenStack, so cloud-init can't
> > reasonably expect to know that it should enable the OpenStack
> > datasource.
>
> Let me mention again that this bug was introduced in Ubuntu 18.04; previous versions do not suffer from the same problem.

It's not a bug, but I understand from your perspective that it appears to be one.
We introduced ds-identify to prevent cloud-images from attempting to connect
out to URLs when none are available. Cloud-init was slower since it needed
to attempt to see if any network-based datasources were present and this
also prevents cloud-init from responding to network sources when it should not.
Cloud-init would also run even if no datasources or user-config were provided
at all, which meant that running a cloud-image outside of a cloud would hang for
a very long time waiting for cloud-init to time out.

>
> Also, if I modify the official cloud image and disable any datasource
> *except* Ec2 (i.e., write `datasource_list: [ Ec2 ]` into
> `/etc/cloud/cloud.cfg.d/90_dpkg.cfg`), then everything works fine.
>
> So the issue is really `ds-identify` being too picky about when the
> `Ec2` datasource is valid; from a cursory look, I can see that it just
> checks if `/sys/hypervisor` contains some well-known AWS tag -- but
> there are many cloud services which provide EC2-compatible metadata
> (OpenNebula being another example), which would be skipped by this
> logic!

The ds-identify checks use local information on the instance to check
if it is running on a cloud platform. For OpenStack, the method used
is to set DMI values to indicate the VM is running on OpenStack.
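
As a simplified illustration of that kind of local check (this is not the actual `ds-identify` code, which is more involved), a DMI test could look roughly like the sketch below. The marker strings `OpenStack Nova` and `OpenStack Compute`, the choice of fields, and the names `DMI_DIR` and `dmi_says_openstack` are assumptions made for the example:

```shell
# Sketch only, not the real ds-identify logic.
# DMI_DIR is overridable so the function can be tried against a test directory.
DMI_DIR="${DMI_DIR:-/sys/class/dmi/id}"

dmi_says_openstack() {
    # Look at two DMI fields a platform could use to announce OpenStack.
    for field in product_name chassis_asset_tag; do
        [ -r "$DMI_DIR/$field" ] || continue
        value=$(cat "$DMI_DIR/$field")
        case "$value" in
            # Assumed marker strings, for illustration only.
            "OpenStack Nova"|"OpenStack Compute") return 0 ;;
        esac
    done
    return 1
}
```

Against the DMI dump earlier in this bug (`product_name:VMware Virtual Platform`, `chassis_asset_tag:No Asset Tag`), a check of this shape finds nothing, which is consistent with the datasource not being detected.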
