The OpenStack network_config.json implementation fails on Hyper-V compute nodes

Bug #1642679 reported by Adrian Vladu on 2016-11-17
This bug affects 11 people
Affects                    Importance  Assigned to
OpenStack Compute (nova)   Medium      Scott Moser
  Ocata                    Medium      Vladyslav Drok
cloud-init                 Medium      Unassigned
cloud-init (Ubuntu)        Medium      Unassigned
  Xenial                   Medium      Scott Moser
  Yakkety                  Medium      Unassigned

Bug Description

=== Begin SRU Template ===
[Impact]
When a config drive provides network_data.json on an OpenStack cloud with Hyper-V compute nodes,
cloud-init will fail to configure networking.

Console log and /var/log/cloud-init.log will show:
 ValueError: Unknown network_data link type: hyperv

This would also occur when the type of the network device as declared
to cloud-init was 'hw_veb', 'hyperv', 'vhostuser' or 'vrouter'.

[Test Case]
Launch an instance with a config drive on a Hyper-V cloud.

[Regression Potential]
Low to none. cloud-init is relaxing requirements and will accept things
now that it previously complained were invalid.
=== End SRU Template ===

We have discovered an issue when booting Xenial instances on OpenStack environments (Liberty or newer) and Hyper-V compute nodes using config drive as metadata source.

When applying the network_config.json, cloud-init fails with this error:
http://paste.openstack.org/show/RvHZJqn48JBb0TO9QznL/

The fix would be to add 'hyperv' as a link type here:
/usr/lib/python3/dist-packages/cloudinit/sources/helpers/openstack.py, line 587
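
To check whether a given instance is exposed to this failure, it can help to list the link types that the config drive actually advertises. The following is a minimal sketch, not cloud-init code: the sample payload is hypothetical and heavily trimmed, and the helper name `link_types` is invented for illustration. On a real config drive the file is normally found at openstack/latest/network_data.json.

```python
import json

# Hypothetical, trimmed payload shaped like network_data.json.
# A real file carries more keys per link plus "networks" and "services".
SAMPLE_NETWORK_DATA = """
{
  "links": [
    {"id": "tap1a2b3c4d-5e", "type": "hyperv",
     "ethernet_mac_address": "fa:16:3e:00:00:01", "mtu": 1500}
  ],
  "networks": [],
  "services": []
}
"""


def link_types(network_data_text):
    """Return (link id, link type) pairs found in a network_data.json string."""
    data = json.loads(network_data_text)
    return [(link.get("id"), link.get("type")) for link in data.get("links", [])]


if __name__ == "__main__":
    for link_id, link_type in link_types(SAMPLE_NETWORK_DATA):
        print("%s: %s" % (link_id, link_type))
```

Any type printed here that the installed cloud-init does not recognise would be expected to trigger the ValueError quoted in the SRU template.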

Related bugs:
 * bug 1674946: cloud-init fails with "Unknown network_data link type: dvs"
 * bug 1642679: OpenStack network_config.json implementation fails on Hyper-V compute nodes


Adrian Vladu (avladu) on 2016-11-17
description: updated
Scott Moser (smoser) wrote :

Hi,
I've subscribed Xiang to this as he recently pinged me on a different string that may appear as a network device. My response to him was:

| This nonsense really needs to stop.
| We need to fix openstack to stop sending arbitrary "types" of network
| devices that mean nothing to the guest.
|
| No *new* ones should be allowed.
|
| 'vhostuser' or 'ovs' means nothing to the guest. They just see a nic.
| They can't possibly use that information in any way, so telling them is
| not helpful. The type of the device should be 'tap' or 'ethernet'.
|
| Can you submit a merge proposal upstream that does that?
|
| We can take these things in, but they're silly and quite obviously busted,
| unless you have some information that shows why they're not.

I'm willing to take this, but let's *please* work to fix the source
of the problem.

Adrian,
Can you please file a merge proposal upstream to fix this?

You're welcome to use this bug. I've made it "Also affects nova".

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/400883

Changed in nova:
assignee: nobody → Scott Moser (smoser)
status: New → In Progress
Scott Moser (smoser) wrote :

I've put up a request at https://review.openstack.org/400883

Scott Moser (smoser) on 2016-11-22
Changed in cloud-init:
status: New → Confirmed
importance: Undecided → Medium
Adrian Vladu (avladu) wrote :

Hello,

Since nova has exposed these types to the guest for a few releases, it is hard to believe they will change it, because of backwards compatibility. Basically, a few stable OpenStack releases (Liberty, Mitaka, Newton, Ocata) will probably be stuck with it :(

Xiang Hui (xianghui) wrote :

@Scott, thanks for the fix! BTW, will this cloud-init version be targeted to xenial later?

Scott Moser (smoser) on 2016-12-02
Changed in cloud-init (Ubuntu):
status: New → Fix Released
importance: Undecided → Medium
Changed in cloud-init (Ubuntu Xenial):
status: New → Confirmed
Changed in cloud-init (Ubuntu Yakkety):
status: New → Confirmed
Changed in cloud-init (Ubuntu Xenial):
importance: Undecided → Medium
status: Confirmed → In Progress
assignee: nobody → Scott Moser (smoser)
Changed in cloud-init (Ubuntu Yakkety):
importance: Undecided → Medium
Scott Moser (smoser) on 2016-12-02
no longer affects: cloud-init (Ubuntu Vivid)
no longer affects: cloud-init (Ubuntu Wily)
Scott Moser (smoser) on 2016-12-02
description: updated
Abhimanyu (abhikec09) on 2016-12-06
Changed in nova:
status: In Progress → Fix Released
Robie Basak (racb) wrote : Please test proposed package

Hello Adrian, or anyone else affected,

Accepted cloud-init into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/0.7.8-49-g9e904bb-0ubuntu1~16.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in cloud-init (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed
Scott Moser (smoser) on 2016-12-16
description: updated
Scott Moser (smoser) wrote :

Adrian, Xiang,

Could you please verify this and mark 'verification-done' ?

At this point, this bug is blocking the release of cloud-init 0.7.8-49-g9e904bb-0ubuntu1~16.04.2 from xenial-proposed. That change contains fixes for other bugs that we need to get into -updates.

I've made requests off-bug to both Xiang Hui and to Adrian Vladu, but have not gotten a response.

Adrian has ACKed the upstream merge proposal at [1] with this fix.

While the code change does change behavior, the chance for regression is very low. See the code that was changed in context at [2]. Basically we extended the list of "physical types" to add 'hw_veb', 'hyperv', 'vhostuser'. Previously, if that condition did not match, then we would raise a ValueError exception that is not handled, leaving the system basically unusable. Now, the strings are considered valid as "physical" and cloud-init will configure the devices as needed.

So:
  Before: cloud-init raises an exception and no user-data or metadata is used... user cannot log into system.
  After: cloud-init configures networking, and user-data and metadata are used.

Worst case for regression is really "still doesn't work".

--
[1] https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/311548
[2] https://git.launchpad.net/cloud-init/tree/cloudinit/sources/helpers/openstack.py#n599
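
The before/after behaviour described above can be sketched roughly as follows. This is an illustration only, not the code at [2]: the baseline tuple is an assumption about which types were accepted before the change, the function name `classify_link` is hypothetical, and handling of other link kinds (for example bonds and VLANs) is left out. Only the three added strings and the error message come from this report.

```python
# Assumed pre-fix list of types treated as plain physical NICs (illustrative).
ASSUMED_BASELINE_TYPES = (None, "bridge", "ethernet", "ovs", "phy", "tap", "vif")

# Types named in this bug as added by the fix.
ADDED_TYPES = ("hw_veb", "hyperv", "vhostuser")


def classify_link(link_type, physical_types):
    """Return "physical" for an accepted type, else raise like the old code path."""
    if link_type in physical_types:
        return "physical"
    raise ValueError("Unknown network_data link type: %s" % link_type)


if __name__ == "__main__":
    try:
        classify_link("hyperv", ASSUMED_BASELINE_TYPES)
    except ValueError as err:
        print("before:", err)
    print("after:", classify_link("hyperv", ASSUMED_BASELINE_TYPES + ADDED_TYPES))
```

Because the change only widens the accepted set, every type that was accepted before is still accepted, which is why the regression risk is argued to be low.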

Gabriel Samfira (gabriel-samfira) wrote :

Tested version 0.7.8-49-g9e904bb-0ubuntu1~16.04.2 on an OpenStack Mitaka install running Hyper-V as compute host.

VM booted successfully and cloud-init finished its run. The following output is from inside the VM after accessing it via SSH:

https://paste.ubuntu.com/23653864/

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.8-49-g9e904bb-0ubuntu1~16.04.2

---------------
cloud-init (0.7.8-49-g9e904bb-0ubuntu1~16.04.2) xenial-proposed; urgency=medium

  * cherry-pick 18203bf: disk_setup: Use sectors as unit when formatting
    MBR disks with sfdisk. (LP: #1460715)
  * cherry-pick 6e92c5f: net/cmdline: Consider ip= or ip6= on command
    line not only ip= (LP: #1639930)
  * cherry-pick 8c6878a: tests: fix assumptions that expected no eth0 in
    system. (LP: #1644043)
  * cherry-pick 2d2ec70: OpenStack: extend physical types to include
    hyperv, hw_veb, vhost_user. (LP: #1642679)

 -- Scott Moser <email address hidden> Thu, 01 Dec 2016 16:57:39 -0500

Changed in cloud-init (Ubuntu Xenial):
status: Fix Committed → Fix Released
Robie Basak (racb) wrote : Update Released

The verification of the Stable Release Update for cloud-init has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Scott Moser (smoser) wrote :

This is fixed in cloud-init 0.7.9.

Changed in cloud-init:
status: Confirmed → Fix Released
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Adrian, or anyone else affected,

Accepted cloud-init into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/0.7.8-68-gca3ae67-0ubuntu1~16.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in cloud-init (Ubuntu Yakkety):
status: Confirmed → Fix Committed
tags: removed: verification-done
tags: added: verification-needed
Scott Moser (smoser) wrote :

Hi,
I'm going to mark this as verification-done as the original opener has not been able to do that, unfortunately. The change that went in for this fix can be seen at [1]. It is exactly the change that is in trunk, zesty, and also verified in stable release 16.04.

If it turns out that some interaction with yakkety made it *not* work, then we can re-address that.

If an SRU team member wishes to disagree with my argument above, please just set it back to verification-needed, and I will attempt to get someone to do that.

Scott

--
[1] https://git.launchpad.net/cloud-init/commit/?h=ubuntu/yakkety&id=2d2ec70f06015f0624f1d0d328cc97f1fb5c29de

tags: added: verification-done
removed: verification-needed
Steve Langasek (vorlon) wrote :

Because this SRU includes a large number of other bugfixes that have been verified in yakkety, we have confidence that the package is not fundamentally broken, and as you say this change has been verified on other releases, so I'm willing to accept this for the present SRU.

(But I am not releasing it on a Friday.)

Adrian Vladu (avladu) wrote :

Hello,

Sorry for the delay. We have successfully tested the latest yakkety image (we updated cloud-init via chroot with the one from the -proposed repo).

Thanks,
Adrian Vladu

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.8-68-gca3ae67-0ubuntu1~16.10.1

---------------
cloud-init (0.7.8-68-gca3ae67-0ubuntu1~16.10.1) yakkety; urgency=medium

  * debian/cherry-pick: add utility for cherry picking commits from upstream
    into patches in debian/patches.
  * New upstream snapshot.
    - mounts: use mount -a again to accomplish mounts (LP: #1647708)
    - CloudSigma: Fix bug where datasource was not loaded in local search.
      (LP: #1648380)
    - when adding a user, strip whitespace from group list
      [Lars Kellogg-Stedman] (LP: #1354694)
    - fix decoding of utf-8 chars in yaml test
    - Replace usage of sys_netdev_info with read_sys_net (LP: #1625766)
    - fix problems found in python2.6 test.
    - OpenStack: extend physical types to include hyperv, hw_veb, vhost_user.
      (LP: #1642679)
    - tests: fix assumptions that expected no eth0 in system. (LP: #1644043)
    - net/cmdline: Consider ip= or ip6= on command line not only ip=
      (LP: #1639930)
    - Just use file logging by default [Joshua Harlow] (LP: #1643990)
    - Improve formatting for ProcessExecutionError [Wesley Wiedenmeier]
    - flake8: fix trailing white space
    - Doc: various documentation fixes [Sean Bright]
    - cloudinit/config/cc_rh_subscription.py: Remove repos before adding
      [Brent Baude]
    - packages/redhat: fix rpm spec file.
    - main: set TZ in environment if not already set. [Ryan Harper]
    - disk_setup: Use sectors as unit when formatting MBR disks with sfdisk.
      [Daniel Watkins] (LP: #1460715)

 -- Scott Moser <email address hidden> Mon, 19 Dec 2016 15:07:12 -0500

Changed in cloud-init (Ubuntu Yakkety):
status: Fix Committed → Fix Released
Scott Moser (smoser) wrote :

This is definitely *not* fix-released in nova.
We see more bugs like: bug 1674946

Changed in nova:
assignee: Scott Moser (smoser) → nobody
description: updated
Adrian Vladu (avladu) on 2017-03-30
Changed in nova:
status: Fix Released → Incomplete
Changed in nova:
assignee: nobody → Scott Moser (smoser)
status: Incomplete → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/400883
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f559be35a03f5801f527355895a97c89cdc3c336
Submitter: Jenkins
Branch: master

commit f559be35a03f5801f527355895a97c89cdc3c336
Author: Scott Moser <email address hidden>
Date: Fri Mar 31 17:01:33 2017 -0400

    Limit exposure of network device types to the guest.

    Previously, the 'type' of the hypervisor network device, was exposed to
    the guest directly. That does not make sense, as
    a.) this leaks needless information into the guest
    b.) the guest cannot be reasonably expected to make decisions
        based on a type of link that is present underneath the
        virtual device that is presented to the guest.
    c.) guests then are forced to either continuously track these types
        or to assume that unknown type is "phy".

    This limits the exposure of types to a specific list. Any other
    type will be shown to the guest as 'phy'.

    Change-Id: Iea458fba29596cd2773d8d3565451af60b02bcca
    Closes-Bug: #1642679

Changed in nova:
status: In Progress → Fix Released
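
The approach in the commit message can be illustrated with a small sketch. This is not the nova change itself: the allow-list contents and the function name `guest_visible_type` are assumptions made up for the example. The only behaviour taken from the commit message is that types outside a fixed list are reported to the guest as 'phy'.

```python
# Hypothetical allow-list; the real list is defined by the nova change.
ASSUMED_EXPOSED_TYPES = frozenset({"bridge", "ovs", "tap", "vif", "phy"})


def guest_visible_type(vif_type, allowed=ASSUMED_EXPOSED_TYPES):
    """Map a hypervisor-side VIF type to the type shown to the guest."""
    # Anything the guest is not expected to know about collapses to 'phy'.
    return vif_type if vif_type in allowed else "phy"


if __name__ == "__main__":
    for vif_type in ("ovs", "hyperv", "dvs", "vrouter"):
        print("%s -> %s" % (vif_type, guest_visible_type(vif_type)))
```

With a mapping of this kind, a guest never sees backend-specific strings such as 'hyperv', 'dvs' or 'vrouter', so it does not need updating each time a new backend appears.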
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.0.0b1

This issue was fixed in the openstack/nova 16.0.0.0b1 development milestone.

Sam Stoelinga (sammiestoel) wrote :

I still hit this issue on the latest xenial cloudimg of April 25th. This is the error I saw when trying to run an Ubuntu 16.04 guest OS on a Contrail-based cloud: http://paste.openstack.org/show/608110/

description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/476195

Andrey Kirilochkin (andreika-mail) wrote :

Guys, we are still hitting the same bug, and it has become a huge issue for us.
http://paste.openstack.org/show/615013/
http://paste.openstack.org/show/615127/
Each time we run a VM with Ubuntu 16.04 we randomly see this bug in the VM boot log.
OpenStack: Mitaka
Juniper Contrail: 3.2.8
Please provide a fix for that.

Matt Riedemann (mriedem) on 2017-08-11
Changed in nova:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/476195
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=cec7ecdc93c3b9ba401edf3cf84088b580247cb8
Submitter: Jenkins
Branch: stable/ocata

commit cec7ecdc93c3b9ba401edf3cf84088b580247cb8
Author: Scott Moser <email address hidden>
Date: Fri Mar 31 17:01:33 2017 -0400

    Limit exposure of network device types to the guest.

    Previously, the 'type' of the hypervisor network device, was exposed to
    the guest directly. That does not make sense, as
    a.) this leaks needless information into the guest
    b.) the guest cannot be reasonably expected to make decisions
        based on a type of link that is present underneath the
        virtual device that is presented to the guest.
    c.) guests then are forced to either continuously track these types
        or to assume that unknown type is "phy".

    This limits the exposure of types to a specific list. Any other
    type will be shown to the guest as 'phy'.

    Change-Id: Iea458fba29596cd2773d8d3565451af60b02bcca
    Closes-Bug: #1642679
    (cherry picked from commit f559be35a03f5801f527355895a97c89cdc3c336)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.7

This issue was fixed in the openstack/nova 15.0.7 release.
