Baremetal port's host_id gets updated during instance restart

Bug #1822801 reported by Hamdy Khader on 2019-04-02
This bug affects 1 person
Affects                   Importance  Assigned to
OpenStack Compute (nova)  High        Hamdy Khader
Stein                     High        Matt Riedemann

Bug Description

In a baremetal overcloud, the instance's ports get updated during instance reboot [1], which changes
their host_id to the nova compute host_id.

As a result, a baremetal port's host_id indicates the nova host_id instead of the ironic node UUID.

For a normal instance, or even an ordinary baremetal instance, this would not be a problem. For a SmartNIC
baremetal instance, however, the port's host_id contains the SmartNIC hostname and is needed to communicate with the relevant Neutron agent running on the SmartNIC.

Reproduce:
- deploy baremetal overcloud
- create baremetal instance
- after creation completes, check the port details and notice binding_host_id=overcloud-controller-0.localdomain
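
The check in the last step can be sketched in Python. This is a minimal sketch, assuming the ports have already been fetched from Neutron as dictionaries that carry the binding:host_id attribute. The function name find_rebound_ports, the port IDs, and the SmartNIC hostnames are hypothetical and only serve as illustration.

```python
def find_rebound_ports(ports, expected_hosts):
    """Return (port id, expected host, actual host) for each changed binding.

    ports: iterable of Neutron-style port dicts with "id" and "binding:host_id".
    expected_hosts: dict mapping port id to the host the port was bound to
    at creation (ironic node UUID or SmartNIC hostname).
    """
    rebound = []
    for port in ports:
        expected = expected_hosts.get(port["id"])
        actual = port.get("binding:host_id")
        # Only report ports we have an expectation for and that differ from it.
        if expected is not None and actual != expected:
            rebound.append((port["id"], expected, actual))
    return rebound


# Hypothetical data: port-1 was rebound to the nova compute host.
ports = [
    {"id": "port-1", "binding:host_id": "overcloud-controller-0.localdomain"},
    {"id": "port-2", "binding:host_id": "smartnic-host-1"},
]
expected_hosts = {"port-1": "smartnic-host-0", "port-2": "smartnic-host-1"}

rebound = find_rebound_ports(ports, expected_hosts)
for port_id, expected, actual in rebound:
    print(f"{port_id}: expected {expected}, found {actual}")
```

Any port reported here has had its binding overwritten, which is the symptom described in this bug.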

[1] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L7191

Nova version:
()[root@overcloud-controller-0 /]# rpm -qa | grep nova
puppet-nova-14.4.1-0.20190322112825.740f45a.el7.noarch
python2-nova-19.0.0-0.20190322140639.d7c8924.el7.noarch
python2-novajoin-1.1.2-0.20190322123935.e8b18c4.el7.noarch
openstack-nova-compute-19.0.0-0.20190322140639.d7c8924.el7.noarch
python2-novaclient-13.0.0-0.20190311121537.62bf880.el7.noarch
openstack-nova-common-19.0.0-0.20190322140639.d7c8924.el7.noarch

Hamdy Khader (hamdyk) on 2019-04-02
Changed in nova:
assignee: nobody → Hamdy Khader (hamdyk)

Fix proposed to branch: master
Review: https://review.openstack.org/649345

Changed in nova:
status: New → In Progress
Changed in nova:
assignee: Hamdy Khader (hamdyk) → Adrian Chiris (adrian.chiris)

Matt Riedemann (mriedem) wrote :

This regression was introduced in Stein by change https://review.openstack.org/#/c/603844/.

Changed in nova:
importance: Undecided → High
tags: added: compute ironic neutron

Matt Riedemann (mriedem) wrote :

The workaround for this would be to disable the periodic task by setting this config option to 0:

https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.heal_instance_info_cache_interval
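
As a sketch of that workaround, the fragment below sets the option in the [DEFAULT] section, which is where the linked documentation places it. The file path in the comment is an assumption (the usual location on a compute host), and nova-compute most likely needs a restart to pick up the change. Disabling the task also stops the periodic refresh of the instance network info cache, so this is a stopgap rather than a fix.

```ini
# Assumed location: /etc/nova/nova.conf on the host running nova-compute.
[DEFAULT]
# 0 disables the periodic _heal_instance_info_cache task.
heal_instance_info_cache_interval = 0
```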

Changed in nova:
assignee: Adrian Chiris (adrian.chiris) → Hamdy Khader (hamdyk)

Reviewed: https://review.opendev.org/649345
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=091aa3289694a27704e48931a579f50a3179b036
Submitter: Zuul
Branch: master

commit 091aa3289694a27704e48931a579f50a3179b036
Author: Hamdy Khader <email address hidden>
Date: Tue Apr 2 17:26:14 2019 +0300

    Do not perform port update in case of baremetal instance.

    In case of a baremetal instance, the instance's port binding:host_id
    gets updated during instance reboot to the nova compute host id by
    the periodic task: _heal_instance_info_cache. This regression was
    introduced in commit: I75fd15ac2a29e420c09499f2c41d11259ca811ae

    This is an undesirable change: the ironic virt driver did the original
    port binding, so nova should not update the value.
    In case of a baremetal port, the binding:host_id represents the ironic
    node_uuid. In case of a SmartNIC (baremetal) port [1], the binding:host_id
    represents the SmartNIC hostname and it MUST NOT change, since ironic
    relies on that information, as does the Neutron agent that runs on
    the SmartNIC.

    A new API method, "manages_port_bindings()", was added to ComputeDriver.
    It defaults to False and is overridden in IronicDriver to return True.

    A call to this API method is now made in _heal_instance_info_cache() to
    prevent port update for instance ports in case the underlying
    ComputeDriver manages the port binding.

    [1] I658754f7f8c74087b0aabfdef222a2c0b5698541

    Change-Id: I47d1aba17cd2e9fff67846cc243c8fbd9ac21659
    Closes-Bug: #1822801
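
The approach described in the commit message can be illustrated with a minimal Python sketch. This is not the nova source. The classes below are simplified stand-ins, and heal_port_bindings, LibvirtLikeDriver, the port dictionaries, and the host names are hypothetical names introduced only for illustration. Only manages_port_bindings(), ComputeDriver, and IronicDriver are taken from the commit message.

```python
class ComputeDriver:
    """Simplified stand-in for the virt driver base class."""

    def manages_port_bindings(self):
        # Default: the compute manager is responsible for port bindings.
        return False


class IronicDriver(ComputeDriver):
    """Simplified stand-in for the ironic virt driver."""

    def manages_port_bindings(self):
        # The ironic driver bound the ports itself, so nova must not rebind.
        return True


class LibvirtLikeDriver(ComputeDriver):
    """Hypothetical driver that keeps the default behaviour."""


def heal_port_bindings(driver, ports, compute_host):
    """Return copies of ports, rebinding to compute_host only if allowed.

    Hypothetical stand-in for the port-update step of the periodic task.
    """
    if driver.manages_port_bindings():
        # Leave binding:host_id untouched (ironic node UUID or SmartNIC host).
        return [dict(port) for port in ports]
    return [{**port, "binding:host_id": compute_host} for port in ports]


# Hypothetical data for illustration.
ports = [{"id": "port-1", "binding:host_id": "smartnic-host-0"}]
compute_host = "overcloud-controller-0.localdomain"

ironic_result = heal_port_bindings(IronicDriver(), ports, compute_host)
default_result = heal_port_bindings(LibvirtLikeDriver(), ports, compute_host)
print(ironic_result[0]["binding:host_id"])
print(default_result[0]["binding:host_id"])
```

Putting the decision behind a driver method keeps the periodic task free of driver-specific checks: any driver that performs its own port binding can opt out by overriding one method.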

Changed in nova:
status: In Progress → Fix Released

Reviewed: https://review.opendev.org/656176
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7cd816f43a9a699ccc5c5826b0904a3414664132
Submitter: Zuul
Branch: stable/stein

commit 7cd816f43a9a699ccc5c5826b0904a3414664132
Author: Hamdy Khader <email address hidden>
Date: Tue Apr 2 17:26:14 2019 +0300

    Do not perform port update in case of baremetal instance.

    (cherry picked from commit 091aa3289694a27704e48931a579f50a3179b036)

This issue was fixed in the openstack/nova 19.0.1 release.
