Nova compute needlessly depends on hostname for live migration url

Bug #1729566 reported by Kevin Tibi
This bug affects 4 people
Affects        Status        Importance  Assigned to        Milestone
kolla-ansible  Fix Released  Medium      Radosław Piliszek
Stein          Fix Released  Medium      Unassigned
Train          Fix Released  Medium      Unassigned
Ussuri         Fix Released  Medium      Radosław Piliszek

Bug Description

When you add a new compute host to an already existing platform, Kolla does not update the /etc/hosts file used by the old compute hosts and their libvirt containers.

Live migration then fails because the old hosts cannot resolve the name of the new compute host.

Additionally, the live migration interface is ignored even in a non-TLS setup, so live migration may fail if the hostname does not resolve to the migration interface address.
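
The name resolution failure described above can be checked from an existing compute host with a small script. This is only a sketch: the function name `unresolvable_hosts` and the host names in the usage note are hypothetical, and it uses whatever resolver the environment it runs in has, so it needs to run inside the affected container to reflect that container's /etc/hosts.

```python
import socket
from typing import Callable, Iterable, List


def unresolvable_hosts(
    hostnames: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> List[str]:
    """Return the hostnames that the given resolver cannot resolve.

    The resolver is injectable so the check can be exercised without
    touching real DNS or /etc/hosts.
    """
    missing = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing
```

For example, calling `unresolvable_hosts(["compute01", "compute02"])` on an old compute host would list any newly added compute hosts whose names it cannot resolve.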

Michal Nasiadka (mnasiadka) wrote :

The kolla-ansible baremetal role (bootstrap-servers) generates the /etc/hosts entries, so it would need some refactoring to regenerate /etc/hosts on all servers...

Changed in kolla-ansible:
importance: Undecided → Wishlist
status: New → Confirmed
Mark Goddard (mgoddard) wrote :

Why not just re-run bootstrap-servers? Or do we need a more fine-grained command?

Zijian Guo (zijianguo) wrote :

Even after re-generating /etc/hosts, the most important point is that the /etc/hosts file inside the nova_compute container is not refreshed until the container is restarted.

Ross Martyn (rossmartyn04) wrote :

We're also seeing this.

Is there an easy way to trigger the restart of specific containers through KA (or Kayobe) without risking other changes as noted on the KA documentation under bootstrap-servers (Subsequent bootstrap considerations)?

Michal, Zijian, how did you combat this when scaling up your environments?

Zijian Guo (zijianguo) wrote :

Hi, Ross.

I suggest splitting the "Generate /etc/hosts" action out of bootstrap, then restarting the necessary containers after each update, such as nova_compute, nova_libvirt, nova_ssh, and so on.

Mark Goddard (mgoddard)
Changed in kolla-ansible:
status: Confirmed → Triaged
Mark Goddard (mgoddard)
summary: - No update for /etc/hosts after adding a new compute
+ No update for /etc/hosts after adding a new compute, breaking live
+ migration
Changed in kolla-ansible:
assignee: nobody → Mark Goddard (mgoddard)
status: Triaged → In Progress
summary: - No update for /etc/hosts after adding a new compute, breaking live
- migration
+ Nova compute needlessly depends on hostname for live migration url
description: updated
description: updated
Changed in kolla-ansible:
assignee: Mark Goddard (mgoddard) → Radosław Piliszek (yoctozepto)
Radosław Piliszek (yoctozepto) wrote :

I think the not-updated-/etc/hosts-issue could be refactored here as we are in fact fixing a more general issue while leaving name resolution broken still.

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (master)

Reviewed: https://review.opendev.org/715494
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=628c27ce9e693902c5cabedc7f254f2a1229a195
Submitter: Zuul
Branch: master

commit 628c27ce9e693902c5cabedc7f254f2a1229a195
Author: John Garbutt <email address hidden>
Date: Fri Mar 27 17:37:51 2020 +0000

    Fix live migration to use migration int. address

    In kolla ansible we typically configure services to communicate via IP
    addresses rather than hostnames. One accidental exception to this was
    live migration, which used the hostname of the destination even when
    not required (i.e. TLS not being used for libvirt).

    To make such hostnames work, k-a adds entries to /etc/hosts in the
    bootstrap-servers command. Alternatively users may provide DNS.

    One problem with using /etc/hosts is that, if a new compute host is
    added to the cloud, or an IP address is changed, that will not be
    reflected in the /etc/hosts file of other hosts. This would cause live
    migration to the new host from an old host to fail, as the name cannot
    be resolved.

    The workaround for this was to update the /etc/hosts file (perhaps via
    bootstrap-servers) on all hosts after adding new compute hosts. Then the
    nova_libvirt container had to be restarted to pick up the change.

    Similarly, if user has overridden the migration_interface, the used
    hostname could point to a wrong address on which libvirt would not
    listen.

    This change adds the live_migration_inbound_addr option to nova.conf. If
    TLS is not in use for libvirt, this will be set to the IP address of the
    host on the migration network. If TLS is enabled for libvirt,
    live_migration_inbound_addr will be set to migration_hostname, since
    certificates will typically reference the hostname rather than the
    host's IP. With libvirt TLS enabled, DNS is recommended to avoid the
    /etc/hosts issue which is likely the case in production deployments.

    Change-Id: I0201b46a9fbab21433a9f53685131aeb461543a8
    Closes-Bug: #1729566
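
The selection rule described in the commit message can be sketched as follows. This is an illustration only, not the actual kolla-ansible template: the function names and the example address are hypothetical, and the placement of live_migration_inbound_addr in the [libvirt] section of nova.conf is an assumption based on where Nova documents that option.

```python
import configparser
import io


def inbound_addr(libvirt_tls: bool, migration_ip: str, migration_hostname: str) -> str:
    """Pick the value for live_migration_inbound_addr.

    With libvirt TLS, certificates typically reference the hostname, so the
    hostname is used. Without TLS, the IP address on the migration network
    is used, which avoids any dependency on /etc/hosts or DNS.
    """
    return migration_hostname if libvirt_tls else migration_ip


def render_libvirt_section(addr: str) -> str:
    """Render a nova.conf fragment carrying the chosen address."""
    parser = configparser.ConfigParser()
    parser["libvirt"] = {"live_migration_inbound_addr": addr}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()
```

For a non-TLS deployment, `render_libvirt_section(inbound_addr(False, "192.0.2.21", "compute01"))` yields a [libvirt] section that points at the migration network IP rather than the hostname.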

Changed in kolla-ansible:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/719416

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/719422

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/stein)

Reviewed: https://review.opendev.org/719422
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=f174ec063676bea17e17c537282eb9eb08e89e70
Submitter: Zuul
Branch: stable/stein

commit f174ec063676bea17e17c537282eb9eb08e89e70
Author: John Garbutt <email address hidden>
Date: Fri Mar 27 17:37:51 2020 +0000

    Fix live migration to use migration int. address

    Change-Id: I0201b46a9fbab21433a9f53685131aeb461543a8
    Closes-Bug: #1729566
    (cherry picked from commit 628c27ce9e693902c5cabedc7f254f2a1229a195)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/train)

Reviewed: https://review.opendev.org/719416
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=03784673557a0b2c5a2dba1ce06966dcd66473a8
Submitter: Zuul
Branch: stable/train

commit 03784673557a0b2c5a2dba1ce06966dcd66473a8
Author: John Garbutt <email address hidden>
Date: Fri Mar 27 17:37:51 2020 +0000

    Fix live migration to use migration int. address

    Change-Id: I0201b46a9fbab21433a9f53685131aeb461543a8
    Closes-Bug: #1729566
    (cherry picked from commit 628c27ce9e693902c5cabedc7f254f2a1229a195)

tags: added: in-stable-train