live migration doesn't use the correct interface to transfer the data

Bug #1614063 reported by Alberto Planas
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Pawel Koniszewski
Milestone: (none)

Bug Description

My compute nodes are attached to several networks (storage, admin,
etc.). For each network I have a real or virtual interface with an IP
assigned. DNS is properly configured, so I can `ping node1` or
`ping storage.node1`, and each name resolves to the correct IP.

I want to use the second network to transfer the data, so I:

* Set up libvirtd to listen on the correct interface (checked with netstat)

* Configure live_migration_uri in nova.conf

* Monitor the interfaces and run nova live-migration

The migration works correctly, doing what I think is a peer-to-peer
migration, but the data is transferred via the normal interface.

I can replicate it by doing a live migration via virsh.

After more checks I discovered that if I do not use the --migrate-uri
parameter, libvirt asks the other node for its hostname to build
this migrate_uri parameter. The hostname resolves via the slow
interface.

Using --migrate-uri and --listen-address (for the -incoming
parameter) works at the libvirt level. So we need to somehow inject this
parameter into migrateToURIx in the Nova libvirt driver.
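
The default described above can be modelled in a few lines of Python. This is only an illustrative sketch, not libvirt's code: the function name `effective_migrate_uri` and the `tcp://` form of the fallback are assumptions made for the example.

```python
def effective_migrate_uri(dest_hostname, migrate_uri=None):
    """Return the URI the data connection would use in this toy model.

    dest_hostname: the name the destination node reports for itself.
    migrate_uri:   an explicit override, playing the role of --migrate-uri.
    """
    if migrate_uri is not None:
        # An explicit URI wins, so the data can be steered to another network.
        return migrate_uri
    # Without an override the URI is built from the destination's hostname,
    # which may resolve via the slow interface.
    return "tcp://%s" % dest_hostname


print(effective_migrate_uri("node1"))
print(effective_migrate_uri("node1", "tcp://storage.node1"))
```

The second call shows why an explicit parameter is needed: nothing in the first call lets the caller pick the storage network.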

I have a patch (attached, WIP) that addresses this issue.

Changed in nova:
assignee: nobody → Sharat Sharma (sharat-sharma)
Daniel Berrange (berrange) wrote :

It is already possible to get nova to use a different interface for live migration. Just set

live_migration_inbound_addr=IP-ADDR-OF-FASTER-NIC

on the compute nodes.

Changed in nova:
status: New → Incomplete
status: Incomplete → Invalid
Alberto Planas (aplanas) wrote :

@berrange this is not correct! Reading the documentation [1] and checking the source code [2], it is very clear to me that the purpose of this option is to specify the IP address that `dest` resolves to. This `dest` is then applied to the template specified in `live_migration_uri`.

Basically, if I set the variable that you propose with something like this:

live_migration_uri='qemu+tcp://%s/system'
live_migration_inbound_addr='192.168.10.10'

This generates dest="qemu+tcp://192.168.10.10/system", which is fine if that is the address where libvirtd is listening.

It also breaks completely when:

live_migration_uri='qemu+tcp://network1.%s/system'
live_migration_inbound_addr='192.168.10.10'

Because this produces the deeply wrong dest="qemu+tcp://network1.192.168.10.10/system", which does not resolve to anything.
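
The substitution can be reproduced with plain Python string formatting. A minimal sketch, assuming the template is expanded with the `%` operator as the examples above suggest; `build_dest_uri` is a hypothetical helper, not a Nova function.

```python
def build_dest_uri(live_migration_uri, inbound_addr):
    """Expand the %s placeholder in the template with the inbound address."""
    return live_migration_uri % inbound_addr


# Plain template: the IP lands where a host is expected.
print(build_dest_uri("qemu+tcp://%s/system", "192.168.10.10"))

# Template with a hostname prefix: the IP is glued onto "network1.",
# giving a name that cannot resolve.
print(build_dest_uri("qemu+tcp://network1.%s/system", "192.168.10.10"))
```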

Also, this is not the problem described in the bug report, which is about the second connection between the hypervisors. This is a direct connection [3], where one side uses -incoming to specify the interface on which the migration target listens, and the other uses `migrate` to indicate where to connect.

This `migrate` defaults to the hostname of the other side, which is wrong. --migrate-uri is used to specify a different network here, and there is no code in Nova that sets the migrate_uri parameter.

Can you please re-evaluate the status of the bug?

[1] http://docs.openstack.org/mitaka/config-reference/tables/conf-changes/nova.html
[2] https://git.openstack.org/cgit/openstack/nova/tree/nova/virt/libvirt/driver.py#n5833
[3] http://www.linux-kvm.org/page/Migration

Changed in nova:
assignee: Sharat Sharma (sharat-sharma) → nobody
Pawel Koniszewski (pawel-koniszewski) wrote :

Confirmed that it is a bug. Libvirt correctly uses live_migration_inbound_addr, but QEMU still defaults to the hostname of the other side instead of the provided IP address.
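
One way to picture the missing piece is a helper that turns the inbound address into an explicit URI for the data connection, which could then be passed as migrate_uri. This is a sketch only: `data_channel_uri` is a hypothetical name, and the `tcp://` scheme and the IPv6 bracket handling are assumptions, not taken from the Nova patch.

```python
import ipaddress


def data_channel_uri(inbound_addr):
    """Build an explicit URI for the hypervisor-to-hypervisor data channel."""
    try:
        is_v6 = ipaddress.ip_address(inbound_addr).version == 6
    except ValueError:
        # Not an IP literal, so treat it as a hostname.
        is_v6 = False
    # IPv6 literals must be bracketed inside a URI.
    host = "[%s]" % inbound_addr if is_v6 else inbound_addr
    return "tcp://%s" % host


print(data_channel_uri("192.168.10.10"))
print(data_channel_uri("storage.node1"))
```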

Changed in nova:
status: Invalid → Confirmed
importance: Undecided → Medium
tags: added: live-migration
Alberto Planas (aplanas) wrote :

If the patch is not accepted in the end, I found an external workaround:

    # in nova.conf:
    live_migration_uri = 'qemu+tcp://network1.%s/system'

    # in /etc/libvirt/libvirtd.conf:
    listen_addr = "network1.HOSTNAME"

    # in /etc/libvirt/qemu.conf:
    migration_address = "network1.HOSTNAME"
    migration_host = "network1.HOSTNAME"

This partially ignores the live_migration_uri parameter from Nova and sets a fixed uri_in parameter in libvirtd.

I still do not like this workaround: it bypasses the Nova configuration and makes this feature a bit difficult to orchestrate.

Sarafraj Singh (sarafraj-singh) wrote :

@Alberto: I added you as assignee so others know you are working on it

Changed in nova:
status: Confirmed → In Progress
assignee: nobody → Alberto Planas (aplanas)
Changed in nova:
assignee: Alberto Planas (aplanas) → Pawel Koniszewski (pawel-koniszewski)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/356558
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=94f89e96f86c1d1bb258da754d3d368856637a0a
Submitter: Jenkins
Branch: master

commit 94f89e96f86c1d1bb258da754d3d368856637a0a
Author: Alberto Planas <email address hidden>
Date: Wed Aug 17 17:37:48 2016 +0200

    Add migrate_uri for invoking the migration

    Add migrate_uri parameter in Guest.migrate method, to indicate
    the URI where we want to stablish the connection in a non-tunneled
    migration.

    Change-Id: I6c2ad0170d90560d7d710b578c45287e78c682d1
    Closes-Bug: #1614063

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/389554

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.0.0b1

This issue was fixed in the openstack/nova 15.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/newton)

Reviewed: https://review.openstack.org/389554
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1372c1fc8ff7f81995293312023bb989855aee6e
Submitter: Jenkins
Branch: stable/newton

commit 1372c1fc8ff7f81995293312023bb989855aee6e
Author: Alberto Planas <email address hidden>
Date: Wed Aug 17 17:37:48 2016 +0200

    Add migrate_uri for invoking the migration

    Add migrate_uri parameter in Guest.migrate method, to indicate
    the URI where we want to stablish the connection in a non-tunneled
    migration.

    Conflicts:
     nova/tests/unit/virt/libvirt/test_driver.py

    Closes-Bug: #1614063
    Change-Id: I6c2ad0170d90560d7d710b578c45287e78c682d1
    (cherry picked from commit 94f89e96f86c1d1bb258da754d3d368856637a0a)

tags: added: in-stable-newton
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 14.0.3

This issue was fixed in the openstack/nova 14.0.3 release.
