live migration aborts with Ncat: No such file or directory.: Input/output error

Bug #1834330 reported by Martin Schuppert
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Martin Schuppert

Bug Description

Live migration was attempted after confirming that live_migration_wait_for_vif_plug=False is set in nova.conf on the computes. The migration aborts with:

2019-06-26 08:17:43.518 6 INFO nova.compute.manager [-] [instance: a20ab13e-b348-4dc2-b294-e23865e2fc4b] Took 6.75 seconds for pre_live_migration on destination host compute-0.localdomain.
2019-06-26 08:17:43.519 6 DEBUG nova.compute.manager [-] [instance: a20ab13e-b348-4dc2-b294-e23865e2fc4b] Not waiting for events after pre_live_migration: [('network-vif-plugged', 'fbb880ff-b834-4cca-b12f-77a60a1984b2')]. _do_live_migration /usr/lib/python3.6/site-packages/nova/compute/manager.py:6548
...
2019-06-26 08:17:43.728 6 INFO nova.virt.libvirt.migration [-] [instance: a20ab13e-b348-4dc2-b294-e23865e2fc4b] Increasing downtime to 50 ms after 0 sec elapsed time
2019-06-26 08:17:43.857 6 INFO nova.virt.libvirt.driver [-] [instance: a20ab13e-b348-4dc2-b294-e23865e2fc4b] Migration running for 0 secs, memory 100% remaining; (bytes processed=0, remaining=0, total=0)
2019-06-26 08:17:44.293 6 ERROR nova.virt.libvirt.driver [-] [instance: a20ab13e-b348-4dc2-b294-e23865e2fc4b] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+ssh://<email address hidden>:2022/system?keyfile=/etc/nova/migration/identity: End of file while reading data: Ncat: No such file or directory.: Input/output error: libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+ssh://<email address hidden>:2022/system?keyfile=/etc/nova/migration/identity: End of file while reading data: Ncat: No such file or directory.: Input/output error
2019-06-26 08:17:44.295 6 DEBUG nova.virt.libvirt.driver [-] [instance: a20ab13e-b348-4dc2-b294-e23865e2fc4b] Migration operation thread notification thread_finished /usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py:8015
2019-06-26 08:17:44.360 6 DEBUG nova.virt.libvirt.migration [-] [instance: a20ab13e-b348-4dc2-b294-e23865e2fc4b] VM running on src, migration failed find_job_type /usr/lib/python3.6/site-packages/nova/virt/libvirt/migration.py:360
2019-06-26 08:17:44.360 6 DEBUG nova.virt.libvirt.driver [-] [instance: a20ab13e-b348-4dc2-b294-e23865e2fc4b] Fixed incorrect job type to be 4 _live_migration_monitor /usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py:7844
2019-06-26 08:17:44.361 6 ERROR nova.virt.libvirt.driver [-] [instance: a20ab13e-b348-4dc2-b294-e23865e2fc4b] Migration operation has aborted

Additional info:

During the migration, the disk files are copied to the destination compute, so this is not a general connectivity issue between the computes:

[root@compute-0 ~]# ll -R /var/lib/nova/instances/a20ab13e-b348-4dc2-b294-e23865e2fc4b
/var/lib/nova/instances/a20ab13e-b348-4dc2-b294-e23865e2fc4b:
total 396
-rw-r--r--. 1 42436 42436 196616 Jun 26 08:47 disk
-rw-r--r--. 1 42436 42436 196624 Jun 26 08:47 disk.eph0
-rw-r--r--. 1 42436 42436 161 Jun 26 08:47 disk.info

Martin Schuppert (mschuppert) wrote :

The issue was introduced with [1], which removed the bind mount of /run from the nova_migration_target container. The nova-migration-wrapper inside the container needs access to the libvirt socket [2].

[1] https://review.opendev.org/#/c/662109/
[2] https://github.com/rdo-packages/nova-distgit/blob/rpm-master/nova-migration-wrapper#L31
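
To confirm this on an affected node, one quick check is whether the libvirt socket is visible from inside the nova_migration_target container. The following is only a minimal sketch: the path /run/libvirt/libvirt-sock is an assumption (it is not spelled out in this report), and socket_status is a hypothetical helper, not part of nova or tripleo.

```python
import os
import stat

# Assumed socket path; adjust to whatever path the wrapper actually uses.
LIBVIRT_SOCKET = "/run/libvirt/libvirt-sock"


def socket_status(path):
    """Classify path as 'missing', 'not-a-socket', or 'socket'."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # No such path, e.g. because the directory is not bind mounted.
        return "missing"
    return "socket" if stat.S_ISSOCK(mode) else "not-a-socket"


if __name__ == "__main__":
    print(LIBVIRT_SOCKET, "->", socket_status(LIBVIRT_SOCKET))
```

Run inside the container, a result of "missing" would be consistent with the "Ncat: No such file or directory." message in the log above; "socket" would suggest the bind mount is in place.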

Changed in tripleo:
assignee: nobody → Martin Schuppert (mschuppert)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.opendev.org/667605

Changed in tripleo:
status: New → In Progress
Martin Schuppert (mschuppert) wrote :

This needs to be backported to the stable branches, as far back as queens.

tags: added: stein-backport-potential
tags: removed: stein-backport-potential
Changed in tripleo:
importance: Undecided → High
milestone: none → train-2
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/667804

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/667806

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/667807

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/667605
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=537822d47c455cb9e4184744a91d083f2c89609c
Submitter: Zuul
Branch: master

commit 537822d47c455cb9e4184744a91d083f2c89609c
Author: Martin Schuppert <email address hidden>
Date: Wed Jun 26 14:57:48 2019 +0200

    Add /run/libvirt to nova_migration_target container

    [1] removed the bind mount from /run inside the nova_migration_target
    container. But the nova-migration-wrapper inside the container needs
    access to the libvirt socket [2].

    This adds the bind mount of /run/libvirt to the nova_migration_target
    container to fix live migration issues.

    [1] https://review.opendev.org/#/c/662109/
    [2] https://github.com/rdo-packages/nova-distgit/blob/rpm-master/nova-migration-wrapper#L31

    Change-Id: I7e236b838328a7a140a0aba0745bd8ac1db00015
    Closes-Bug: #1834330

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/rocky)

Reviewed: https://review.opendev.org/667806
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=95347b3aa94bb4324d8e4aa2c1554fbce6f9beaa
Submitter: Zuul
Branch: stable/rocky

commit 95347b3aa94bb4324d8e4aa2c1554fbce6f9beaa
Author: Martin Schuppert <email address hidden>
Date: Wed Jun 26 14:57:48 2019 +0200

    Add /run/libvirt to nova_migration_target container

    [1] removed the bind mount from /run inside the nova_migration_target
    container. But the nova-migration-wrapper inside the container needs
    access to the libvirt socket [2].

    This adds the bind mount of /run/libvirt to the nova_migration_target
    container to fix live migration issues.

    [1] https://review.opendev.org/#/c/662109/
    [2] https://github.com/rdo-packages/nova-distgit/blob/rpm-master/nova-migration-wrapper#L31

    Change-Id: I7e236b838328a7a140a0aba0745bd8ac1db00015
    Closes-Bug: #1834330
    (cherry picked from commit 537822d47c455cb9e4184744a91d083f2c89609c)
    (cherry picked from commit 1425671a384f1634a8057c5ef6fbbe954f1ba4e0)

tags: added: in-stable-rocky
tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/queens)

Reviewed: https://review.opendev.org/667807
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=ad6effc40d91aa157bec9a5c74ef31fb4cee427f
Submitter: Zuul
Branch: stable/queens

commit ad6effc40d91aa157bec9a5c74ef31fb4cee427f
Author: Martin Schuppert <email address hidden>
Date: Wed Jun 26 14:57:48 2019 +0200

    Add /run/libvirt to nova_migration_target container

    [1] removed the bind mount from /run inside the nova_migration_target
    container. But the nova-migration-wrapper inside the container needs
    access to the libvirt socket [2].

    This adds the bind mount of /run/libvirt to the nova_migration_target
    container to fix live migration issues.

    [1] https://review.opendev.org/#/c/662109/
    [2] https://github.com/rdo-packages/nova-distgit/blob/rpm-master/nova-migration-wrapper#L31

    Change-Id: I7e236b838328a7a140a0aba0745bd8ac1db00015
    Closes-Bug: #1834330
    (cherry picked from commit 537822d47c455cb9e4184744a91d083f2c89609c)
    (cherry picked from commit 1425671a384f1634a8057c5ef6fbbe954f1ba4e0)
    (cherry picked from commit 95347b3aa94bb4324d8e4aa2c1554fbce6f9beaa)

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/stein)

Reviewed: https://review.opendev.org/667804
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=1425671a384f1634a8057c5ef6fbbe954f1ba4e0
Submitter: Zuul
Branch: stable/stein

commit 1425671a384f1634a8057c5ef6fbbe954f1ba4e0
Author: Martin Schuppert <email address hidden>
Date: Wed Jun 26 14:57:48 2019 +0200

    Add /run/libvirt to nova_migration_target container

    [1] removed the bind mount from /run inside the nova_migration_target
    container. But the nova-migration-wrapper inside the container needs
    access to the libvirt socket [2].

    This adds the bind mount of /run/libvirt to the nova_migration_target
    container to fix live migration issues.

    [1] https://review.opendev.org/#/c/662109/
    [2] https://github.com/rdo-packages/nova-distgit/blob/rpm-master/nova-migration-wrapper#L31

    Change-Id: I7e236b838328a7a140a0aba0745bd8ac1db00015
    Closes-Bug: #1834330
    (cherry picked from commit 537822d47c455cb9e4184744a91d083f2c89609c)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 10.6.0

This issue was fixed in the openstack/tripleo-heat-templates 10.6.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 11.1.0

This issue was fixed in the openstack/tripleo-heat-templates 11.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 9.4.1

This issue was fixed in the openstack/tripleo-heat-templates 9.4.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 8.4.1

This issue was fixed in the openstack/tripleo-heat-templates 8.4.1 release.
