2017-05-16 17:09:08 |
Artom Lifshitz |
bug |
|
|
added bug |
2017-05-17 00:34:04 |
OpenStack Infra |
nova: status |
New |
In Progress |
|
2017-05-17 00:34:04 |
OpenStack Infra |
nova: assignee |
|
Artom Lifshitz (notartom) |
|
2017-05-17 07:52:04 |
Artom Lifshitz |
description |
Description
===========
If a volume attached to an instance has been swapped twice in a "round trip" (i.e., volume-update $vol1 $vol2, then volume-update $vol2 $vol1), the instance cannot be live-migrated.
Steps to reproduce
==================
1. Create two iscsi volumes.
# cinder create --name test_vol1 --volume-type iscsi 1
# cinder create --name test_vol2 --volume-type iscsi 1
(--volume-type iscsi isn't mandatory - in my devstack environment there is no iscsi
volume-type, but that doesn't stop me from reproducing this bug)
2. Boot an instance.
# nova boot --flavor 1 --image $imageid --nic net-id=$netid testvm1
3. Attach one iscsi volume to testvm1.
# nova volume-attach testvm1 $test_vol1
4. Do volume-update to swap volume to 2nd one. (1st time volume-update)
# nova volume-update testvm1 $test_vol1 $test_vol2
5. Do volume-update again to swap volume back to the 1st one. (2nd time volume-update)
# nova volume-update testvm1 $test_vol2 $test_vol1
6. Live migrate instance to other compute node.
# nova live-migration testvm1
Expected result
===============
Live migration succeeds.
Actual result
=============
Live migration fails with:
Apr 27 10:32:14 multi9h-3 nova-compute: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1939, in migrateToURI3
Apr 27 10:32:14 multi9h-3 nova-compute: if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
Apr 27 10:32:14 multi9h-3 nova-compute: libvirtError: missing source information for device vdb
Environment
===========
This was originally reported [1] in Red Hat OSP 9 (Mitaka) and is also reproducible on devstack master.
Additional information
======================
There are two things going on here.
1. When performing the volume-update, the libvirt driver calls virDomainBlockRebase without the VIR_DOMAIN_BLOCK_REBASE_COPY_DEV flag [2], so the device XML changes from <source dev=/dev/iscsi/lun> to <source file=/dev/iscsi/lun>. This is a problem because /dev/iscsi/lun isn't a regular file, and it causes the above error. However, the "round-trip" volume-update is needed to trigger it. Why? Because:
2. The serial number isn't updated when doing volume-update, and there's a bit of live-migration code [3] that checks for serial numbers before updating the XML. If the serial numbers don't match, the XML isn't updated, and libvirt doesn't notice that /dev/iscsi/lun isn't a file.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1446446
[2] http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockRebase
[3] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/migration.py#L158 |
Description
===========
If a volume attached to an instance has been swapped twice in a "round trip" (i.e., volume-update $vol1 $vol2, then volume-update $vol2 $vol1), the instance cannot be live-migrated.
Steps to reproduce
==================
1. Create two iscsi volumes.
# cinder create --name test_vol1 --volume-type iscsi 1
# cinder create --name test_vol2 --volume-type iscsi 1
(--volume-type iscsi isn't mandatory - in my devstack environment there
is no iscsi volume-type, but that doesn't stop me from reproducing this
bug)
2. Boot an instance.
# nova boot --flavor 1 --image $imageid --nic net-id=$netid testvm1
3. Attach one iscsi volume to testvm1.
# nova volume-attach testvm1 $test_vol1
4. Do volume-update to swap volume to 2nd one. (1st time volume-update)
# nova volume-update testvm1 $test_vol1 $test_vol2
5. Do volume-update again to swap volume back to the 1st one. (2nd time volume-update)
# nova volume-update testvm1 $test_vol2 $test_vol1
6. Live migrate instance to other compute node.
# nova live-migration testvm1
Expected result
===============
Live migration succeeds.
Actual result
=============
Live migration fails with:
Apr 27 10:32:14 multi9h-3 nova-compute: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1939, in migrateToURI3
Apr 27 10:32:14 multi9h-3 nova-compute: if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
Apr 27 10:32:14 multi9h-3 nova-compute: libvirtError: missing source information for device vdb
Environment
===========
This was originally reported [1] in Red Hat OSP 9 (Mitaka) and is also reproducible on devstack master.
Additional information
======================
There are two things going on here.
1. When performing the volume-update, the libvirt driver calls virDomainBlockRebase without the VIR_DOMAIN_BLOCK_REBASE_COPY_DEV flag [2], so the device XML changes from <source dev=/dev/iscsi/lun> to <source file=/dev/iscsi/lun>. This is a problem because /dev/iscsi/lun isn't a regular file, and it causes the above error. However, the "round-trip" volume-update is needed to trigger it. Why? Because:
2. The serial number isn't updated when doing volume-update, and there's a bit of live-migration code [3] that checks for serial numbers before updating the XML. If the serial numbers don't match, the XML isn't updated, and libvirt doesn't notice that /dev/iscsi/lun isn't a file.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1446446
[2] http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockRebase
[3] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/migration.py#L158 |
|
2017-05-17 07:53:41 |
Artom Lifshitz |
description |
Description
===========
If a volume attached to an instance has been swapped twice in a "round trip" (i.e., volume-update $vol1 $vol2, then volume-update $vol2 $vol1), the instance cannot be live-migrated.
Steps to reproduce
==================
1. Create two iscsi volumes.
# cinder create --name test_vol1 --volume-type iscsi 1
# cinder create --name test_vol2 --volume-type iscsi 1
(--volume-type iscsi isn't mandatory - in my devstack environment there
is no iscsi volume-type, but that doesn't stop me from reproducing this
bug)
2. Boot an instance.
# nova boot --flavor 1 --image $imageid --nic net-id=$netid testvm1
3. Attach one iscsi volume to testvm1.
# nova volume-attach testvm1 $test_vol1
4. Do volume-update to swap volume to 2nd one. (1st time volume-update)
# nova volume-update testvm1 $test_vol1 $test_vol2
5. Do volume-update again to swap volume back to the 1st one. (2nd time volume-update)
# nova volume-update testvm1 $test_vol2 $test_vol1
6. Live migrate instance to other compute node.
# nova live-migration testvm1
Expected result
===============
Live migration succeeds.
Actual result
=============
Live migration fails with:
Apr 27 10:32:14 multi9h-3 nova-compute: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1939, in migrateToURI3
Apr 27 10:32:14 multi9h-3 nova-compute: if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
Apr 27 10:32:14 multi9h-3 nova-compute: libvirtError: missing source information for device vdb
Environment
===========
This was originally reported [1] in Red Hat OSP 9 (Mitaka) and is also reproducible on devstack master.
Additional information
======================
There are two things going on here.
1. When performing the volume-update, the libvirt driver calls virDomainBlockRebase without the VIR_DOMAIN_BLOCK_REBASE_COPY_DEV flag [2], so the device XML changes from <source dev=/dev/iscsi/lun> to <source file=/dev/iscsi/lun>. This is a problem because /dev/iscsi/lun isn't a regular file, and it causes the above error. However, the "round-trip" volume-update is needed to trigger it. Why? Because:
2. The serial number isn't updated when doing volume-update, and there's a bit of live-migration code [3] that checks for serial numbers before updating the XML. If the serial numbers don't match, the XML isn't updated, and libvirt doesn't notice that /dev/iscsi/lun isn't a file.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1446446
[2] http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockRebase
[3] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/migration.py#L158 |
Description
===========
If a volume attached to an instance has been swapped twice in a "round trip" (i.e., volume-update $vol1 $vol2, then volume-update $vol2 $vol1), the instance cannot be live-migrated.
Steps to reproduce
==================
1. Create two iscsi volumes.
# cinder create --name test_vol1 --volume-type iscsi 1
# cinder create --name test_vol2 --volume-type iscsi 1
(--volume-type iscsi isn't mandatory - in my devstack environment there
is no iscsi volume-type, but that doesn't stop me from reproducing this
bug)
2. Boot an instance.
# nova boot --flavor 1 --image $imageid --nic net-id=$netid testvm1
3. Attach one iscsi volume to testvm1.
# nova volume-attach testvm1 $test_vol1
4. Do volume-update to swap volume to 2nd one. (1st time volume-update)
# nova volume-update testvm1 $test_vol1 $test_vol2
5. Do volume-update again to swap volume back to the 1st one. (2nd time
volume-update)
# nova volume-update testvm1 $test_vol2 $test_vol1
6. Live migrate instance to other compute node.
# nova live-migration testvm1
Expected result
===============
Live migration succeeds.
Actual result
=============
Live migration fails with:
Apr 27 10:32:14 multi9h-3 nova-compute: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1939, in migrateToURI3
Apr 27 10:32:14 multi9h-3 nova-compute: if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
Apr 27 10:32:14 multi9h-3 nova-compute: libvirtError: missing source information for device vdb
Environment
===========
This was originally reported [1] in Red Hat OSP 9 (Mitaka) and is also reproducible on devstack master.
Additional information
======================
There are two things going on here.
1. When performing the volume-update, the libvirt driver calls virDomainBlockRebase without the VIR_DOMAIN_BLOCK_REBASE_COPY_DEV flag [2], so the device XML changes from <source dev=/dev/iscsi/lun> to <source file=/dev/iscsi/lun>. This is a problem because /dev/iscsi/lun isn't a regular file, and it causes the above error. However, the "round-trip" volume-update is needed to trigger it. Why? Because:
2. The serial number isn't updated when doing volume-update, and there's a bit of live-migration code [3] that checks for serial numbers before updating the XML. If the serial numbers don't match, the XML isn't updated, and libvirt doesn't notice that /dev/iscsi/lun isn't a file.
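To illustrate point 2, here is a minimal Python sketch of a serial-keyed lookup. It is not the nova code referenced in [3]: the function name disks_to_rewrite, the volume IDs, and the trimmed-down domain XML are all made up for illustration. It only shows why a disk whose serial no longer matches the attached volume would be skipped, while a round-trip swap makes the serial match again.

```python
import xml.etree.ElementTree as ET

# Illustrative domain XML only: the disk still carries the serial of the
# volume that was originally attached (hypothetical ID "vol1-id").
DOMAIN_XML = """
<domain>
  <devices>
    <disk device="disk">
      <source file="/dev/iscsi/lun"/>
      <target dev="vdb" bus="virtio"/>
      <serial>vol1-id</serial>
    </disk>
  </devices>
</domain>
"""


def disks_to_rewrite(domain_xml, attached_volume_ids):
    """Return target names of disks whose <serial> matches an attached volume.

    Hypothetical helper: disks with a non-matching serial are skipped,
    so their XML would be left untouched.
    """
    root = ET.fromstring(domain_xml)
    matched = []
    for disk in root.findall("./devices/disk"):
        serial = disk.findtext("serial")
        if serial in attached_volume_ids:
            matched.append(disk.find("target").get("dev"))
    return matched


# After the first volume-update the attachment is vol2, but the serial
# in the XML is still vol1, so nothing matches.
after_first_swap = disks_to_rewrite(DOMAIN_XML, {"vol2-id"})

# After swapping back, the attachment is vol1 again and the serial matches,
# so vdb would be selected for an XML update.
after_round_trip = disks_to_rewrite(DOMAIN_XML, {"vol1-id"})

print(after_first_swap)
print(after_round_trip)
```

Under these assumptions, only the round-trip case selects vdb, which is consistent with the error above showing up only after the second volume-update.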
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1446446
[2] http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockRebase
[3] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/migration.py#L158 |
|
2017-06-05 23:40:46 |
Matt Riedemann |
nominated for series |
|
nova/newton |
|
2017-06-05 23:40:46 |
Matt Riedemann |
bug task added |
|
nova/newton |
|
2017-06-05 23:40:46 |
Matt Riedemann |
nominated for series |
|
nova/ocata |
|
2017-06-05 23:40:46 |
Matt Riedemann |
bug task added |
|
nova/ocata |
|
2017-06-05 23:40:51 |
Matt Riedemann |
nova: importance |
Undecided |
High |
|
2017-06-05 23:40:53 |
Matt Riedemann |
nova/newton: importance |
Undecided |
High |
|
2017-06-05 23:40:56 |
Matt Riedemann |
nova/ocata: importance |
Undecided |
High |
|
2017-06-05 23:40:58 |
Matt Riedemann |
nova/newton: status |
New |
Confirmed |
|
2017-06-05 23:41:00 |
Matt Riedemann |
nova/ocata: status |
New |
Confirmed |
|
2017-06-05 23:43:02 |
Matt Riedemann |
tags |
|
libvirt live-migration volumes |
|
2017-06-06 13:44:33 |
OpenStack Infra |
nova/ocata: status |
Confirmed |
In Progress |
|
2017-06-06 13:44:33 |
OpenStack Infra |
nova/ocata: assignee |
|
Artom Lifshitz (notartom) |
|
2017-06-06 13:47:30 |
OpenStack Infra |
nova/newton: status |
Confirmed |
In Progress |
|
2017-06-06 13:47:30 |
OpenStack Infra |
nova/newton: assignee |
|
Artom Lifshitz (notartom) |
|
2017-06-06 15:12:18 |
OpenStack Infra |
nova/ocata: assignee |
Artom Lifshitz (notartom) |
Matt Riedemann (mriedem) |
|
2017-06-06 18:55:44 |
OpenStack Infra |
nova: status |
In Progress |
Fix Released |
|
2017-06-08 01:00:16 |
OpenStack Infra |
nova/ocata: assignee |
Matt Riedemann (mriedem) |
Artom Lifshitz (notartom) |
|
2017-06-10 17:44:51 |
OpenStack Infra |
nova/ocata: status |
In Progress |
Fix Committed |
|
2017-08-03 12:50:11 |
OpenStack Infra |
nova/newton: assignee |
Artom Lifshitz (notartom) |
Lee Yarwood (lyarwood) |
|
2017-08-30 15:31:29 |
OpenStack Infra |
nova/newton: assignee |
Lee Yarwood (lyarwood) |
Artom Lifshitz (notartom) |
|
2017-10-19 06:24:44 |
OpenStack Infra |
nova/newton: status |
In Progress |
Fix Committed |
|