Activity log for bug #1691195

Date Who What changed Old value New value Message
2017-05-16 17:09:08 Artom Lifshitz bug added bug
2017-05-17 00:34:04 OpenStack Infra nova: status New In Progress
2017-05-17 00:34:04 OpenStack Infra nova: assignee Artom Lifshitz (notartom)
2017-05-17 07:52:04 Artom Lifshitz description
Old value: same text as the new value below, without the indentation of the command lines.
New value:

Description
===========
If an instance has had an attached volume volume-updated twice in a "round-trip" - i.e., volume-update $vol1 $vol2, then volume-update $vol2 $vol1 - it cannot be live-migrated.

Steps to reproduce
==================
1. Create two iscsi volumes.
   # cinder create --name test_vol1 --volume-type iscsi 1
   # cinder create --name test_vol2 --volume-type iscsi 1
   (--volume-type iscsi isn't mandatory - in my devstack environment there is no iscsi volume-type, but that doesn't stop me from reproducing this bug)
2. Boot an instance.
   # nova boot --flavor 1 --image $imageid --nic net-id=$netid testvm1
3. Attach one iscsi volume to testvm1.
   # nova volume-attach testvm1 $test_vol1
4. Do volume-update to swap volume to 2nd one. (1st time volume-update)
   # nova volume-update testvm1 $test_vol1 $test_vol2
5. Do volume-update again to swap volume back to the 1st one. (2nd time volume-update)
   # nova volume-update testvm1 $test_vol2 $test_vol1
6. Live migrate instance to other compute node.
   # nova live-migration testvm1

Expected result
===============
Live migration succeeds.

Actual result
=============
Live migration fails with:

Apr 27 10:32:14 multi9h-3 nova-compute: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1939, in migrateToURI3
Apr 27 10:32:14 multi9h-3 nova-compute: if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
Apr 27 10:32:14 multi9h-3 nova-compute: libvirtError: missing source information for device vdb

Environment
===========
This was originally reported [1] in Red Hat OSP 9 (Mitaka) and is reproducible on devstack master as well.

Additional information
======================
There are two things going on here.

1. When performing the volume-update, the libvirt driver calls virDomainBlockRebase without the VIR_DOMAIN_BLOCK_REBASE_COPY_DEV flag [2], meaning the device XML changes from <source dev=/dev/iscsi/lun> to <source file=/dev/iscsi/lun>. This is a problem because /dev/iscsi/lun isn't a regular file, and causes the above error, except you need the "round-trip" volume-update to trigger it. Why? Because:

2. The serial number isn't updated when doing volume-update, and there's a bit of live-migration code [3] that checks for serial numbers before updating the XML. If the serial numbers don't match, the XML isn't updated, and libvirt doesn't notice that /dev/iscsi/lun isn't a file.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1446446
[2] http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockRebase
[3] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/migration.py#L158
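
Point 2 of the description can be illustrated with a small sketch. This is not nova's code: the function name disks_to_update, the sample XML and the volume IDs are all made up for illustration. It only assumes what the description states, namely that a disk's XML is rewritten for migration only when its <serial> matches a currently attached volume. Since volume-update leaves the serial untouched, the serial is stale after one swap (no match, XML left alone) and matches again after the round trip.

```python
import xml.etree.ElementTree as ET

# Made-up guest XML: vdb still carries the serial of the first volume,
# because volume-update does not change the serial.
DOMAIN_XML = """
<domain>
  <devices>
    <disk type="file" device="disk">
      <source file="/dev/iscsi/lun"/>
      <target dev="vdb"/>
      <serial>vol1-id</serial>
    </disk>
  </devices>
</domain>
""".strip()


def disks_to_update(domain_xml, attached_volume_ids):
    """Return target names of disks whose serial matches an attached volume.

    Hypothetical helper: it models only the serial check described above,
    not the XML rewrite that follows it.
    """
    root = ET.fromstring(domain_xml)
    selected = []
    for disk in root.findall("./devices/disk"):
        serial = disk.findtext("serial")
        target = disk.find("target")
        if target is not None and serial in attached_volume_ids:
            selected.append(target.get("dev"))
    return selected


# After one swap the attached volume is vol2, but the serial is still vol1.
after_one_swap = disks_to_update(DOMAIN_XML, {"vol2-id"})
# After the round trip the attached volume is vol1 again, so the serial matches.
after_round_trip = disks_to_update(DOMAIN_XML, {"vol1-id"})
```

Under these assumptions, only the round-trip case selects vdb for an update, which is consistent with the failure appearing only after the second volume-update.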
2017-05-17 07:53:41 Artom Lifshitz description
Old value: the new value of the 2017-05-17 07:52:04 entry.
New value: the same text, with the parenthetical note in step 1 wrapped across three indented lines:

   (--volume-type iscsi isn't mandatory - in my devstack environment there
   is no iscsi volume-type, but that doesn't stop me from reproducing this
   bug)
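
Point 1 of the description says that after a volume-update the disk ends up with a <source file=...> element pointing at a block device path. A rough way to spot such disks in a guest's XML might look like the sketch below. The helper name file_sources_under_dev, the sample XML and the /var/lib/example path are made up for illustration, and treating every path under /dev/ as a block device is a heuristic, not something libvirt or nova does.

```python
import xml.etree.ElementTree as ET

# Made-up guest XML: vda is a regular image file, vdb shows the symptom
# from the description (a file-type source pointing at a device node).
SAMPLE_XML = """
<domain>
  <devices>
    <disk type="file" device="disk">
      <source file="/var/lib/example/disk"/>
      <target dev="vda"/>
    </disk>
    <disk type="file" device="disk">
      <source file="/dev/iscsi/lun"/>
      <target dev="vdb"/>
    </disk>
  </devices>
</domain>
""".strip()


def file_sources_under_dev(domain_xml):
    """List (target, path) for disks whose <source file=...> is under /dev/.

    Heuristic only: a path under /dev/ is assumed to be a block device.
    """
    root = ET.fromstring(domain_xml)
    hits = []
    for disk in root.findall("./devices/disk"):
        source = disk.find("source")
        target = disk.find("target")
        if source is None or target is None:
            continue
        path = source.get("file")
        if path is not None and path.startswith("/dev/"):
            hits.append((target.get("dev"), path))
    return hits


suspicious = file_sources_under_dev(SAMPLE_XML)
```

On a real host the XML would come from the running guest (for example via virsh dumpxml) rather than from a constant.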
2017-06-05 23:40:46 Matt Riedemann nominated for series nova/newton
2017-06-05 23:40:46 Matt Riedemann bug task added nova/newton
2017-06-05 23:40:46 Matt Riedemann nominated for series nova/ocata
2017-06-05 23:40:46 Matt Riedemann bug task added nova/ocata
2017-06-05 23:40:51 Matt Riedemann nova: importance Undecided High
2017-06-05 23:40:53 Matt Riedemann nova/newton: importance Undecided High
2017-06-05 23:40:56 Matt Riedemann nova/ocata: importance Undecided High
2017-06-05 23:40:58 Matt Riedemann nova/newton: status New Confirmed
2017-06-05 23:41:00 Matt Riedemann nova/ocata: status New Confirmed
2017-06-05 23:43:02 Matt Riedemann tags libvirt live-migration volumes
2017-06-06 13:44:33 OpenStack Infra nova/ocata: status Confirmed In Progress
2017-06-06 13:44:33 OpenStack Infra nova/ocata: assignee Artom Lifshitz (notartom)
2017-06-06 13:47:30 OpenStack Infra nova/newton: status Confirmed In Progress
2017-06-06 13:47:30 OpenStack Infra nova/newton: assignee Artom Lifshitz (notartom)
2017-06-06 15:12:18 OpenStack Infra nova/ocata: assignee Artom Lifshitz (notartom) Matt Riedemann (mriedem)
2017-06-06 18:55:44 OpenStack Infra nova: status In Progress Fix Released
2017-06-08 01:00:16 OpenStack Infra nova/ocata: assignee Matt Riedemann (mriedem) Artom Lifshitz (notartom)
2017-06-10 17:44:51 OpenStack Infra nova/ocata: status In Progress Fix Committed
2017-08-03 12:50:11 OpenStack Infra nova/newton: assignee Artom Lifshitz (notartom) Lee Yarwood (lyarwood)
2017-08-30 15:31:29 OpenStack Infra nova/newton: assignee Lee Yarwood (lyarwood) Artom Lifshitz (notartom)
2017-10-19 06:24:44 OpenStack Infra nova/newton: status In Progress Fix Committed