2016-11-10 04:33:03 |
Hua Zhang |
bug |
|
|
added bug |
2016-11-10 07:43:13 |
Dominique Poulain |
bug |
|
|
added subscriber Dominique Poulain |
2016-11-10 07:56:14 |
Hua Zhang |
summary |
Nova live-migration corrupts some instances |
libvirt live-migration corrupts some instances |
|
2016-11-11 08:45:26 |
Christian Ehrhardt |
bug |
|
|
added subscriber Ubuntu Server Team |
2016-11-11 08:51:03 |
Christian Ehrhardt |
bug |
|
|
added subscriber ChristianEhrhardt |
2016-11-15 13:16:53 |
Hua Zhang |
summary |
libvirt live-migration corrupts some instances |
libvirt 1.2.12 live-migration corrupts some instances |
|
2016-11-15 13:18:11 |
Hua Zhang |
attachment added |
|
trusty_libvirt_migration_image_corruption.debdiff https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1640676/+attachment/4777701/+files/trusty_libvirt_migration_image_corruption.debdiff |
|
2016-11-15 13:18:36 |
Hua Zhang |
description |
We can replicate the corruption pretty much at will. The sequence of events to trigger it is:
Create an instance using a cloud image
Start a job running with the following command: "dd if=/dev/urandom of=/var/tmp/mjb.1 bs=4M count=1000"
Live migrate the instance using a command like: "nova live-migration --block-migrate <server-id> <target-hypervisor>"
Once the migration has finished, stop the dd job on the instance
Do a "Hard reboot" of the instance
When the instance boots, file system corruption will be observed and it won't boot correctly |
[Impact]
While memory load is high, libvirt 1.2.12 (kilo) live-migration corrupts some instances
[Test Case]
We can replicate the corruption pretty much at will. The sequence of events to trigger it is:
Create an instance using a cloud image
Start a job running with the following command: "dd if=/dev/urandom of=/var/tmp/mjb.1 bs=4M count=1000"
Live migrate the instance using a command like: "nova live-migration --block-migrate <server-id> <target-hypervisor>"
Once the migration has finished, stop the dd job on the instance
Do a "Hard reboot" of the instance (e.g. for OpenStack: nova reboot --hard $INSTANCE)
When the instance boots, file system corruption will be observed and it won't boot correctly
[Regression Potential]
[Other Info]
Both libvirt 1.2.16 (kilo) and libvirt 1.2.13 have already fixed this problem.
The fix is backported from upstream patches. Before commit 80c5f10e, libvirt only polled for the events it was interested in, which could leave a drive mirror that cannot be cancelled and a destination that is not in a consistent state; in that case it is not safe to continue with the migration. Commit 80c5f10e fixes the problem by listening for queued events instead of polling.
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=80c5f10e865cda0302519492f197cb020bd14a07
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=76c61cdca20c106960af033e5d0f5da70177af0f
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=c37943a0687a8fdb08e6eda8ae4b9f4f43f4f2ed
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=c88b323bf5d5a070c074fda7adc11085f14415ce
BTW, we have completed 20 to 30 live migrations with I/O running and have had no problems, and also tested that other functions continue to work as expected. |
|
2016-11-15 13:18:45 |
Hua Zhang |
libvirt (Ubuntu): assignee |
|
Hua Zhang (zhhuabj) |
|
2016-11-15 13:22:46 |
Hua Zhang |
description |
[Impact]
While memory load is high, libvirt 1.2.12 (kilo) live-migration corrupts some instances
[Test Case]
We can replicate the corruption pretty much at will. The sequence of events to trigger it is:
Create an instance using a cloud image
Start a job running with the following command: "dd if=/dev/urandom of=/var/tmp/mjb.1 bs=4M count=1000"
Live migrate the instance using a command like: "nova live-migration --block-migrate <server-id> <target-hypervisor>"
Once the migration has finished, stop the dd job on the instance
Do a "Hard reboot" of the instance (e.g. for OpenStack: nova reboot --hard $INSTANCE)
When the instance boots, file system corruption will be observed and it won't boot correctly
[Regression Potential]
[Other Info]
Both libvirt 1.2.16 (kilo) and libvirt 1.2.13 have already fixed this problem.
The fix is backported from upstream patches. Before commit 80c5f10e, libvirt only polled for the events it was interested in, which could leave a drive mirror that cannot be cancelled and a destination that is not in a consistent state; in that case it is not safe to continue with the migration. Commit 80c5f10e fixes the problem by listening for queued events instead of polling.
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=80c5f10e865cda0302519492f197cb020bd14a07
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=76c61cdca20c106960af033e5d0f5da70177af0f
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=c37943a0687a8fdb08e6eda8ae4b9f4f43f4f2ed
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=c88b323bf5d5a070c074fda7adc11085f14415ce
BTW, we have completed 20 to 30 live migrations with I/O running and have had no problems, and also tested that other functions continue to work as expected. |
[Impact]
While memory load is high, libvirt 1.2.12 (kilo) live-migration corrupts some instances
[Test Case]
We can replicate the corruption pretty much at will. The sequence of events to trigger it is:
Create an instance using a cloud image
Start a job running with the following command: "dd if=/dev/urandom of=/var/tmp/mjb.1 bs=4M count=1000"
Live migrate the instance using a command like: "nova live-migration --block-migrate <server-id> <target-hypervisor>"
Once the migration has finished, stop the dd job on the instance
Do a "Hard reboot" of the instance (e.g. for OpenStack: nova reboot --hard $INSTANCE)
When the instance boots, file system corruption will be observed and it won't boot correctly
[Regression Potential]
[Other Info]
Both libvirt 1.2.16 (kilo) and libvirt 1.2.13 have already fixed this problem. So this problem only happens on trusty.
The fix is backported from upstream patches. Before commit 80c5f10e, libvirt only polled for the events it was interested in, which could leave a drive mirror that cannot be cancelled and a destination that is not in a consistent state; in that case it is not safe to continue with the migration. Commit 80c5f10e fixes the problem by listening for queued events instead of polling.
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=80c5f10e865cda0302519492f197cb020bd14a07
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=76c61cdca20c106960af033e5d0f5da70177af0f
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=c37943a0687a8fdb08e6eda8ae4b9f4f43f4f2ed
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=c88b323bf5d5a070c074fda7adc11085f14415ce
BTW, we have completed 20 to 30 live migrations with I/O running and have had no problems, and also tested that other functions continue to work as expected. |
|
2016-11-15 13:34:26 |
Louis Bouchard |
nominated for series |
|
Ubuntu Trusty |
|
2016-11-15 13:34:26 |
Louis Bouchard |
bug task added |
|
libvirt (Ubuntu Trusty) |
|
2016-11-15 13:56:47 |
Christian Ehrhardt |
libvirt (Ubuntu Trusty): status |
New |
Triaged |
|
2016-11-15 13:56:50 |
Christian Ehrhardt |
libvirt (Ubuntu): status |
New |
Fix Released |
|
2016-11-15 13:56:53 |
Christian Ehrhardt |
libvirt (Ubuntu Trusty): importance |
Undecided |
High |
|
2016-11-15 15:01:16 |
Hua Zhang |
bug task added |
|
cloud-archive |
|
2016-11-15 15:05:33 |
Edward Hope-Morley |
nominated for series |
|
cloud-archive/kilo |
|
2016-11-15 15:07:34 |
Hua Zhang |
attachment removed |
trusty_libvirt_migration_image_corruption.debdiff https://bugs.launchpad.net/cloud-archive/+bug/1640676/+attachment/4777701/+files/trusty_libvirt_migration_image_corruption.debdiff |
|
|
2016-11-15 15:09:11 |
Hua Zhang |
attachment added |
|
trusty_libvirt_migration_image_corruption.debdiff https://bugs.launchpad.net/cloud-archive/+bug/1640676/+attachment/4777731/+files/trusty_libvirt_migration_image_corruption.debdiff |
|
2016-11-15 15:20:19 |
Hua Zhang |
description |
[Impact]
While memory load is high, libvirt 1.2.12 (kilo) live-migration corrupts some instances
[Test Case]
We can replicate the corruption pretty much at will. The sequence of events to trigger it is:
Create an instance using a cloud image
Start a job running with the following command: "dd if=/dev/urandom of=/var/tmp/mjb.1 bs=4M count=1000"
Live migrate the instance using a command like: "nova live-migration --block-migrate <server-id> <target-hypervisor>"
Once the migration has finished, stop the dd job on the instance
Do a "Hard reboot" of the instance (e.g. for OpenStack: nova reboot --hard $INSTANCE)
When the instance boots, file system corruption will be observed and it won't boot correctly
[Regression Potential]
[Other Info]
Both libvirt 1.2.16 (kilo) and libvirt 1.2.13 have already fixed this problem. So this problem only happens on trusty.
The fix is backported from upstream patches. Before commit 80c5f10e, libvirt only polled for the events it was interested in, which could leave a drive mirror that cannot be cancelled and a destination that is not in a consistent state; in that case it is not safe to continue with the migration. Commit 80c5f10e fixes the problem by listening for queued events instead of polling.
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=80c5f10e865cda0302519492f197cb020bd14a07
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=76c61cdca20c106960af033e5d0f5da70177af0f
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=c37943a0687a8fdb08e6eda8ae4b9f4f43f4f2ed
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=c88b323bf5d5a070c074fda7adc11085f14415ce
BTW, we have completed 20 to 30 live migrations with I/O running and have had no problems, and also tested that other functions continue to work as expected. |
[Impact]
While memory load is high, libvirt 1.2.12 (kilo) live-migration corrupts some instances
[Test Case]
We can replicate the corruption pretty much at will. The sequence of events to trigger it is:
Create an instance using a cloud image
Start a job running with the following command: "dd if=/dev/urandom of=/var/tmp/mjb.1 bs=4M count=1000"
Live migrate the instance using a command like: "nova live-migration --block-migrate <server-id> <target-hypervisor>"
Once the migration has finished, stop the dd job on the instance
Do a "Hard reboot" of the instance (e.g. for OpenStack: nova reboot --hard $INSTANCE)
When the instance boots, file system corruption will be observed and it won't boot correctly
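The steps above can be collected into a small dry-run helper. This is only a sketch: it prints the commands quoted in the test case rather than running them, and SERVER_ID and TARGET_HOST are placeholder variables assumed for illustration, not values taken from this bug.

```shell
#!/bin/sh
# Dry-run sketch of the [Test Case] steps. Nothing here contacts a cloud.
# SERVER_ID and TARGET_HOST are hypothetical placeholders (assumptions).
SERVER_ID="${SERVER_ID:-<server-id>}"
TARGET_HOST="${TARGET_HOST:-<target-hypervisor>}"

# Command to run inside the guest to generate disk writes (from the report).
guest_io_cmd() {
    echo 'dd if=/dev/urandom of=/var/tmp/mjb.1 bs=4M count=1000'
}

# Block live-migration command, as quoted in the report.
migrate_cmd() {
    echo "nova live-migration --block-migrate $1 $2"
}

# Hard reboot command, as quoted in the report.
reboot_cmd() {
    echo "nova reboot --hard $1"
}

# Print the plan in order; the operator runs each step by hand.
print_plan() {
    echo "1. In the guest:  $(guest_io_cmd)"
    echo "2. On the client: $(migrate_cmd "$SERVER_ID" "$TARGET_HOST")"
    echo "3. In the guest:  stop the dd job once the migration has finished"
    echo "4. On the client: $(reboot_cmd "$SERVER_ID")"
    echo "5. Check whether the guest boots with a clean file system"
}

print_plan
```

Setting SERVER_ID and TARGET_HOST in the environment before running the script fills in the printed commands; keeping it dry-run avoids migrating a real instance by accident.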
[Regression Potential]
[Other Info]
Both libvirt 1.2.16 (liberty) and libvirt 1.2.13 have already fixed this problem. So this problem only happens on kilo.
The fix is backported from upstream patches. Before commit 80c5f10e, libvirt only polled for the events it was interested in, which could leave a drive mirror that cannot be cancelled and a destination that is not in a consistent state; in that case it is not safe to continue with the migration. Commit 80c5f10e fixes the problem by listening for queued events instead of polling.
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=80c5f10e865cda0302519492f197cb020bd14a07
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=76c61cdca20c106960af033e5d0f5da70177af0f
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=c37943a0687a8fdb08e6eda8ae4b9f4f43f4f2ed
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=c88b323bf5d5a070c074fda7adc11085f14415ce
BTW, we have completed 20 to 30 live migrations with I/O running and have had no problems, and also tested that other functions continue to work as expected. |
|
2016-11-15 15:30:29 |
Corey Bryant |
bug task added |
|
cloud-archive/kilo |
|
2016-11-15 15:30:46 |
Corey Bryant |
cloud-archive/kilo: status |
New |
Triaged |
|
2016-11-15 15:30:48 |
Corey Bryant |
cloud-archive/kilo: importance |
Undecided |
High |
|
2016-11-15 15:34:12 |
Hua Zhang |
tags |
|
sts-sru |
|
2016-11-17 09:30:10 |
Hua Zhang |
summary |
libvirt 1.2.12 live-migration corrupts some instances |
[SRU] libvirt 1.2.12 live-migration corrupts some instances |
|
2016-11-17 09:30:28 |
Hua Zhang |
libvirt (Ubuntu): assignee |
Hua Zhang (zhhuabj) |
|
|
2016-11-17 09:31:12 |
Hua Zhang |
cloud-archive/kilo: assignee |
|
Hua Zhang (zhhuabj) |
|
2016-11-17 09:34:28 |
Hua Zhang |
bug |
|
|
added subscriber Ubuntu Sponsors Team |
2016-11-17 15:06:33 |
Christian Ehrhardt |
libvirt (Ubuntu Trusty): status |
Triaged |
Incomplete |
|
2016-11-21 04:11:22 |
Mathew Hodson |
libvirt (Ubuntu): importance |
Undecided |
High |
|
2016-11-23 15:16:56 |
Corey Bryant |
cloud-archive: status |
New |
Invalid |
|
2016-11-23 16:40:43 |
Ryan Beisner |
cloud-archive/kilo: status |
Triaged |
Fix Committed |
|
2016-11-23 16:40:45 |
Ryan Beisner |
tags |
sts-sru |
sts-sru verification-kilo-needed |
|
2016-11-28 02:16:26 |
Hua Zhang |
tags |
sts-sru verification-kilo-needed |
sts-sru verification-kilo-done |
|
2016-12-07 16:22:17 |
James Page |
cloud-archive/kilo: status |
Fix Committed |
Fix Released |
|
2017-01-11 16:01:17 |
Sebastien Bacher |
removed subscriber Ubuntu Sponsors Team |
|
|
|
2017-01-18 09:22:57 |
Christian Ehrhardt |
bug task deleted |
libvirt (Ubuntu Trusty) |
|
|
2017-03-22 15:41:26 |
Louis Bouchard |
tags |
sts-sru verification-kilo-done |
sts-sru-done verification-kilo-done |
|