I think this is good to go and does not need any additional testing, assuming you have applied the changes from the stable-10.3.5-quiesced-snapshot branch. All these changes are also in the 10.3.10 release, which has already gone through our testing. I also tested your packages earlier with the same changes.
Thanks,
Oliver
________________________________
From: <email address hidden> <email address hidden> on behalf of Christian Ehrhardt <email address hidden>
Sent: Tuesday, April 9, 2019 11:10 PM
To: Oliver Kurth
Subject: [Bug 1814832] Re: Correct and/or improve handling of certain quiesced snapshot failures
@Oliver - any update on this testing?
That is the only thing still missing to release this.
Title:
Correct and/or improve handling of certain quiesced snapshot failures
Status in open-vm-tools package in Ubuntu:
Fix Released
Status in open-vm-tools source package in Bionic:
Fix Committed
Status in open-vm-tools source package in Cosmic:
Fix Committed
Status in open-vm-tools package in Debian:
Fix Released
Bug description:
[Impact]
* Upstream identified an issue that can occur when a quiesced snapshot is
aborted, or when communication issues arise while one is in progress.
* Backport the upstream changes as part of our work getting the latest
10.3.5 to the latest Ubuntu LTS (Bionic)
[Test Case]
* This is hard to test, but fortunately VMware, who have the right setup
for this, tested our change from a PPA. I'll ask for that again on SRU.
Nevertheless, I'll outline roughly what is needed to trigger [1]:
1. Use the host side interface to trigger a quiesced snapshot
2. This is the hard part: have communication failures between vmtools
(guest) and VMX (host) while this is ongoing.
3. From the host's POV the operation is aborted, but vmtools eventually
sends a manifest
4. Receiving this makes VMX reply with an error (as it was not waiting for
anything like it)
5. Finally, this breaks the state machine, and in subsequent cases vmtools
will not send a manifest again
* Further related fixes make sure vmtoolsd gives up if VMX aborted the
snapshot [2], and another [3] makes sure manifests are always sent to
avoid any desync between VMX and vmtoolsd
[1]: https://github.com/vmware/open-vm-tools/commit/a1306fcbb6de6eae5344d5d74747068ea89aa5fc
[2]: https://github.com/vmware/open-vm-tools/commit/0c9174716ba828899418ba07efc3aab0bff004cc
[3]: https://github.com/vmware/open-vm-tools/commit/c31710b3942f48b1c11ebde36f34e7e159d1cbf0
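As a rough illustration of the trigger sequence in the test case (this is not the actual open-vm-tools code), the desync can be modelled with a toy guest/host pair. Every name here (`Host`, `Guest`, `manifest_enabled`, the `fixed` switch) is hypothetical and exists only for this sketch. The assumption that a single error reply permanently disables manifests is a simplification of "broke the state machine", and the `fixed` switch only models "the error no longer sticks", not how the real commits achieve that.

```python
class Host:
    """Toy stand-in for VMX: accepts a manifest only while a snapshot is active."""

    def __init__(self):
        self.snapshot_active = False

    def start_snapshot(self):
        self.snapshot_active = True

    def abort_snapshot(self):
        # For example, the host gave up after a communication failure.
        self.snapshot_active = False

    def receive_manifest(self):
        # True means the manifest was accepted, False models an error reply.
        return self.snapshot_active


class Guest:
    """Toy stand-in for the guest-side snapshot handling in vmtoolsd."""

    def __init__(self, host, fixed):
        self.host = host
        self.fixed = fixed            # hypothetical switch: model the fixed behaviour
        self.manifest_enabled = True  # cleared once the buggy path latches an error

    def finish_snapshot(self):
        """Try to send the manifest; return True if the host accepted it."""
        if not self.manifest_enabled:
            return False  # buggy state: no manifest is sent any more
        ok = self.host.receive_manifest()
        if not ok and not self.fixed:
            # Buggy model: one error reply permanently disables manifests.
            self.manifest_enabled = False
        return ok


def run(fixed):
    host = Host()
    guest = Guest(host, fixed)
    # Steps 1-4: snapshot starts, host aborts, guest still sends a manifest.
    host.start_snapshot()
    host.abort_snapshot()
    first = guest.finish_snapshot()
    # Step 5: a later snapshot with no communication problems.
    host.start_snapshot()
    second = guest.finish_snapshot()
    return first, second
```

In this model `run(False)` reports a failed manifest for both snapshots, while `run(True)` fails only for the aborted one and succeeds on the next.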
[Regression Potential]
* This is quite a change to the snapshot handling, so in theory a
regression has to be assumed possible. Due to a lack of test cases and
expertise on our side, testing was handed to VMware itself, who have a
much wider matrix of tests and setups to run them on.
This was tested and confirmed good (even before the change made
it upstream).
* Furthermore, these kinds of snapshots are only relevant to those
who use them (and they most likely want the fix for reliability, as you
could get into a state where no further snapshots were possible). But
OTOH the majority of users of the open-vm-tools package most likely
don't use the feature at all. Fortunately, the changes are local to the
vmbackup functionality only.
[Other Info]
* n/a
---
Customers may hit issues with quiesced snapshots under certain
circumstances. This is fixed in a branch forked from 10.3.5:
https://github.com/vmware/open-vm-tools/tree/stable-10.3.5-quiesced-snapshot
A more detailed description of the issue can be found in the
individual commit messages.
Also filed at Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=921470