Xen guest time corrupted after live migration in linux-image-4.15.0-50-generic and linux-image-generic-hwe-18.04

Bug #1830440 reported by Chris Brannon
Affects         Status        Importance  Assigned to    Milestone
linux (Ubuntu)  Fix Released  Undecided   Chris Brannon

Bug Description

Effects observed after migration include:
* Stalled SSH connections.
* A crashed or unresponsive virtual machine.
* Inability to properly shut down the virtual machine.
* Incorrect timestamps in dmesg output.

Only the incorrect timestamps are guaranteed to appear after every migration. The other effects do not occur every time, but they are reproducible.
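As a rough aid for spotting the timestamp symptom, the sketch below scans dmesg-style lines for timestamps that run backwards. It assumes the default `[  seconds.micros]` prefix; the function names and the `tolerance` value are illustrative, not part of any existing tool. A backward jump is only one way the corruption could show up: a forward jump cannot be told apart from a quiet period by this check alone.

```python
import re

# Matches the default dmesg prefix, e.g. "[  123.456789] message text".
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def parse_timestamps(lines):
    """Return (line_number, seconds) pairs for lines that carry a timestamp."""
    stamps = []
    for number, line in enumerate(lines, start=1):
        match = TS_RE.match(line)
        if match:
            stamps.append((number, float(match.group(1))))
    return stamps


def find_backward_jumps(lines, tolerance=0.001):
    """Return (line_number, previous, current) wherever time runs backwards.

    `tolerance` (seconds) is an assumed allowance for tiny reordering
    between consecutive log lines.
    """
    stamps = parse_timestamps(lines)
    jumps = []
    for (_, previous), (number, current) in zip(stamps, stamps[1:]):
        if current < previous - tolerance:
            jumps.append((number, previous, current))
    return jumps
```

Feeding it the lines of saved `dmesg` output from a migrated guest would list each place where the kernel log clock stepped back.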

This issue was fixed in version 5.0.0 of the Linux kernel, and the fix was backported to older supported kernels such as 4.14 and 4.19.
Please cherry-pick the following commit from the master branch of the Linux kernel's git repo:
867cefb4cb1012f42cada1c7d1f35ac8dd276071
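To check whether a given kernel git checkout already carries this fix, something like the following sketch could be used. It assumes a local clone and `git` on the PATH; the helper names are hypothetical. It first asks git whether the commit is an ancestor of the revision, then falls back to searching commit messages for the upstream hash, on the assumption that a backport quotes the original hash in its message (as `git cherry-pick -x` does). That convention is not guaranteed for every tree.

```python
import subprocess

# Upstream commit named in this report.
FIX_COMMIT = "867cefb4cb1012f42cada1c7d1f35ac8dd276071"


def ancestor_check_cmd(commit, rev="HEAD"):
    """git exits 0 when `commit` is an ancestor of `rev`."""
    return ["git", "merge-base", "--is-ancestor", commit, rev]


def backport_search_cmd(commit, rev="HEAD"):
    """Search commit messages reachable from `rev` for the upstream hash."""
    return ["git", "log", "--oneline", "--grep=" + commit, rev]


def has_fix(repo_dir, commit=FIX_COMMIT, rev="HEAD"):
    """Best-effort check; a False result does not prove the fix is absent."""
    direct = subprocess.run(
        ancestor_check_cmd(commit, rev), cwd=repo_dir, capture_output=True
    )
    if direct.returncode == 0:
        return True
    search = subprocess.run(
        backport_search_cmd(commit, rev),
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    return search.returncode == 0 and bool(search.stdout.strip())
```

Calling `has_fix("/path/to/kernel/checkout")` (the path is a placeholder) returns `True` when either check finds the commit.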

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1830440

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Chris Brannon (cmbprgmr)
summary: - Xen guest time handling is broken across migration in linux-
+ Xen guest time corrupted after live migration in linux-
image-4.15.0-50-generic and linux-image-generic-hwe-18.04
Chris Brannon (cmbprgmr) wrote :
description: updated
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Hans van Kranenburg (knorrie) wrote :

Hi, I can confirm that this report makes sense. I was the person reporting it upstream and doing some debugging/bisecting to find the cause.

https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg02403.html

The bug lets the uptime of the two dom0s involved in the live migration (source and destination) affect the uptime counter of the domU, as seen in dmesg output. That breaks TCP timestamps and causes network connections to stall.

I'd advise the reporter to test this kernel with the patch applied in a reproduction scenario: one where live migration hits the bug every time without the patch, and no longer hits it with the patch.
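One way to score such a reproduction run is sketched below. It compares how far the kernel log timestamp advanced across a migration with how much wall-clock time passed. It assumes the tester records a dmesg timestamp just before and just after the migration and measures the elapsed wall time separately. The function names and the 5-second tolerance are illustrative choices, not values from this report.

```python
def clock_skew(before_ts, after_ts, wall_elapsed):
    """Seconds by which the dmesg clock advance differs from wall time.

    All three arguments are in seconds. A value near zero means the
    kernel log clock kept pace with wall-clock time across the migration.
    """
    return (after_ts - before_ts) - wall_elapsed


def migration_looks_clean(before_ts, after_ts, wall_elapsed, tolerance=5.0):
    """True when the skew stays within the assumed tolerance (seconds)."""
    return abs(clock_skew(before_ts, after_ts, wall_elapsed)) <= tolerance
```

With the patched kernel the check would be expected to pass on every migration. With the unpatched kernel it should fail whenever the guest's log clock is disturbed by the migration.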

Thanks, Hans

Chris Brannon (cmbprgmr) wrote :

This has been fixed as part of https://bugs.launchpad.net/bugs/1837664.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
assignee: nobody → Chris Brannon (cmbprgmr)