Comment 9 for bug 1494350

lickdragon (csights) wrote : Re: [Bug 1494350] Re: QEMU: causes vCPU steal time overflow on live migration

> Hi lickdragon,
> That's because the fix turned out to be in the kernel's KVM code; I can
> see it in the 4.4-rc1 upstream kernel.

Thanks! I found it.
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7cae2bedcbd4680b155999655e49c27b9cf020fa

> > Now, what's really interesting is that current->sched_info.run_delay
> > gets reset because the tasks (threads) using the vCPUs change, and
> > thus have a different current->sched_info: it looks like task 18446
> > created the two vCPUs, and then they were handed over to 18448 and
> > 18449 respectively. This is also verified by the fact that during the
> > overflow, both vCPUs have the old steal time of the last vcpu_load of
> > task 18446. However, according to Documentation/virtual/kvm/api.txt:
>
> The above is not entirely accurate: the vCPUs were created by the
> threads that are used to run them (18448 and 18449 respectively), it's
> just that the main thread is issuing ioctls during initialization, as
> illustrated by the strace output on a different process:
>
> [ vCPU #0 thread creating vCPU #0 (fd 20) ]
> [pid 1861] ioctl(14, KVM_CREATE_VCPU, 0) = 20
> [pid 1861] ioctl(20, KVM_X86_SETUP_MCE, 0x7fbd3ca40cd8) = 0
> [pid 1861] ioctl(20, KVM_SET_CPUID2, 0x7fbd3ca40ce0) = 0
> [pid 1861] ioctl(20, KVM_SET_SIGNAL_MASK, 0x7fbd380008f0) = 0
>
> [ vCPU #1 thread creating vCPU #1 (fd 21) ]
> [pid 1862] ioctl(14, KVM_CREATE_VCPU, 0x1) = 21
> [pid 1862] ioctl(21, KVM_X86_SETUP_MCE, 0x7fbd37ffdcd8) = 0
> [pid 1862] ioctl(21, KVM_SET_CPUID2, 0x7fbd37ffdce0) = 0
> [pid 1862] ioctl(21, KVM_SET_SIGNAL_MASK, 0x7fbd300008f0) = 0
>
> [ Main thread calling kvm_arch_put_registers() on vCPU #0 ]
> [pid 1859] ioctl(20, KVM_SET_REGS, 0x7ffc98aac230) = 0
> [pid 1859] ioctl(20, KVM_SET_XSAVE or KVM_SIGNAL_MSI, 0x7fbd38001000) = 0
> [pid 1859] ioctl(20, KVM_PPC_ALLOCATE_HTAB or KVM_SET_XCRS, 0x7ffc98aac010) = 0
> [pid 1859] ioctl(20, KVM_SET_SREGS, 0x7ffc98aac050) = 0
> [pid 1859] ioctl(20, KVM_SET_MSRS, 0x7ffc98aab820) = 87
> [pid 1859] ioctl(20, KVM_SET_MP_STATE, 0x7ffc98aac230) = 0
> [pid 1859] ioctl(20, KVM_SET_LAPIC, 0x7ffc98aabd80) = 0
> [pid 1859] ioctl(20, KVM_SET_MSRS, 0x7ffc98aac1b0) = 1
> [pid 1859] ioctl(20, KVM_SET_PIT2 or KVM_SET_VCPU_EVENTS, 0x7ffc98aac1b0) = 0
> [pid 1859] ioctl(20, KVM_SET_DEBUGREGS or KVM_SET_TSC_KHZ, 0x7ffc98aac1b0) = 0
>
> [ Main thread calling kvm_arch_put_registers() on vCPU #1 ]
> [pid 1859] ioctl(21, KVM_SET_REGS, 0x7ffc98aac230) = 0
> [pid 1859] ioctl(21, KVM_SET_XSAVE or KVM_SIGNAL_MSI, 0x7fbd30001000) = 0
> [pid 1859] ioctl(21, KVM_PPC_ALLOCATE_HTAB or KVM_SET_XCRS, 0x7ffc98aac010) = 0
> [pid 1859] ioctl(21, KVM_SET_SREGS, 0x7ffc98aac050) = 0
> [pid 1859] ioctl(21, KVM_SET_MSRS, 0x7ffc98aab820) = 87
> [pid 1859] ioctl(21, KVM_SET_MP_STATE, 0x7ffc98aac230) = 0
> [pid 1859] ioctl(21, KVM_SET_LAPIC, 0x7ffc98aabd80) = 0
> [pid 1859] ioctl(21, KVM_SET_MSRS, 0x7ffc98aac1b0) = 1
> [pid 1859] ioctl(21, KVM_SET_PIT2 or KVM_SET_VCPU_EVENTS, 0x7ffc98aac1b0) = 0
> [pid 1859] ioctl(21, KVM_SET_DEBUGREGS or KVM_SET_TSC_KHZ, 0x7ffc98aac1b0) = 0
>
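
For anyone who wants to check a longer strace log for the same pattern (which thread created each vCPU fd, and which thread later issued KVM_SET_MSRS on it), a rough Python sketch follows. The helper names and the SAMPLE text are made up for illustration; only the line format is taken from the strace output quoted above, and the regex assumes that "[pid N] ioctl(fd, REQUEST, ...)" layout.

```python
import re
from collections import defaultdict

# Matches calls such as: [pid 1859] ioctl(20, KVM_SET_MSRS, 0x7ffc98aab820) = 87
IOCTL_RE = re.compile(r"\[pid\s+(\d+)\]\s+ioctl\((\d+),\s*([^,]+),")


def ioctls_by_thread(strace_text):
    """Map (pid, fd) -> list of ioctl request names, in order of appearance."""
    calls = defaultdict(list)
    # finditer over the whole text also copes with several calls on one wrapped line.
    for match in IOCTL_RE.finditer(strace_text):
        pid, fd, request = match.groups()
        calls[(int(pid), int(fd))].append(request.strip())
    return dict(calls)


def threads_setting_msrs(strace_text):
    """Return the sorted pids that issued KVM_SET_MSRS on any fd."""
    return sorted({
        pid
        for (pid, _fd), requests in ioctls_by_thread(strace_text).items()
        if "KVM_SET_MSRS" in requests
    })


# Shortened excerpt in the same format as the quoted strace output.
SAMPLE = """\
[pid 1861] ioctl(14, KVM_CREATE_VCPU, 0) = 20
[pid 1861] ioctl(20, KVM_SET_CPUID2, 0x7fbd3ca40ce0) = 0
[pid 1859] ioctl(20, KVM_SET_REGS, 0x7ffc98aac230) = 0
[pid 1859] ioctl(20, KVM_SET_MSRS, 0x7ffc98aab820) = 87
"""

if __name__ == "__main__":
    print(threads_setting_msrs(SAMPLE))
```

On the excerpt, pid 1861 is the only caller of KVM_CREATE_VCPU while pid 1859 is the only caller of KVM_SET_MSRS, which matches the reading that the vCPU threads create the vCPUs and the main thread restores their state.
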
> Using systemtap again, I noticed that the main thread's run_delay is
> copied to last_steal only after a KVM_SET_MSRS ioctl which enables the
> steal time MSR is issued by the main thread (see linux
> 3.16.7-ckt11-1/arch/x86/kvm/x86.c:2162). Taking an educated guess, I
> reverted the following qemu commits:
>
> commit 0e5035776df31380a44a1a851850d110b551ecb6
> Author: Marcelo Tosatti <email address hidden>
> Date: Tue Sep 3 18:55:16 2013 -0300
>
> fix steal time MSR vmsd callback to proper opaque type
>
> Convert steal time MSR vmsd callback pointer to proper X86CPU type.
>
> Signed-off-by: Marcelo Tosatti <email address hidden>
> Signed-off-by: Paolo Bonzini <email address hidden>
>
> commit 917367aa968fd4fef29d340e0c7ec8c608dffaab
> Author: Marcelo Tosatti <email address hidden>
> Date: Tue Feb 19 23:27:20 2013 -0300
>
> target-i386: kvm: save/restore steal time MSR
>
> Read and write steal time MSR, so that reporting is functional across
> migration.
>
> Signed-off-by: Marcelo Tosatti <email address hidden>
> Signed-off-by: Gleb Natapov <email address hidden>
>
> and the steal time jump on migration went away. However, steal time was
> not reported at all after migration, which is expected after reverting
> 917367aa.
>
> So it seems that after 917367aa, the steal time MSR is correctly saved
> and copied to the receiving side, but then it is restored by the main
> thread (probably during cpu_synchronize_all_post_init()), causing the
> overflow when the vCPU threads are unpaused.
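
To make the suspected overflow concrete, here is a minimal Python model of the arithmetic. It is a sketch under assumptions, not the kernel code: it assumes steal time is accumulated as an unsigned 64-bit difference between the running task's run_delay and the stored last_steal, and that last_steal is snapshotted from whichever task enables the steal time MSR. The class, the function names, and the run_delay numbers are hypothetical.

```python
U64_MASK = (1 << 64) - 1  # emulate the kernel's unsigned 64-bit arithmetic


class VcpuStealState:
    """Hypothetical stand-in for the per-vCPU steal time bookkeeping."""

    def __init__(self):
        self.last_steal = 0  # run_delay snapshot of the last accounting task
        self.steal = 0       # accumulated steal time reported to the guest


def enable_steal_msr(vcpu, task_run_delay):
    # Assumption: enabling the MSR via KVM_SET_MSRS records the *calling*
    # task's run_delay, here the main thread's rather than the vCPU thread's.
    vcpu.last_steal = task_run_delay


def account_steal(vcpu, task_run_delay):
    # Unsigned subtraction wraps around when run_delay < last_steal.
    delta = (task_run_delay - vcpu.last_steal) & U64_MASK
    vcpu.last_steal = task_run_delay
    vcpu.steal = (vcpu.steal + delta) & U64_MASK
    return delta


if __name__ == "__main__":
    vcpu = VcpuStealState()
    enable_steal_msr(vcpu, 5_000_000)      # main thread: large run_delay (made up)
    first = account_steal(vcpu, 1_000)     # vCPU thread: small run_delay (made up)
    second = account_steal(vcpu, 1_500)    # later accounting by the same thread
    print(hex(first), second)
```

In this model the first accounting step after the main thread restores the MSR yields a delta close to 2^64, because the vCPU thread's run_delay is smaller than the main thread's snapshot. Later steps by the same vCPU thread produce ordinary small deltas, consistent with a one-time jump at migration.
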