> Hi lickdragon,
> That's because the fix turned out to be in the kernel's KVM code; I can
> see it in the 4.4-rc1 upstream kernel.

Thanks! I found it.
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7cae2bedcbd4680b155999655e49c27b9cf020fa

> > Now, what's really interesting is that current->sched_info.run_delay
> > gets reset because the tasks (threads) using the vCPUs change, and
> > thus have a different current->sched_info: it looks like task 18446
> > created the two vCPUs, and then they were handed over to 18448 and
> > 18449 respectively. This is also verified by the fact that during the
> > overflow, both vCPUs have the old steal time of the last vcpu_load of
> > task 18446. However, according to Documentation/virtual/kvm/api.txt:
>
> The above is not entirely accurate: the vCPUs were created by the
> threads that are used to run them (18448 and 18449 respectively), it's
> just that the main thread is issuing ioctls during initialization, as
> illustrated by the strace output on a different process:
>
> [ vCPU #0 thread creating vCPU #0 (fd 20) ]
> [pid 1861] ioctl(14, KVM_CREATE_VCPU, 0) = 20
> [pid 1861] ioctl(20, KVM_X86_SETUP_MCE, 0x7fbd3ca40cd8) = 0
> [pid 1861] ioctl(20, KVM_SET_CPUID2, 0x7fbd3ca40ce0) = 0
> [pid 1861] ioctl(20, KVM_SET_SIGNAL_MASK, 0x7fbd380008f0) = 0
>
> [ vCPU #1 thread creating vCPU #1 (fd 21) ]
> [pid 1862] ioctl(14, KVM_CREATE_VCPU, 0x1) = 21
> [pid 1862] ioctl(21, KVM_X86_SETUP_MCE, 0x7fbd37ffdcd8) = 0
> [pid 1862] ioctl(21, KVM_SET_CPUID2, 0x7fbd37ffdce0) = 0
> [pid 1862] ioctl(21, KVM_SET_SIGNAL_MASK, 0x7fbd300008f0) = 0
>
> [ Main thread calling kvm_arch_put_registers() on vCPU #0 ]
> [pid 1859] ioctl(20, KVM_SET_REGS, 0x7ffc98aac230) = 0
> [pid 1859] ioctl(20, KVM_SET_XSAVE or KVM_SIGNAL_MSI, 0x7fbd38001000) = 0
> [pid 1859] ioctl(20, KVM_PPC_ALLOCATE_HTAB or KVM_SET_XCRS, 0x7ffc98aac010) = 0
> [pid 1859] ioctl(20, KVM_SET_SREGS, 0x7ffc98aac050) = 0
> [pid 1859] ioctl(20, KVM_SET_MSRS, 0x7ffc98aab820) = 87
> [pid 1859] ioctl(20, KVM_SET_MP_STATE, 0x7ffc98aac230) = 0
> [pid 1859] ioctl(20, KVM_SET_LAPIC, 0x7ffc98aabd80) = 0
> [pid 1859] ioctl(20, KVM_SET_MSRS, 0x7ffc98aac1b0) = 1
> [pid 1859] ioctl(20, KVM_SET_PIT2 or KVM_SET_VCPU_EVENTS, 0x7ffc98aac1b0) = 0
> [pid 1859] ioctl(20, KVM_SET_DEBUGREGS or KVM_SET_TSC_KHZ, 0x7ffc98aac1b0) = 0
>
> [ Main thread calling kvm_arch_put_registers() on vCPU #1 ]
> [pid 1859] ioctl(21, KVM_SET_REGS, 0x7ffc98aac230) = 0
> [pid 1859] ioctl(21, KVM_SET_XSAVE or KVM_SIGNAL_MSI, 0x7fbd30001000) = 0
> [pid 1859] ioctl(21, KVM_PPC_ALLOCATE_HTAB or KVM_SET_XCRS, 0x7ffc98aac010) = 0
> [pid 1859] ioctl(21, KVM_SET_SREGS, 0x7ffc98aac050) = 0
> [pid 1859] ioctl(21, KVM_SET_MSRS, 0x7ffc98aab820) = 87
> [pid 1859] ioctl(21, KVM_SET_MP_STATE, 0x7ffc98aac230) = 0
> [pid 1859] ioctl(21, KVM_SET_LAPIC, 0x7ffc98aabd80) = 0
> [pid 1859] ioctl(21, KVM_SET_MSRS, 0x7ffc98aac1b0) = 1
> [pid 1859] ioctl(21, KVM_SET_PIT2 or KVM_SET_VCPU_EVENTS, 0x7ffc98aac1b0) = 0
> [pid 1859] ioctl(21, KVM_SET_DEBUGREGS or KVM_SET_TSC_KHZ, 0x7ffc98aac1b0) = 0
>
> Using systemtap again, I noticed that the main thread's run_delay is
> copied to last_steal only after a KVM_SET_MSRS ioctl which enables the
> steal time MSR is issued by the main thread (see linux
> 3.16.7-ckt11-1/arch/x86/kvm/x86.c:2162). Taking an educated guess, I
> reverted the following qemu commits:
>
> commit 0e5035776df31380a44a1a851850d110b551ecb6
> Author: Marcelo Tosatti