Virtual devices in Windows VMs occasionally lose connection to host
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Compute (nova) | Expired | Undecided | Unassigned | | |
Bug Description
Description
===========
Multiple Windows virtual machines running on our OpenStack cloud occasionally seem to lose the connection between their virtual disk and network devices (virtio) and the underlying host. The affected Windows VM can then no longer write to its disk (see the logs below). In addition, the network connection is shown as "No internet access" (yellow alert symbol). Rebooting the affected VM fixes the problem.
Steps to reproduce
==================
We don't know how to reproduce the situation.
Expected result
===============
Windows VMs should run without any virtual device connectivity issues.
Actual result
=============
Virtual devices in Windows VMs occasionally lose connection to the host, as described above.
Environment
===========
Host:
* CentOS Linux release 7.5.1804 (Core) / Kernel: 3.10.0-862
* Nova: openstack-
* Hypervisor: Libvirt + KVM (Libvirt 3.9.0 / QEMU 2.10.0)
* Storage: Ceph 10.2.10 (5dc1e4c05cb68d
* Networking: Neutron with OpenVSwitch
VM:
* Windows Server 2016 Standard
* Red Hat VirtIO SCSI Disk Device (Driver version: 10.0.14393.1613)
* Red Hat VirtIO Ethernet Adapter (Driver version: 100.76.104.14900)
* Network: Static IP configuration
Libvirt XML (critical but irrelevant data replaced by '***'):
<domain type='kvm' id='466'>
<name>***</name>
<uuid>***</uuid>
<metadata>
<nova:instance xmlns:nova="http://
<nova:package version=
<
<
<nova:flavor name="c04m036">
<
<nova:owner>
<nova:user uuid="*
</nova:owner>
<nova:root type="image" uuid="***"/>
</nova:
</metadata>
<memory unit='KiB'
<currentMemory unit='KiB'
<vcpu placement=
<cputune>
<shares>
</cputune>
<resource>
<partition>
</resource>
<sysinfo type='smbios'>
<system>
<entry name='manufactu
<entry name='product'
<entry name='version'
<entry name='serial'
<entry name='uuid'
<entry name='family'
</system>
</sysinfo>
<os>
<type arch='x86_64' machine=
<boot dev='hd'/>
<smbios mode='sysinfo'/>
</os>
<features>
<acpi/>
<apic/>
<hyperv>
<relaxed state='on'/>
<vapic state='on'/>
<spinlocks state='on' retries='8191'/>
</hyperv>
</features>
<cpu mode='custom' match='exact' check='full'>
<model fallback=
<topology sockets='4' cores='1' threads='1'/>
<feature policy='require' name='vme'/>
<feature policy='require' name='hypervisor'/>
<feature policy='require' name='arat'/>
<feature policy='require' name='xsaveopt'/>
</cpu>
<clock offset='localtime'>
<timer name='pit' tickpolicy=
<timer name='rtc' tickpolicy=
<timer name='hpet' present='no'/>
<timer name='hypervclock' present='yes'/>
</clock>
<on_poweroff>
<on_reboot>
<on_crash>
<devices>
<emulator>
<disk type='network' device='disk'>
<driver name='qemu' type='raw' cache='writeback'/>
<auth username='***'>
<secret type='ceph' uuid='***'/>
</auth>
<source protocol='rbd' name='***'>
<host name='***' port='***'/>
<host name='***' port='***'/>
<host name='***' port='***'/>
</source>
<target dev='vda' bus='virtio'/>
<
<alias name='virtio-
<address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</disk>
<disk type='network' device='disk'>
<driver name='qemu' type='raw' cache='writeback'/>
<auth username='***'>
<secret type='ceph' uuid='***'/>
</auth>
<source protocol='rbd' name='***'>
<host name='***' port='***'/>
<host name='***' port='***'/>
<host name='***' port='***'/>
</source>
<target dev='vdb' bus='virtio'/>
<
<alias name='virtio-
<address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
</disk>
<controller type='usb' index='0' model='nec-xhci'>
<alias name='usb'/>
<address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
</controller>
<controller type='sata' index='0'>
<alias name='ide'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
</controller>
<controller type='pci' index='0' model='pcie-root'>
<alias name='pcie.0'/>
</controller>
<controller type='virtio-
<alias name='virtio-
<address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
</controller>
<controller type='pci' index='1' model='
<model name='pcie-
<target chassis='1' port='0x10'/>
<alias name='pci.1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction=
</controller>
<controller type='pci' index='2' model='
<model name='pcie-
<target chassis='2' port='0x11'/>
<alias name='pci.2'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
</controller>
<controller type='pci' index='3' model='
<model name='pcie-
<target chassis='3' port='0x12'/>
<alias name='pci.3'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
</controller>
<controller type='pci' index='4' model='
<model name='pcie-
<target chassis='4' port='0x13'/>
<alias name='pci.4'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
</controller>
<controller type='pci' index='5' model='
<model name='pcie-
<target chassis='5' port='0x14'/>
<alias name='pci.5'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
</controller>
<controller type='pci' index='6' model='
<model name='pcie-
<target chassis='6' port='0x15'/>
<alias name='pci.6'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
</controller>
<interface type='bridge'>
<mac address='***'/>
<source bridge=
<target dev='tap70392c0
<model type='virtio'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>
<serial type='pty'>
<source path='/
<log file='/
<target type='isa-serial' port='0'>
<model name='isa-serial'/>
</target>
<alias name='serial0'/>
</serial>
<console type='pty' tty='/dev/pts/12'>
<source path='/
<log file='/
<target type='serial' port='0'/>
<alias name='serial0'/>
</console>
<channel type='spicevmc'>
<target type='virtio' name='com.
<alias name='channel0'/>
<address type='virtio-
</channel>
<channel type='unix'>
<source mode='bind' path='/
<target type='virtio' name='org.
<alias name='channel1'/>
<address type='virtio-
</channel>
<input type='mouse' bus='ps2'>
<alias name='input0'/>
</input>
<input type='keyboard' bus='ps2'>
<alias name='input1'/>
</input>
<graphics type='spice' port='***' autoport='yes' listen='0.0.0.0' keymap='en-us'>
<listen type='address' address='***'/>
</graphics>
<video>
<model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
<alias name='video0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
</video>
<memballoon model='virtio'>
<stats period='10'/>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
</memballoon>
</devices>
<seclabel type='dynamic' model='selinux' relabel='yes'>
<label>
<imagelabel
</seclabel>
<seclabel type='dynamic' model='dac' relabel='yes'>
<label>
<imagelabel
</seclabel>
</domain>
Logs & Configs
==============
Compute and storage servers:
No relevant log entries found.
Windows VM:
Level Date and Time Source Event ID Task Category
Warning 05.12.2018 07:54:44 vhdmp 129 None "\Device\RaidPort2
"
Warning 05.12.2018 07:54:37 Disk 153 None The IO operation at logical block address 0xa968 for Disk 9 (PDO name: \Device\00000f0d) was retried.
Warning 05.12.2018 07:54:33 NTFS 50 None {Delayed Write Failed} Windows was unable to save all the data for the file . The data has been lost. This error may be caused by a failure of your computer hardware or network connection. Please try to save this file elsewhere.
Information 05.12.2018 07:54:33 Application Popup 26 None Application popup: Windows - Delayed Write Failed : Exception Processing Message 0xc0000222 Parameters 0x7ffdf3051d28 0x7ffdf3051d28 0x7ffdf3051d28 0x7ffdf3051d28
Information 05.12.2018 07:54:33 Application Popup 26 None Application popup: Windows - Delayed Write Failed : Exception Processing Message 0xc0000222 Parameters 0x7ffdf3051d28 0x7ffdf3051d28 0x7ffdf3051d28 0x7ffdf3051d28
Warning 05.12.2018 07:54:33 NTFS 50 None {Delayed Write Failed} Windows was unable to save all the data for the file \$Extend\
Warning 05.12.2018 07:54:33 NTFS 50 None {Delayed Write Failed} Windows was unable to save all the data for the file \AppData\
Warning 05.12.2018 07:54:33 NTFS 50 None {Delayed Write Failed} Windows was unable to save all the data for the file \AppData\
Information 05.12.2018 07:54:33 Application Popup 26 None Application popup: Windows - Delayed Write Failed : Exception Processing Message 0xc0000222 Parameters 0x7ffdf3051d28 0x7ffdf3051d28 0x7ffdf3051d28 0x7ffdf3051d28
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Error 05.12.2018 07:54:32 NTFS 137 (2) The default transaction resource manager on volume \\?\Volume{
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Warning 05.12.2018 07:54:32 Microsoft-
(A device which does not exist was specified.)"
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 153 None The IO operation at logical block address 0x2dc40 for Disk 6 (PDO name: \Device\00000ea0) was retried.
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 51 None An error was detected on device \Device\
Warning 05.12.2018 07:54:32 Disk 153 None The IO operation at logical block address 0xace0 for Disk 6 (PDO name: \Device\00000ea0) was retried.
Warning 05.12.2018 07:54:32 Disk 153 None The IO operation at logical block address 0x32ac0 for Disk 6 (PDO name: \Device\00000ea0) was retried.
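The excerpt above is long and repetitive, so a tally per event source and event ID can make the pattern easier to see. The following is only a minimal sketch in Python: the regular expression assumes the exported column layout shown in the header line (level, date, time, source, event ID, task category, message), and `tally_events` is a hypothetical helper name, not part of any Windows or OpenStack tooling. Lines that are truncated or that continue a previous message are skipped.

```python
import re
from collections import Counter

# Assumed layout of one exported event line:
# Level, date (DD.MM.YYYY), time, source, event ID, then the rest.
EVENT_RE = re.compile(
    r"^(?P<level>Warning|Error|Information)\s+"
    r"(?P<date>\d{2}\.\d{2}\.\d{4})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<source>.+?)\s+(?P<event_id>\d+)\s+"
)


def tally_events(lines):
    """Count events per (source, event ID); non-matching lines are ignored."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.match(line)
        if match:
            key = (match.group("source"), int(match.group("event_id")))
            counts[key] += 1
    return counts
```

Calling `tally_events(open("events.txt"))` on a saved copy of the log (the file name is an assumption) and printing `counts.most_common()` would list the most frequent source and event ID pairs first.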
@Pascal: do you have the qemu logs for the problematic instance? By default they are under /var/log/libvirt/qemu/
Does anything in general happen on the compute host when the VM loses the device? Could you check the journal of the host around the same time?
I am marking this bug as Incomplete until the requested logs are provided; please set it back to the New state when you respond.
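To narrow the host journal to the time of the guest errors, a small helper can build the `journalctl` invocation for a window around the first error. This is a sketch under assumptions: the five-minute window and the helper name `journal_command` are arbitrary choices, and the guest timestamp is taken as-is, although the domain uses `clock offset='localtime'`, so the guest and host clocks may be in different time zones and the window may need shifting.

```python
import shlex
from datetime import datetime, timedelta


def journal_command(event_time, minutes=5):
    """Build a journalctl command covering +/- `minutes` around event_time."""
    fmt = "%Y-%m-%d %H:%M:%S"
    since = event_time - timedelta(minutes=minutes)
    until = event_time + timedelta(minutes=minutes)
    return [
        "journalctl",
        "--since", since.strftime(fmt),
        "--until", until.strftime(fmt),
        "--no-pager",
    ]


# Earliest guest error in the report: 05.12.2018 07:54:32 (DD.MM.YYYY).
first_error = datetime.strptime("05.12.2018 07:54:32", "%d.%m.%Y %H:%M:%S")
print(shlex.join(journal_command(first_error)))
```

The printed command can be run on the compute host as root; the same window can be used to search the instance's file under /var/log/libvirt/qemu/.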