Activity log for bug #1705132

Date Who What changed Old value New value Message
2017-07-18 21:44:42 Jorge Niedbalski bug added bug
2017-07-19 06:17:47 Christian Ehrhardt  bug added subscriber ChristianEhrhardt
2017-07-19 06:57:33 Christian Ehrhardt  nominated for series Ubuntu Xenial
2017-07-19 06:57:33 Christian Ehrhardt  bug task added libvirt (Ubuntu Xenial)
2017-07-19 06:57:33 Christian Ehrhardt  nominated for series Ubuntu Zesty
2017-07-19 06:57:33 Christian Ehrhardt  bug task added libvirt (Ubuntu Zesty)
2017-07-19 06:57:33 Christian Ehrhardt  nominated for series Ubuntu Yakkety
2017-07-19 06:57:33 Christian Ehrhardt  bug task added libvirt (Ubuntu Yakkety)
2017-07-19 06:57:39 Christian Ehrhardt  libvirt (Ubuntu): status New Fix Released
2017-07-19 06:57:53 Christian Ehrhardt  libvirt (Ubuntu Xenial): status New Triaged
2017-07-19 06:57:55 Christian Ehrhardt  libvirt (Ubuntu Yakkety): status New Triaged
2017-07-19 06:57:57 Christian Ehrhardt  libvirt (Ubuntu Zesty): status New Triaged
2017-07-19 06:58:01 Christian Ehrhardt  libvirt (Ubuntu Xenial): status Triaged Incomplete
2017-07-19 14:57:09 Christian Ehrhardt  libvirt (Ubuntu Xenial): status Incomplete In Progress
2017-07-19 14:57:11 Christian Ehrhardt  libvirt (Ubuntu Yakkety): status Triaged In Progress
2017-07-19 14:57:13 Christian Ehrhardt  libvirt (Ubuntu Zesty): status Triaged In Progress
2017-07-19 16:14:29 Jorge Niedbalski description [Environment] Description: Ubuntu 16.04.2 LTS Release: 16.04 Codename: xenial root@buneary:/home/ubuntu# dpkg -l | grep kvm ii qemu-kvm 1:2.5+dfsg-5ubuntu10.14 amd64 QEMU Full virtualization [Description] - Configured a machine with 32 static VCPUs, 160GB of RAM using 1G hugepages on a NUMA capable machine. Domain definition (http://pastebin.ubuntu.com/25121106/) - Once started (virsh start). Libvirt log. LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name reproducer2 -S -machine pc-i440fx-2.5,accel=kvm,usb=off -cpu host -m 124928 -realtime mlock=off -smp 32,sockets=16,cores=1,threads=2 -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=yes,size=64424509440,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0-15,memdev=ram-node0 -object memory-backend-file,id=ram-node1,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=yes,size=66571993088,host-nodes=1,policy=bind -numa node,nodeid=1,cpus=16-31,memdev=ram-node1 -uuid d7a4af7f-7549-4b44-8ceb-4a6c951388d4 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-reproducer2/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/uvtool/libvirt/images/test.qcow,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on Then the following error is raised. 
virsh start reproducer2 error: Failed to start domain reproducer2 error: monitor socket did not show up: No such file or directory [Possible Fix] https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=85af0b803cd19a03f71bd01ab4e045552410368f;hp=67dcb797ed7f1fbb048aa47006576f424923933b [Description] - Configured a machine with 32 static VCPUs, 160GB of RAM using 1G hugepages on a NUMA capable machine. Domain definition (http://pastebin.ubuntu.com/25121106/) - Once started (virsh start). Libvirt log. LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name reproducer2 -S -machine pc-i440fx-2.5,accel=kvm,usb=off -cpu host -m 124928 -realtime mlock=off -smp 32,sockets=16,cores=1,threads=2 -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=yes,size=64424509440,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0-15,memdev=ram-node0 -object memory-backend-file,id=ram-node1,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=yes,size=66571993088,host-nodes=1,policy=bind -numa node,nodeid=1,cpus=16-31,memdev=ram-node1 -uuid d7a4af7f-7549-4b44-8ceb-4a6c951388d4 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-reproducer2/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/uvtool/libvirt/images/test.qcow,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on Then the following error is raised. 
virsh start reproducer2 error: Failed to start domain reproducer2 error: monitor socket did not show up: No such file or directory [Impact] * Cannot start virtual machines with large pools of memory allocated on NUMA nodes. [Test Case] * Configure a Machine with at least 2 NUMA nodes. root@buneary:/home/ubuntu# virsh freepages 0 1G 1048576KiB: 60 root@buneary:/home/ubuntu# virsh freepages 1 1G 1048576KiB: 62 * Create a guest that uses the full amount of available huge pages (in this case 122). (full guest definition: http://paste.ubuntu.com/25125500/) <memory unit='GiB'>120</memory> <currentMemory unit='GiB'>120</currentMemory> <memoryBacking> <hugepages> <page size='1' unit='GiB' nodeset='0'/> <page size='1' unit='GiB' nodeset='1'/> </hugepages> </memoryBacking> <cpu mode='host-passthrough'> <topology sockets='16' cores='1' threads='2'/> <numa> <cell id='0' cpus='0-15' memory='60' unit='GiB' memAccess='shared'/> <cell id='1' cpus='16-31' memory='62' unit='GiB' memAccess='shared'/> </numa> </cpu> * Define the guest, and try to start it. $ virsh define reproducer.xml $ virsh start reproducer * Verify that the following error is raised: root@buneary:/home/ubuntu# virsh start reproducer2 error: Failed to start domain reproducer2 error: monitor socket did not show up: No such file or directory [Expected Behavior] * Machine is started without issues as displayed in https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1705132/comments/7 [Regression Potential] * None identified. [Other Info] https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=85af0b803cd19a03f71bd01ab4e045552410368f;hp=67dcb797ed7f1fbb048aa47006576f424923933b
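As a cross-check of the test case, the byte sizes on the QEMU command line can be converted back to GiB and compared with the NUMA cells of the guest definition. A minimal Python sketch; the variable names are illustrative, and the numbers are copied from the description in this entry:

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# size= values of the two memory-backend-file objects on the QEMU command line
node_bytes = {0: 64424509440, 1: 66571993088}

# Per-node size in GiB; with 1 GiB huge pages this is also the page count per node
node_gib = {node: size // GIB for node, size in node_bytes.items()}

# Total guest memory in MiB, the default unit of QEMU's -m option
total_mib = sum(node_bytes.values()) // MIB

print(node_gib)   # {0: 60, 1: 62}
print(total_mib)  # 124928
```

The result matches `-m 124928` and the 60 + 62 free 1G pages reported by `virsh freepages`, that is 122 GiB in total rather than the 120 GiB given in the <memory> element.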
2017-07-19 16:15:04 Jorge Niedbalski tags sts-sru-needed
2017-07-19 18:32:39 Dominique Poulain bug added subscriber Dominique Poulain
2017-07-20 07:40:56 Christian Ehrhardt  description [Description] - Configured a machine with 32 static VCPUs, 160GB of RAM using 1G hugepages on a NUMA capable machine. Domain definition (http://pastebin.ubuntu.com/25121106/) - Once started (virsh start). Libvirt log. LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name reproducer2 -S -machine pc-i440fx-2.5,accel=kvm,usb=off -cpu host -m 124928 -realtime mlock=off -smp 32,sockets=16,cores=1,threads=2 -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=yes,size=64424509440,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0-15,memdev=ram-node0 -object memory-backend-file,id=ram-node1,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=yes,size=66571993088,host-nodes=1,policy=bind -numa node,nodeid=1,cpus=16-31,memdev=ram-node1 -uuid d7a4af7f-7549-4b44-8ceb-4a6c951388d4 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-reproducer2/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/uvtool/libvirt/images/test.qcow,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on Then the following error is raised. virsh start reproducer2 error: Failed to start domain reproducer2 error: monitor socket did not show up: No such file or directory [Impact] * Cannot start virtual machines with large pools of memory allocated on NUMA nodes. [Test Case] * Configure a Machine with at least 2 NUMA nodes. 
root@buneary:/home/ubuntu# virsh freepages 0 1G 1048576KiB: 60 root@buneary:/home/ubuntu# virsh freepages 1 1G 1048576KiB: 62 * Create a guest that uses the full amount of available huge pages (in this case 122). (full guest definition: http://paste.ubuntu.com/25125500/) <memory unit='GiB'>120</memory> <currentMemory unit='GiB'>120</currentMemory> <memoryBacking> <hugepages> <page size='1' unit='GiB' nodeset='0'/> <page size='1' unit='GiB' nodeset='1'/> </hugepages> </memoryBacking> <cpu mode='host-passthrough'> <topology sockets='16' cores='1' threads='2'/> <numa> <cell id='0' cpus='0-15' memory='60' unit='GiB' memAccess='shared'/> <cell id='1' cpus='16-31' memory='62' unit='GiB' memAccess='shared'/> </numa> </cpu> * Define the guest, and try to start it. $ virsh define reproducer.xml $ virsh start reproducer * Verify that the following error is raised: root@buneary:/home/ubuntu# virsh start reproducer2 error: Failed to start domain reproducer2 error: monitor socket did not show up: No such file or directory [Expected Behavior] * Machine is started without issues as displayed in https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1705132/comments/7 [Regression Potential] * None identified. [Other Info] https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=85af0b803cd19a03f71bd01ab4e045552410368f;hp=67dcb797ed7f1fbb048aa47006576f424923933b [Description] - Configured a machine with 32 static VCPUs, 160GB of RAM using 1G hugepages on a NUMA capable machine. Domain definition (http://pastebin.ubuntu.com/25121106/) - Once started (virsh start). Libvirt log. 
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name reproducer2 -S -machine pc-i440fx-2.5,accel=kvm,usb=off -cpu host -m 124928 -realtime mlock=off -smp 32,sockets=16,cores=1,threads=2 -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=yes,size=64424509440,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0-15,memdev=ram-node0 -object memory-backend-file,id=ram-node1,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=yes,size=66571993088,host-nodes=1,policy=bind -numa node,nodeid=1,cpus=16-31,memdev=ram-node1 -uuid d7a4af7f-7549-4b44-8ceb-4a6c951388d4 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-reproducer2/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/uvtool/libvirt/images/test.qcow,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on Then the following error is raised. virsh start reproducer2 error: Failed to start domain reproducer2 error: monitor socket did not show up: No such file or directory - The fix is done via backports. As a TL;DR, the change does two things: 1. Instead of sleeping for a very short time (1ms) in a loop for a very long time, it starts with a small sleep and increases it exponentially for the few cases that need a long wait. That way fast actions still complete fast, but long actions no longer hog the CPU. 2. Huge guests get ~1s of extra timeout per 1GB of memory to come up, which allows huge guests to initialize properly. 
[Impact] * Cannot start virtual machines with large pools of memory allocated on NUMA nodes. [Test Case] * This is a tradeoff of memory clearing speed vs guest size. Once the clearing of guest memory exceeds ~30 seconds the issue will trigger. * The guest must be backed by huge pages, as otherwise the kernel will fault pages in on demand instead of needing the initial clear. * One way to "slow down" the init is to configure a machine with multiple NUMA nodes. root@buneary:/home/ubuntu# virsh freepages 0 1G 1048576KiB: 60 root@buneary:/home/ubuntu# virsh freepages 1 1G 1048576KiB: 62 * Another way to slow down the init is to just use a really huge guest. In the example a 122G guest was enough. (full guest definition: http://paste.ubuntu.com/25125500/) <memory unit='GiB'>120</memory> <currentMemory unit='GiB'>120</currentMemory> <memoryBacking> <hugepages> <page size='1' unit='GiB' nodeset='0'/> <page size='1' unit='GiB' nodeset='1'/> </hugepages> </memoryBacking> <cpu mode='host-passthrough'> <topology sockets='16' cores='1' threads='2'/> <numa> <cell id='0' cpus='0-15' memory='60' unit='GiB' memAccess='shared'/> <cell id='1' cpus='16-31' memory='62' unit='GiB' memAccess='shared'/> </numa> </cpu> * Define the guest, and try to start it. $ virsh define reproducer.xml $ virsh start reproducer * Verify that the following error is raised: root@buneary:/home/ubuntu# virsh start reproducer2 error: Failed to start domain reproducer2 error: monitor socket did not show up: No such file or directory [Expected Behavior] * Machine is started without issues as displayed in https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1705132/comments/7 [Regression Potential] * The behavior on timeouts around starting a guest changed. We backported the fix along with a fix to that new behavior (where guests seemed to wait forever due to the exponential wait). 
Still, the "allowed" wait time is increased, while users might expect a guest to start instantly, as they are used to from their laptop. Now if one starts a 1TB guest the allowed time is base+1000s. A user might think for a while that it is broken or hanging, but there is no way to avoid that. OTOH, before the fix it would have failed to start after 30 seconds, so this is not really a regression IMHO. [Other Info] https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=85af0b803cd19a03f71bd01ab4e045552410368f;hp=67dcb797ed7f1fbb048aa47006576f424923933b
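The two parts of the fix summarized in this entry (a timeout that grows with guest size, and a poll loop whose sleep grows exponentially) can be sketched in Python. This is only an illustration under stated assumptions, not libvirt's code: the function names are hypothetical, the 30 s base is assumed from the remark that startup failed after 30 seconds before the fix, and the doubling factor is an assumed choice.

```python
import time

BASE_TIMEOUT_S = 30.0  # assumption: the ~30 s limit mentioned in the log, used as base


def monitor_timeout(guest_mem_gib, base=BASE_TIMEOUT_S, per_gib=1.0):
    """Allowed wait time: base plus roughly 1 s per GiB of guest memory."""
    return base + per_gib * guest_mem_gib


def backoff_delays(timeout_s, first_s=0.001, factor=2.0):
    """Yield sleep intervals that start at 1 ms, grow exponentially,
    and add up to exactly timeout_s (the last interval is truncated)."""
    remaining = timeout_s
    delay = first_s
    while remaining > 0:
        step = min(delay, remaining)
        yield step
        remaining -= step
        delay *= factor


def wait_for(predicate, timeout_s, sleep=time.sleep):
    """Poll predicate() with exponentially growing sleeps until it is true
    or the timeout budget is used up. Returns the final predicate result."""
    for step in backoff_delays(timeout_s):
        if predicate():
            return True
        sleep(step)
    return predicate()


if __name__ == "__main__":
    timeout = monitor_timeout(122)  # the 122 GiB guest from the test case
    polls = len(list(backoff_delays(timeout)))
    print(f"timeout={timeout:.0f}s, at most {polls} sleeps")
```

With a fixed 1 ms sleep the same budget would need on the order of a hundred thousand wakeups; with doubling, a quick start is still noticed within milliseconds while a slow start costs only a handful of polls.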
2017-07-22 06:07:08 James Page bug task added cloud-archive
2017-07-22 06:07:19 James Page nominated for series cloud-archive/mitaka
2017-07-22 06:07:19 James Page bug task added cloud-archive/mitaka
2017-07-22 06:07:28 James Page cloud-archive: status New Fix Released
2017-07-22 06:08:02 James Page nominated for series cloud-archive/ocata
2017-07-22 06:08:02 James Page bug task added cloud-archive/ocata
2017-07-22 06:08:49 James Page cloud-archive/mitaka: importance Undecided Medium
2017-07-22 06:08:52 James Page cloud-archive/ocata: importance Undecided Medium
2017-07-24 16:56:45 Felipe Reyes tags sts-sru-needed sts sts-sru-needed
2017-07-27 22:54:57 Brian Murray libvirt (Ubuntu Yakkety): status In Progress Won't Fix
2017-07-27 22:55:35 Brian Murray libvirt (Ubuntu Zesty): status In Progress Fix Committed
2017-07-27 22:55:36 Brian Murray bug added subscriber Ubuntu Stable Release Updates Team
2017-07-27 22:55:38 Brian Murray bug added subscriber SRU Verification
2017-07-27 22:55:44 Brian Murray tags sts sts-sru-needed sts sts-sru-needed verification-needed verification-needed-zesty
2017-07-27 22:58:32 Brian Murray libvirt (Ubuntu Xenial): status In Progress Fix Committed
2017-07-27 22:58:40 Brian Murray tags sts sts-sru-needed verification-needed verification-needed-zesty sts sts-sru-needed verification-needed verification-needed-xenial verification-needed-zesty
2017-07-28 02:10:31 Jorge Niedbalski tags sts sts-sru-needed verification-needed verification-needed-xenial verification-needed-zesty sts sts-sru-needed verification-done-xenial verification-needed verification-needed-zesty
2017-07-31 14:22:00 James Page cloud-archive/ocata: status New Fix Committed
2017-07-31 14:22:01 James Page tags sts sts-sru-needed verification-done-xenial verification-needed verification-needed-zesty sts sts-sru-needed verification-done-xenial verification-needed verification-needed-zesty verification-ocata-needed
2017-07-31 14:24:11 James Page cloud-archive/mitaka: status New Fix Committed
2017-07-31 14:24:12 James Page tags sts sts-sru-needed verification-done-xenial verification-needed verification-needed-zesty verification-ocata-needed sts sts-sru-needed verification-done-xenial verification-mitaka-needed verification-needed verification-needed-zesty verification-ocata-needed
2017-08-01 09:57:21 Launchpad Janitor libvirt (Ubuntu Xenial): status Fix Committed Fix Released
2017-08-01 09:57:27 Adam Conrad removed subscriber Ubuntu Stable Release Updates Team
2017-08-01 14:32:37 Jorge Niedbalski tags sts sts-sru-needed verification-done-xenial verification-mitaka-needed verification-needed verification-needed-zesty verification-ocata-needed sts sts-sru-needed verification-done-xenial verification-mitaka-done verification-needed verification-needed-zesty verification-ocata-needed
2017-08-04 20:13:16 Jorge Niedbalski tags sts sts-sru-needed verification-done-xenial verification-mitaka-done verification-needed verification-needed-zesty verification-ocata-needed sts sts-sru-needed verification-done-xenial verification-done-zesty verification-mitaka-done verification-needed verification-ocata-done
2017-08-07 15:23:35 Launchpad Janitor libvirt (Ubuntu Zesty): status Fix Committed Fix Released
2017-08-09 07:46:32 Christian Ehrhardt  bug added subscriber Ubuntu Stable Release Updates Team
2017-08-21 14:47:24 Edward Hope-Morley nominated for series cloud-archive/newton
2017-08-22 18:59:16 Ryan Beisner cloud-archive/mitaka: status Fix Committed Fix Released
2017-08-22 20:02:02 Ryan Beisner cloud-archive/ocata: status Fix Committed Fix Released