Few (vFirefly) instances hang after boot and are unreachable

Bug #1507882 reported by Vijay Anand
This bug affects 1 person
Affects: OpenContrail
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Build: 2.21(102)

Test: 20 parallel instances
Service VM : vFirefly

Problem: 3 of 20 VMs hang after boot. Both the console and the instance IPs are unreachable.

Note:
 The problem does not reproduce on every run, but it has been observed more than twice.

Logs:
vijanand@bng-lnx-shell2#pwd
/homes/vijanand/Contrail-2.21-bugs/Console-hang
vijanand@bng-lnx-shell2#cd Console-hang-log/
vijanand@bng-lnx-shell2#ls -l
total 12K
drwxr-x--- 2 vijanand slt 4096 Oct 20 11:33 contrail
-rw-r--r-- 1 vijanand slt 3454 Oct 20 11:33 contrail.log
drwxr-x--- 2 vijanand slt 4096 Oct 20 11:33 nova
vijanand@bng-lnx-shell2#

Hari Prasad Killi (haripk) wrote :

QEMU was consistently using 200% CPU.
12857 libvirt+ 20 0 4760448 3.027g 12780 S 200.0 4.8 110:54.00 qemu-system-x86

root@csp-sol-lexus:~# ps aux | grep 12857
libvirt+ 12857 198 4.8 4760448 3174032 ? Sl 14:15 114:10 /usr/bin/qemu-system-x86_64 -name instance-000002ba -S -machine pc-i440fx-trusty,accel=kvm,usb=off -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 7742acc2-a1a4-4e61-9cb7-78530abc023f -smbios type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=2014.1.3,serial=35383339-3134-5347-4832-33303950504b,uuid=7742acc2-a1a4-4e61-9cb7-78530abc023f -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-000002ba.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/7742acc2-a1a4-4e61-9cb7-78530abc023f/disk,if=none,id=drive-ide0-0-0,format=qcow2,cache=none -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,ifname=tap55bb8902-bb,script=,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=02:b3:1a:17:80:53,bus=pci.0,addr=0x3 -netdev tap,ifname=tapa9159f8e-39,script=,id=hostnet1 -device e1000,netdev=hostnet1,id=net1,mac=02:91:5f:f8:53:42,bus=pci.0,addr=0x4 -netdev tap,ifname=tap38aafb06-20,script=,id=hostnet2 -device e1000,netdev=hostnet2,id=net2,mac=02:c3:ab:90:48:71,bus=pci.0,addr=0x5 -chardev file,id=charserial0,path=/var/lib/nova/instances/7742acc2-a1a4-4e61-9cb7-78530abc023f/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 -vnc 7.7.7.50:12 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

The UUID 7742acc2-a1a4-4e61-9cb7-78530abc023f belongs to the VM in question.
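
To make the mapping from a busy qemu process back to its instance repeatable, the `ps aux` line above can be parsed for the PID, the %CPU column, and the `-uuid` and `-smp` arguments. The following is only a minimal sketch: the function name `parse_qemu_ps_line` is hypothetical, the sample line is a shortened copy of the output above, and it assumes the standard 11-column `ps aux` layout and an `-smp` argument that starts with a plain vCPU count.

```python
import re


def parse_qemu_ps_line(line):
    """Parse one `ps aux` line; return qemu details or None for other processes."""
    # `ps aux` prints 11 columns; the last one (COMMAND) may contain spaces.
    fields = line.split(None, 10)
    if len(fields) < 11 or "qemu-system" not in fields[10]:
        return None
    command = fields[10]
    uuid_match = re.search(r"-uuid\s+([0-9a-fA-F-]{36})", command)
    # Assumption: -smp is given as "-smp N,..." as in the output above.
    smp_match = re.search(r"-smp\s+(\d+)", command)
    vcpus = int(smp_match.group(1)) if smp_match else None
    cpu_percent = float(fields[2])
    return {
        "pid": int(fields[1]),
        "cpu_percent": cpu_percent,
        "uuid": uuid_match.group(1) if uuid_match else None,
        "vcpus": vcpus,
        # Rough per-vCPU load: total %CPU divided by the vCPU count.
        "cpu_per_vcpu": cpu_percent / vcpus if vcpus else None,
    }


# Shortened copy of the ps line above (most qemu arguments omitted).
sample = (
    "libvirt+ 12857 198 4.8 4760448 3174032 ? Sl 14:15 114:10 "
    "/usr/bin/qemu-system-x86_64 -name instance-000002ba -S "
    "-smp 2,sockets=2,cores=1,threads=1 "
    "-uuid 7742acc2-a1a4-4e61-9cb7-78530abc023f"
)
info = parse_qemu_ps_line(sample)
print(info)
```

For this process the sketch reports 198% CPU across 2 vCPUs, that is, roughly 99% per vCPU, which is consistent with both vCPUs of the guest being fully busy.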

Vijay Anand (vijanand) wrote : Re: [Bug 1507882] Re: Few (vFirefly) instances hungs after boot and are unreachable

Hi Hari,
 We are hitting this issue consistently, and it is blocking scale testing. We spawned 10 instances on a single compute node, and 1 or 2 of them go into an unresponsive state. Is there any workaround or suggestion?

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 2838 libvirt+ 20 0 33.026g 0.018t 12860 S 104.9 29.7 9015:59 qemu-system-x86
 5612 libvirt+ 20 0 4760452 3.025g 12700 S 14.9 4.8 4:03.13 qemu-system-x86
 5415 libvirt+ 20 0 4760452 3.025g 12700 S 14.3 4.8 4:10.39 qemu-system-x86
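
When many instances run on one compute node, the same check can be applied to every row of the `top` output above. This is a rough sketch, not an established procedure: the helper name `busy_qemu_pids` and the 100% threshold are assumptions chosen for illustration, and the column positions assume the default `top` layout shown in the header above.

```python
def busy_qemu_pids(top_rows, cpu_threshold=100.0):
    """Return (pid, cpu_percent) for qemu rows at or above cpu_threshold."""
    busy = []
    for row in top_rows:
        cols = row.split()
        # Skip the header line and anything that is not a full process row.
        if len(cols) < 12 or not cols[0].isdigit():
            continue
        pid, cpu_percent, command = int(cols[0]), float(cols[8]), cols[11]
        if command.startswith("qemu") and cpu_percent >= cpu_threshold:
            busy.append((pid, cpu_percent))
    return busy


# Rows copied from the top output above.
rows = [
    "  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND",
    " 2838 libvirt+ 20 0 33.026g 0.018t 12860 S 104.9 29.7 9015:59 qemu-system-x86",
    " 5612 libvirt+ 20 0 4760452 3.025g 12700 S 14.9 4.8 4:03.13 qemu-system-x86",
    " 5415 libvirt+ 20 0 4760452 3.025g 12700 S 14.3 4.8 4:10.39 qemu-system-x86",
]
busy = busy_qemu_pids(rows)
print(busy)
```

A high %CPU value alone does not prove that a guest is hung, so any PID flagged this way still has to be checked against the instance console and its reachability.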

root@csp-sol-lexus:~# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 44
Stepping: 2
CPU MHz: 1600.000
BogoMIPS: 5331.78
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 12288K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23
root@csp-sol-lexus:~#

Regards
Vijay


