instance stuck at "booting from hard disk"

Bug #1688496 reported by fxpester
This bug affects 3 people
Affects: kolla
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Kolla 4.0.0 installed from pip, default AIO (all-in-one) deployment, on Xenial.

When booting an image, the instance gets stuck at "booting from hard disk". I tried CirrOS (it prints "GRUB" before getting stuck) and Ubuntu.

On the host machine I installed the virt utilities, and a VM boots successfully with:
virt-install -n test2 -r 1024 --disk path=/tmp/trusty-server-cloudimg-amd64-disk1.qcow2,bus=virtio,format=qcow2 --graphics vnc,listen=0.0.0.0 --nonetworks --noautoconsole -v --import

After that I disabled libvirtd and started Kolla's containers; as a result, VMs get stuck again.

qemu-kvm.x86_64 10:1.5.3-126.el7_3.6
qemu.x86_64 2:2.0.0-1.el7.6
libvirt.x86_64 2.0.0-10.el7_3.5

Eduardo Gonzalez (egonzalez90) wrote :

Hi, can you share your globals.yml config, any custom config under /etc/kolla/config/*, and the nova and libvirt logs from the hypervisor?

Regards

fxpester (a-yurtaykin) wrote :

The logs include my debugging attempts, so they can't tell much: http://paste.openstack.org/show/608944/

fxpester (a-yurtaykin) wrote :

Just reproduced it on Xenial after 'apt-get update && apt-get upgrade'.

1) No SELinux at all on Ubuntu.

2)
root@ubuntu-xenial:~# docker info
Containers: 31
 Running: 31
 Paused: 0
 Stopped: 0
Images: 30
Server Version: 17.04.0-ce
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 208
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary:
containerd version: 422e31ce907fd9c3833a38d7b8fdd023e5a76e73
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-62-generic
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.635GiB
Name: ubuntu-xenial
ID: EQW5:QZYG:5WPQ:7SIN:MDZH:UI6O:IUOX:ST4T:35HO:YF5F:ZCMO:2ILM
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

3)
root@ubuntu-xenial:~# ps -ef | grep libvirt
root 15196 15183 0 14:29 ? 00:00:01 /usr/sbin/libvirtd --listen
42427 26340 15167 99 14:48 ? 00:05:32 /usr/libexec/qemu-kvm -name guest=instance-00000001,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-instance-00000001/master-key.aes -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off -cpu Broadwell,+vme,+ds,+ss,+vmx,+osxsave,+f16c,+rdrand,+hypervisor,+arat,+tsc_adjust,+xsaveopt,+pdpe1gb,+abm,+rtm,+hle -m 64 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 48aa739c-cb8f-4049-b82d-824b79cc9f8b -smbios type=1,manufacturer=RDO,product=OpenStack Compute,version=15.0.0-1.el7,serial=564d14fa-e7c8-2496-6f09-0dcf4fc4d276,uuid=48aa739c-cb8f-4049-b82d-824b79cc9f8b,family=Virtual Machine -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1-instance-00000001/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/48aa739c-cb8f-4049-b82d-824b79cc9f8b/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:5a:69:d6,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/48aa739c-cb8f-4049-b82d-824b79cc9f8b/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 192.168.1.10:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
root 27439 2053 0 14:54 pts/1 00:00:00 grep --color=auto libvirt

4)...


fxpester (a-yurtaykin) wrote :

Quick fix:
In the virsh XML settings, change
-machine pc-i440fx-rhel7.3.0

to:
-machine rhel6.5.0
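
In the libvirt domain XML, the value behind qemu's -machine flag is the machine attribute of the <os><type> element. The following is a minimal sketch of that change, assuming a heavily trimmed, hypothetical domain XML (real `virsh dumpxml` output contains many more elements); the helper name set_machine_type is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed domain XML for illustration only.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>instance-00000001</name>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type>
  </os>
</domain>
"""


def set_machine_type(domain_xml, machine):
    """Return a copy of the domain XML with <os><type machine=...> replaced."""
    root = ET.fromstring(domain_xml)
    os_type = root.find("./os/type")
    if os_type is None:
        raise ValueError("domain XML has no <os><type> element")
    os_type.set("machine", machine)
    return ET.tostring(root, encoding="unicode")


patched = set_machine_type(SAMPLE_DOMAIN_XML, "rhel6.5.0")
print(patched)
```

Note that nova regenerates the domain XML for its instances, so a manual edit like this is only useful as a quick experiment.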

Eduardo Gonzalez (egonzalez90) wrote :

The machine type is passed correctly from the hypervisor to the container (if compute nodes run on RHEL/CentOS 7). Not sure why this fails when the machine type is correct. It might be something with VMware.

Rather than modifying libvirt.xml, change hw_machine_type = None in the [libvirt] section of nova.conf to:

[libvirt]
hw_machine_type = rhel6.5.0

You can check which capabilities are supported by running the following on both the host and in the libvirt container, then comparing the output for differences:
virsh capabilities
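
One way to do that comparison is to extract the machine types listed per guest architecture from each output and diff them. This is a rough sketch, assuming the two outputs were saved as text; the sample XML strings are hypothetical and heavily trimmed, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed `virsh capabilities` outputs for illustration only.
HOST_CAPS = """
<capabilities>
  <guest>
    <os_type>hvm</os_type>
    <arch name='x86_64'>
      <machine>pc-i440fx-rhel7.3.0</machine>
      <machine>rhel6.5.0</machine>
    </arch>
  </guest>
</capabilities>
"""

CONTAINER_CAPS = """
<capabilities>
  <guest>
    <os_type>hvm</os_type>
    <arch name='x86_64'>
      <machine>pc-i440fx-rhel7.3.0</machine>
    </arch>
  </guest>
</capabilities>
"""


def machine_types(capabilities_xml):
    """Map each guest arch name to the set of machine types it lists."""
    root = ET.fromstring(capabilities_xml)
    result = {}
    for arch in root.iterfind("./guest/arch"):
        names = {m.text for m in arch.iterfind(".//machine") if m.text}
        result.setdefault(arch.get("name"), set()).update(names)
    return result


def diff_machine_types(host_xml, container_xml):
    """Report machine types present on only one side, per architecture."""
    host = machine_types(host_xml)
    container = machine_types(container_xml)
    report = {}
    for arch in sorted(set(host) | set(container)):
        h = host.get(arch, set())
        c = container.get(arch, set())
        if h != c:
            report[arch] = {
                "host_only": sorted(h - c),
                "container_only": sorted(c - h),
            }
    return report


print(diff_machine_types(HOST_CAPS, CONTAINER_CAPS))
```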

Regards

Eduardo Gonzalez (egonzalez90) wrote :

Sorry, I gave the wrong value for that option; it should be:

[libvirt]
hw_machine_type=x86_64=pc-i440fx-rhel7.3.0

Or, for multiple KVM architectures:

[libvirt]
hw_machine_type=x86_64=pc-i440fx-rhel7.3.0,aarch64=virt-rhel7.3.0,ppc64=pseries-rhel7.3.0,ppc64le=pseries-rhel7.3.0

Ensure the host is running CentOS/RHEL 7.3.0.
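
The value is a comma-separated list of arch=machine pairs. The sketch below illustrates that format by reading a hypothetical nova.conf fragment with Python's configparser and splitting the value into a mapping. It is not nova's own parser, and the helper name parse_hw_machine_type is an assumption made for this example.

```python
import configparser

# Hypothetical nova.conf fragment for illustration only.
NOVA_CONF_SNIPPET = """
[libvirt]
hw_machine_type = x86_64=pc-i440fx-rhel7.3.0,aarch64=virt-rhel7.3.0
"""


def parse_hw_machine_type(value):
    """Split 'arch=machine,arch=machine' into a dict, rejecting bare values."""
    mapping = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        arch, sep, machine = item.partition("=")
        if not sep or not arch.strip() or not machine.strip():
            raise ValueError(f"expected arch=machine, got {item!r}")
        mapping[arch.strip()] = machine.strip()
    return mapping


parser = configparser.ConfigParser()
parser.read_string(NOVA_CONF_SNIPPET)
mapping = parse_hw_machine_type(parser["libvirt"]["hw_machine_type"])
print(mapping)
```

With this check, a bare value such as rhel6.5.0 (no arch prefix) is rejected with a ValueError instead of being silently accepted.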

Eduardo Gonzalez (egonzalez90) wrote :

Take care when changing the machine type, as it can lead to future issues with instance migration or upgrades.
See this Bugzilla entry:
https://bugzilla.redhat.com/show_bug.cgi?id=1393480

fxpester (a-yurtaykin) wrote :

I've been experimenting with this for some time with no luck yet; the error is "no pci slots".

I was sure it was some "new feature" from Red Hat, yet if I use Ubuntu as the container OS I get the same result (Ubuntu's default -machine is set to something like 'zesty').

And yes, I use vSphere Hypervisor 6.0 with "expose HW Virtualization bits" enabled to allow KVM.
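
Whether the hardware virtualization bits actually reach the guest can be checked by looking for the vmx (Intel) or svm (AMD) CPU flag in /proc/cpuinfo. Below is a small sketch of that check; the sample text is hypothetical and the function name hw_virt_flags is made up for illustration. The flag being present does not by itself mean nested KVM behaves correctly, as this bug shows.

```python
# Hypothetical /proc/cpuinfo excerpt for illustration only.
SAMPLE_CPUINFO = "processor\t: 0\nflags\t\t: fpu vme de pse vmx ssse3\n"


def hw_virt_flags(cpuinfo_text):
    """Return the hardware virtualization flags (vmx/svm) found in cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found


print(hw_virt_flags(SAMPLE_CPUINFO))
```

On a real machine, the text would come from reading /proc/cpuinfo, for example `hw_virt_flags(open("/proc/cpuinfo").read())`.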

Eduardo Gonzalez (egonzalez90) wrote :

I think the issue is with the virtualization capabilities exposed from ESXi to the VM; maybe something is wrong or differs from what the OS expects. Also ensure the VM uses the RHEL7 profile or `Other Linux`.

fxpester (a-yurtaykin) wrote :

Got the same results with devstack (master). Is the end of ESX near? X_X

Changed in kolla:
status: New → Invalid
fxpester (a-yurtaykin) wrote :

ESX 6.5 is affected too.

Better workaround:
Run `yum downgrade qemu-kvm-ev qemu-kvm-common-ev qemu-img-ev -y`, repeating it 4 times to reach version qemu-kvm-ev.x86_64 10:2.6.0-28.el7_3.3.1.

cat << EOF > /etc/kolla/config/nova/nova-compute.conf
[libvirt]
#virt_type = qemu
cpu_mode = none
EOF

kolla-ansible reconfigure

This is the only way I found to get KVM VMs on ESX.
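
For anyone scripting this, the override file contents can also be rendered programmatically. The sketch below uses Python's configparser to produce the same [libvirt] section as the heredoc above, minus the commented-out virt_type line, which configparser does not write. The function name is hypothetical, and the result is returned as a string rather than written under /etc/kolla.

```python
import configparser
import io


def render_nova_compute_override(cpu_mode="none"):
    """Render a nova-compute.conf override with only [libvirt] cpu_mode set."""
    parser = configparser.ConfigParser()
    parser["libvirt"] = {"cpu_mode": cpu_mode}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


override_text = render_nova_compute_override()
print(override_text)
```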

Changed in kolla:
status: Invalid → Confirmed
Eduardo Gonzalez (egonzalez90) wrote :

This is more of an ESXi + nested KVM issue than a Kolla bug. Marking as Won't Fix.

Please reopen if there is an issue we can fix on our side.

Regards

Changed in kolla:
status: Confirmed → Won't Fix
trakatelis (trakatelis) wrote :

An update to the quick fix proposed by Eduardo, for hosts running CentOS/RHEL 7.3.
In the [libvirt] section of /etc/nova/nova.conf, set hw_machine_type as follows:

hw_machine_type = x86_64=pc-i440fx-rhel7.2.0

It was tested and found to behave as expected.
