Brief Description
----------------
Instance schedules and appears to be active on the compute host, but the VM cannot be pinged from the natbox.
Severity
--------
Major
Steps to Reproduce
------------------
1. clean install
2. Hypervisors confirmed to be unlocked-enabled on compute-0 and compute-3 at ~14:33:09, and the application apply completed:
platform-integ-apps | 1.0-7 | platform-integration-manifest | manifest.yaml | applied | completed
stx-openstack | 1.0-17-centos-stable-versioned | armada-manifest | stx-openstack.yaml | applied | completed
fm-event.log:
2019-08-25T14:33:09.000 controller-0 fmManager: info { "event_log_id" : "275.001", "reason_text" : "Host compute-3 hypervisor is now unlocked-enabled", "entity_instance_id" : "host=compute-3.hypervisor=c933301a-bf29-4e60-a198-c36e9ba0dc5e", "severity" : "critical", "state" : "msg", "timestamp" : "2019-08-25 14:33:09.729915" }
2019-08-25T14:33:09.000 controller-0 fmManager: info { "event_log_id" : "275.001", "reason_text" : "Host compute-0 hypervisor is now unlocked-enabled", "entity_instance_id" : "host=compute-0.hypervisor=9266efa5-0cd9-4c09-a0da-c91b120ad41c", "severity" : "critical", "state" : "msg", "timestamp" : "2019-08-25 14:33:09.772201" }
2019-08-25T14:34:52.000 controller-0 fmManager: info { "event_log_id" : "750.004", "reason_text" : "Application Apply In Progress", "entity_instance_id" : "region=RegionOne.system=yow-cgcs-wildcat-113-121.k8s_application=stx-openstack", "severity" : "warning", "state" : "clear", "timestamp" : "2019-08-25 14:34:52.197055" }
2019-08-25T14:43:33.000 controller-0 fmManager: info { "event_log_id" : "700.108", "reason_text" : "Instance tenant1-vol_root-1 owned by tenant1 has been created", "entity_instance_id" : "tenant=8861432c-b06c-4b6e-be32-75da72a2e944.instance=c52f094e-2c75-4660-a4a4-14c99b1fcb04", "severity" : "critical", "state" : "msg", "timestamp" : "2019-08-25 14:43:33.720109" }
2019-08-25T14:43:46.000 controller-0 fmManager: info { "event_log_id" : "700.101", "reason_text" : "Instance tenant1-vol_root-1 is enabled on host compute-3", "entity_instance_id" : "tenant=8861432c-b06c-4b6e-be32-75da72a2e944.instance=c52f094e-2c75-4660-a4a4-14c99b1fcb04", "severity" : "critical", "state" : "msg", "timestamp" : "2019-08-25 14:43:45.861492" }
All hosts are enabled and available
[2019-08-25 14:42:23,675]
| id | hostname     | personality | administrative | operational | availability |
| 1  | controller-0 | controller  | unlocked       | enabled     | available    |
| 2  | storage-0    | storage     | unlocked       | enabled     | available    |
| 3  | storage-1    | storage     | unlocked       | enabled     | available    |
| 4  | compute-0    | worker      | unlocked       | enabled     | available    |
| 5  | compute-1    | worker      | unlocked       | enabled     | available    |
| 6  | compute-2    | worker      | unlocked       | enabled     | available    |
| 7  | compute-3    | worker      | unlocked       | enabled     | available    |
| 8  | compute-4    | worker      | unlocked       | enabled     | available    |
| 9  | controller-1 | controller  | unlocked       | enabled     | available    |
compute-0 and compute-3 are remote storage hosts:
| availability_zone | None                         |
| created_at        | 2019-08-25T14:40:33.000000   |
| deleted           | False                        |
| deleted_at        | None                         |
| hosts             | [u'compute-0', u'compute-3'] |
| id                | 8                            |
| name              | remote_storage_hosts         |
| properties        | stx_storage='remote'         |
| updated_at        | None                         |
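For reference, a minimal sketch of how an aggregate like this is typically set up with the openstack CLI (the exact commands used by the automation are not in the captured logs; the names and property values are taken from the output above):

# create the aggregate, label it for remote storage, and add the two hosts
openstack aggregate create remote_storage_hosts
openstack aggregate set --property stx_storage=remote remote_storage_hosts
openstack aggregate add host remote_storage_hosts compute-0
openstack aggregate add host remote_storage_hosts compute-3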
3. Flavor 1a04f50f-e589-4db7-8203-df29a81b0713 created with --property aggregate_instance_extra_specs:stx_storage=remote --property hw:mem_page_size=large
| disk                       | 2                                    |
| id                         | 1a04f50f-e589-4db7-8203-df29a81b0713 |
| name                       | flv_rootdisk                         |
| os-flavor-access:is_public | True                                 |
| properties                 |                                      |
| ram                        | 1024                                 |
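A minimal sketch of the flavor creation implied by step 3 (assembled from the values shown above; the vCPU count of 1 is an assumption matching the single-CPU qemu domain shown later, and the automation's exact invocation is not in the logs):

# create the remote-storage flavor with the extra specs from step 3
openstack flavor create --ram 1024 --disk 2 --vcpus 1 \
    --property aggregate_instance_extra_specs:stx_storage=remote \
    --property hw:mem_page_size=large \
    flv_rootdisk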
4. volume created
[2019-08-25 14:43:25,878] 301 DEBUG MainThread ssh.send :: Send 'openstack --os-username 'tenant1' --os-password 'Li69nux*' --os-project-name tenant1 --os-auth-url http://keystone.openstack.svc.cluster.local/v3 --os-user-domain-name Default --os-project-domain-name Default --os-identity-api-version 3 --os-interface internal --os-region-name RegionOne volume create --image=ec34c5ef-8270-44f5-9dab-83d5704272f6 --size=2 --bootable vol-tenant1-vol_root-1-1'
5. instance booted and lands on compute-3 (host with remote label)
[2019-08-25 14:43:29,724] 301 DEBUG MainThread ssh.send :: Send 'nova --os-username 'tenant1' --os-password 'Li69nux*' --os-project-name tenant1 --os-auth-url http://keystone.openstack.svc.cluster.local/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne boot --flavor=1a04f50f-e589-4db7-8203-df29a81b0713 --poll --boot-volume=03cb9c01-4657-4b07-84f4-73fd13109a2a --availability-zone=nova:compute-3 --key-name=keypair-tenant1 --nic net-id=4c530bc7-06db-46de-bb63-75b8004676c9 --nic net-id=02060741-2e80-4b6e-a740-c514ecef8722 tenant1-vol_root-1'
| c52f094e-2c75-4660-a4a4-14c99b1fcb04 | tenant1-vol_root-1 | ACTIVE | tenant1-net0=172.16.0.244; tenant2-mgmt-net=192.168.187.13 | | flv_rootdisk |
'sudo virsh domblklist instance-000001f1' (run on compute-3)
6. attempt to ping the instance's tenant2-mgmt-net address from the natbox
ping -c 3 192.168.187.13
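When the ping fails, a quick follow-up (a suggested diagnostic, not part of the automated run) is to confirm the neutron ports for the instance are ACTIVE:

openstack port list --server tenant1-vol_root-1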
Expected Behavior
------------------
The ping from the natbox to the VM's tenant2-mgmt-net address (192.168.187.13) should succeed.
Actual Behavior
----------------
The instance is reported ACTIVE, but the ping from the natbox to 192.168.187.13 fails and the VM is unreachable.
Reproducibility
---------------
Fails (ping failures) in all nova remote-storage tests on this load:
FAIL TestDefaultGuest::test_evacuate_vms_with_inst_backing[remote]
FAIL test_live_migrate_vm_positive[remote-0-0-None-2-volume-False]
FAIL test_live_migrate_vm_positive[remote-1-0-dedicated-1-volume-False]
FAIL test_live_migrate_vm_positive[remote-1-512-None-1-image-False]
FAIL test_live_migrate_vm_positive[remote-0-512-dedicated-2-image_with_vol-False]
FAIL test_live_migrate_vm_negative[remote-0-0-volume-True-LiveMigErr.BLOCK_MIG_UNSUPPORTED]
FAIL test_live_migrate_vm_negative[remote-1-0-volume-True-LiveMigErr.BLOCK_MIG_UNSUPPORTED]
FAIL test_live_migrate_vm_negative[remote-0-512-volume-True-LiveMigErr.BLOCK_MIG_UNSUPPORTED]
FAIL test_live_migrate_vm_negative[remote-0-512-image-True-LiveMigErr.BLOCK_MIG_UNSUPPORTED]
FAIL test_live_migrate_vm_negative[remote-0-0-image_with_vol-True-LiveMigErr.BLOCK_MIG_UNSUPPORTED]
FAIL test_cold_migrate_vm[remote-0-0-None-2-volume-confirm]
FAIL test_cold_migrate_vm[remote-1-0-None-1-volume-confirm]
FAIL test_cold_migrate_vm[remote-1-512-None-1-image-confirm]
FAIL test_cold_migrate_vm[remote-0-0-None-2-image_with_vol-confirm]
FAIL test_cold_migrate_vm[remote-0-0-None-2-volume-revert]
FAIL test_cold_migrate_vm[remote-1-512-None-1-volume-revert]
FAIL test_cold_migrate_vm[remote-0-0-None-1-image-revert]
FAIL test_cold_migrate_vm[remote-1-0-None-2-image_with_vol-revert]
FAIL test_migrate_vm_various_guest[ubuntu_14-1-1024-shared-volume]
FAIL test_migrate_vm_various_guest[ubuntu_14-2-1024-dedicated-image]
FAIL TestResizeSameHost::test_resize_vm_positive[remote-4_0_0-5_1_512-image]
FAIL TestResizeSameHost::test_resize_vm_positive[remote-4_1_512-5_2_1024-image]
FAIL TestResizeSameHost::test_resize_vm_positive[remote-4_1_512-4_1_0-image]
FAIL TestResizeSameHost::test_resize_vm_positive[remote-4_0_0-1_1_512-volume]
FAIL TestResizeSameHost::test_resize_vm_positive[remote-4_1_512-8_2_1024-volume]
FAIL TestResizeSameHost::test_resize_vm_positive[remote-4_1_512-0_1_0-volume]
FAIL TestResizeSameHost::test_resize_vm_negative[remote-5_0_0-0_0_0-image]
FAIL TestResizeSameHost::test_resize_vm_negative[remote-5_2_512-5_1_512-image]
FAIL TestResizeSameHost::test_resize_vm_negative[remote-1_1_512-1_0_512-volume]
FAIL TestResizeDiffHost::test_resize_different_comp_node[remote]
System Configuration
--------------------
2+5+2 (2 controllers, 5 workers, 2 storage hosts)
h/w Lab: WCP_113_121
Branch/Pull Time/Commit
-----------------------
2019-08-23_20-59-00
Timestamp/Logs
--------------
See attached console.log.
nfv-vim-events.log:
log-id = 976
event-id = hypervisor-state-change
event-type = state-event
event-context = admin
importance = high
entity = host=compute-0.hypervisor=9266efa5-0cd9-4c09-a0da-c91b120ad41c
reason_text = Host compute-0 hypervisor is now locked-enabled
additional_text =
timestamp = 2019-08-25 14:21:51.968209
====================================================================================
====================================================================================
log-id = 977
event-id = hypervisor-state-change
event-type = state-event
event-context = admin
importance = high
entity = host=compute-0.hypervisor=9266efa5-0cd9-4c09-a0da-c91b120ad41c
reason_text = Host compute-0 hypervisor is now locked-disabled
additional_text =
timestamp = 2019-08-25 14:21:52.481091
====================================================================================
====================================================================================
log-id = 978
event-id = hypervisor-state-change
event-type = state-event
event-context = admin
importance = high
entity = host=compute-3.hypervisor=c933301a-bf29-4e60-a198-c36e9ba0dc5e
reason_text = Host compute-3 hypervisor is now locked-enabled
additional_text =
timestamp = 2019-08-25 14:22:15.938086
====================================================================================
====================================================================================
log-id = 979
event-id = hypervisor-state-change
event-type = state-event
event-context = admin
importance = high
entity = host=compute-3.hypervisor=c933301a-bf29-4e60-a198-c36e9ba0dc5e
reason_text = Host compute-3 hypervisor is now locked-disabled
additional_text =
timestamp = 2019-08-25 14:22:16.508601
====================================================================================
====================================================================================
log-id = 980
event-id = hypervisor-state-change
event-type = state-event
event-context = admin
importance = high
entity = host=compute-3.hypervisor=c933301a-bf29-4e60-a198-c36e9ba0dc5e
reason_text = Host compute-3 hypervisor is now unlocked-disabled
additional_text =
timestamp = 2019-08-25 14:30:57.326260
====================================================================================
====================================================================================
log-id = 981
event-id = hypervisor-state-change
event-type = state-event
event-context = admin
importance = high
entity = host=compute-0.hypervisor=9266efa5-0cd9-4c09-a0da-c91b120ad41c
reason_text = Host compute-0 hypervisor is now unlocked-disabled
additional_text =
timestamp = 2019-08-25 14:30:57.328746
====================================================================================
====================================================================================
log-id = 982
event-id = hypervisor-state-change
event-type = state-event
event-context = admin
importance = high
entity = host=compute-3.hypervisor=c933301a-bf29-4e60-a198-c36e9ba0dc5e
reason_text = Host compute-3 hypervisor is now unlocked-enabled
additional_text =
timestamp = 2019-08-25 14:33:09.728937
====================================================================================
====================================================================================
log-id = 983
event-id = hypervisor-state-change
event-type = state-event
event-context = admin
importance = high
entity = host=compute-0.hypervisor=9266efa5-0cd9-4c09-a0da-c91b120ad41c
reason_text = Host compute-0 hypervisor is now unlocked-enabled
additional_text =
timestamp = 2019-08-25 14:33:09.731669
====================================================================================
====================================================================================
log-id = 984
event-id = instance-created
event-type = action-event
event-context = admin
importance = high
entity = tenant=8861432c-b06c-4b6e-be32-75da72a2e944.instance=c52f094e-2c75-4660-a4a4-14c99b1fcb04
reason_text = Instance tenant1-vol_root-1 owned by tenant1 has been created
additional_text =
timestamp = 2019-08-25 14:43:33.719109
====================================================================================
====================================================================================
log-id = 985
event-id = instance-enabled
event-type = action-event
event-context = admin
importance = high
entity = tenant=8861432c-b06c-4b6e-be32-75da72a2e944.instance=c52f094e-2c75-4660-a4a4-14c99b1fcb04
reason_text = Instance tenant1-vol_root-1 is enabled on host compute-3
additional_text =
timestamp = 2019-08-25 14:43:45.860829
====================================================================================
=====================================================
instance-000001f1.log (compute-3)
2019-08-25 14:43:41.399+0000: starting up libvirt version: 4.7.0, package: 1.tis.101 (Unknown, 2019-08-19-05:07:36, 4d0394103521), qemu version: 3.0.0qemu-kvm-ev-3.0.0-0.tis.100, kernel: 3.10.0-957.12.2.el7.3.tis.x86_64, hostname: compute-3
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOME=/root QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=instance-000001f1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-instance-000001f1/master-key.aes -machine pc-i440fx-3.0,accel=kvm,usb=off,dump-guest-core=off -cpu Skylake-Client,ss=on,vmx=on,hypervisor=on,tsc_adjust=on,pdpe1gb=on,topoext=on,mpx=off,xsavec=off,xgetbv1=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/huge-2048kB/libvirt/qemu/1-instance-000001f1,share=yes,size=1073741824,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0,memdev=ram-node0 -uuid c52f094e-2c75-4660-a4a4-14c99b1fcb04 -smbios 'type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=19.0.2,serial=c52f094e-2c75-4660-a4a4-14c99b1fcb04,uuid=c52f094e-2c75-4660-a4a4-14c99b1fcb04,family=Virtual Machine' -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=27,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -object secret,id=virtio-disk0-secret0,data=a81iLj+PpI1kzyeEkq01/97kNMZUB90sSNE5tHygfBk=,keyid=masterKey0,iv=nMkVDBVEKtNIrgZgVyZEOg==,format=base64 -drive 'file=rbd:cinder-volumes/03cb9c01-4657-4b07-84f4-73fd13109a2a:id=cinder:auth_supported=cephx\;none:mon_host=192.168.222.3\:6789\;192.168.222.4\:6789\;192.168.222.231\:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0,cache=writeback,discard=unmap' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on,serial=03cb9c01-4657-4b07-84f4-73fd13109a2a -chardev socket,id=charnet0,path=/var/run/openvswitch/vhue53950d9-03,server -netdev vhost-user,chardev=charnet0,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:ff:97:95,bus=pci.0,addr=0x3 -chardev socket,id=charnet1,path=/var/run/openvswitch/vhu746a78be-31,server -netdev vhost-user,chardev=charnet1,id=hostnet1 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=fa:16:3e:4b:d5:06,bus=pci.0,addr=0x4 -chardev pty,id=charserial0,logfile=/var/lib/nova/instances/c52f094e-2c75-4660-a4a4-14c99b1fcb04/console.log,logappend=off -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
2019-08-25 14:43:41.399+0000: Domain id=1 is tainted: high-privileges
2019-08-25T14:43:41.637854Z qemu-kvm: -chardev socket,id=charnet0,path=/var/run/openvswitch/vhue53950d9-03,server: info: QEMU waiting for connection on: disconnected:unix:/var/run/openvswitch/vhue53950d9-03,server
2019-08-25T14:43:42.629695Z qemu-kvm: -chardev socket,id=charnet1,path=/var/run/openvswitch/vhu746a78be-31,server: info: QEMU waiting for connection on: disconnected:unix:/var/run/openvswitch/vhu746a78be-31,server
2019-08-25T14:43:42.630216Z qemu-kvm: -chardev pty,id=charserial0,logfile=/var/lib/nova/instances/c52f094e-2c75-4660-a4a4-14c99b1fcb04/console.log,logappend=off: char device redirected to /dev/pts/0 (label charserial0)
2019-08-25T14:48:37.210073Z qemu-kvm: terminating on signal 15 from pid 45400 (/usr/sbin/libvirtd)
2019-08-25 14:48:37.418+0000: shutting down, reason=shutdown
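The qemu command line above attaches both vNICs via vhost-user sockets (vhue53950d9-03 and vhu746a78be-31). A suggested diagnostic on compute-3 (not part of the original run; the exact container context depends on the deployment) is to confirm those ports are present and connected in OVS:

# check that both vhost-user ports exist in the OVS configuration
sudo ovs-vsctl show | grep -B 2 -A 4 'vhue53950d9-03\|vhu746a78be-31'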
See nova-compute.log on compute-3:
{"log":"2019-08-25 14:43:45.762 48388 INFO nova.compute.manager [req-35bccecb-42cd-45d8-82f4-b57a10790497 98bddb3440ec451097276bce4a529811 8861432cb06c4b6ebe3275da72a2e944 - default default] [instance: c52f094e-2c75-4660-a4a4-14c99b1fcb04] Took 9.12 seconds to spawn the instance on the hypervisor.\n","stream":"stdout","time":"2019-08-25T14:43:45.762576429Z"}
{"log":"2019-08-25 14:43:45.762 48388 INFO nova.compute.manager [req-35bccecb-42cd-45d8-82f4-b57a10790497 98bddb3440ec451097276bce4a529811 8861432cb06c4b6ebe3275da72a2e944 - default default] [instance: c52f094e-2c75-4660-a4a4-14c99b1fcb04] Took 9.12 seconds to spawn the instance on the hypervisor.\n","stream":"stdout","time":"2019-08-25T14:43:45.762842977Z"}
{"log":"2019-08-25 14:43:45.869 48388 INFO nova.compute.manager [req-35bccecb-42cd-45d8-82f4-b57a10790497 98bddb3440ec451097276bce4a529811 8861432cb06c4b6ebe3275da72a2e944 - default default] [instance: c52f094e-2c75-4660-a4a4-14c99b1fcb04] Took 12.12 seconds to build instance.\n","stream":"stdout","time":"2019-08-25T14:43:45.869641994Z"}
{"log":"2019-08-25 14:43:45.869 48388 INFO nova.compute.manager [req-35bccecb-42cd-45d8-82f4-b57a10790497 98bddb3440ec451097276bce4a529811 8861432cb06c4b6ebe3275da72a2e944 - default default] [instance: c52f094e-2c75-4660-a4a4-14c99b1fcb04] Took 12.12 seconds to build instance.\n","stream":"stdout","time":"2019-08-25T14:43:45.870048087Z"}
{"log":"2019-08-25 14:43:45.927 48388 INFO nova.compute.manager [req-19bf0eaa-664d-4c52-b83c-15c5916e335f - - - - -] [instance: c52f094e-2c75-4660-a4a4-14c99b1fcb04] During sync_power_state the instance has a pending task (spawning). Skip.\n","stream":"stdout","time":"2019-08-25T14:43:45.928122108Z"}
{"log":"2019-08-25 14:43:45.927 48388 INFO nova.compute.manager [req-19bf0eaa-664d-4c52-b83c-15c5916e335f - - - - -] [instance: c52f094e-2c75-4660-a4a4-14c99b1fcb04] During sync_power_state the instance has a pending task (spawning). Skip.\n","stream":"stdout","time":"2019-08-25T14:43:45.928420437Z"}
{"log":"2019-08-25 14:43:45.928 48388 INFO nova.compute.manager [req-19bf0eaa-664d-4c52-b83c-15c5916e335f - - - - -] [instance: c52f094e-2c75-4660-a4a4-14c99b1fcb04] VM Resumed (Lifecycle Event)\n","stream":"stdout","time":"2019-08-25T14:43:45.92878652Z"}
{"log":"2019-08-25 14:43:45.928 48388 INFO nova.compute.manager [req-19bf0eaa-664d-4c52-b83c-15c5916e335f - - - - -] [instance: c52f094e-2c75-4660-a4a4-14c99b1fcb04] VM Resumed (Lifecycle Event)\n","stream":"stdout","time":"2019-08-25T14:43:45.929021138Z"}
{"log":"2019-08-25 14:47:25.866 48388 INFO nova.compute.manager [req-efa4602f-076f-4884-befe-36c778536245 bdafc7345149445a9866032291963ce5 daf1e68dadf94a318a9e1597c2b10b7f - default default] [instance: c52f094e-2c75-4660-a4a4-14c99b1fcb04] Get console output\n","stream":"stdout","time":"2019-08-25T14:47:25.866587624Z"}
{"log":"2019-08-25 14:47:25.866 48388 INFO nova.compute.manager [req-efa4602f-076f-4884-befe-36c778536245 bdafc7345149445a9866032291963ce5 daf1e68dadf94a318a9e1597c2b10b7f - default default] [instance: c52f094e-2c75-4660-a4a4-14c99b1fcb04] Get console output\n","stream":"stdout","time":"2019-08-25T14:47:25.866809681Z"}
Same issue experienced after clean install of the following load on different h/w (2.0_build, lab ip 7-12):
2019-08-26_20-59-00
JOB="StarlingX_
Will be tracked with LP-1841660.
Launched a single instance after clean install (flavor has hw:cpu_policy=dedicated): 5499-455e-a3a1-875b1f6e173c, created Aug. 28, 2019, 17:08, 0f63df81938e4f8d923cbd0f6bb7ba02
Instance appears active but cannot be reached from the natbox.
req-a826992c-