Live Migration Error: Failed to live migrate instance to host "AUTO_SCHEDULE".

Bug #1837256 reported by Ricardo Perez
This bug affects 1 person

Affects: StarlingX
Status: Invalid
Importance: High
Assigned to: Ricardo Perez

Bug Description

Brief Description
-----------------
In a newly installed StarlingX (Duplex) system, create a few VMs and wait until all of them are in the Running state. If you then go to the Horizon interface and try to perform a live migration, the following error is thrown: "Error: Failed to live migrate instance to host "AUTO_SCHEDULE"."

Severity
--------
<Critical: System/Feature is not usable due to the defect> The Live Migration Feature isn't working.

Steps to Reproduce
------------------
1.- Follow the steps described here to set up a Duplex configuration:
https://wiki.openstack.org/wiki/StarlingX/Containers/InstallationOnAIODX

2.- Add the hw:mem_page_size property to the flavor (just to have ping / connectivity available):
openstack flavor set $UUID_my_flavor --property hw:mem_page_size=large

3.- Create an image:
openstack image create --container-format bare --disk-format qcow2 --file cirros-0.3.4-x86_64-disk.img cirros
4.- Create some VMs; in my case I created 6 with the same flavor / image and a different name for each one (see the loop sketch after these steps):
   openstack server create --image cirros --flavor my_tiny --network public-net0 vm1
5.- Go to the OpenStack Horizon interface:
http://$OAM_IP:31000 ----> Admin - Instances

6.- In the Instances menu, select any VM (all 6 VMs created before should be in the Running state) and, from the Actions drop-down menu, select "Live Migrate Instance".

7.- Leave "Automatically schedule new host" selected.

8.- Click Submit.

9.- You will get the error shown in the description.
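
For step 4, a minimal shell loop to create the 6 VMs (a sketch, assuming the same image, flavor, and network as above; the vm$i names are just an example):

   for i in 1 2 3 4 5 6; do
       openstack server create --image cirros --flavor my_tiny --network public-net0 vm$i
   done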

Expected Behavior
------------------
The VM should be migrated with no issues to the other available controller (in the case of Duplex).

Actual Behavior
----------------
The VM is not migrated and the following error is shown in Horizon: Error: Failed to live migrate instance to host "AUTO_SCHEDULE".

Reproducibility
---------------
<Reproducible in a freshly installed system>

System Configuration
--------------------
<Two node system / Duplex>

Branch/Pull Time/Commit
-----------------------
controller-0:~$ cat /etc/build.info
###
### StarlingX
### Built from master
###

OS="centos"
SW_VERSION="19.01"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="20190715T233000Z"

JOB="STX_build_master_master"
<email address hidden>"
BUILD_NUMBER="182"
BUILD_HOST="starlingx_mirror"
BUILD_DATE="2019-07-15 23:30:00 +0000"

Last Pass
---------
Unknown

Timestamp/Logs
--------------

/var/log/nfv-vim.log

2019-07-19T14:46:59.171 controller-0 VIM_Thread[802041] DEBUG _vim_nfvi_events.py.235 Instance state-change, nfvi_instance={'attached_volut': None, 'name': u'richo6', 'recovery_priority': None, 'tenant_id': '744e785d-d011-41c5-8d78-30e374c99cc9', 'avail_status': [], 'nfvi_data': {'vm_state': u'active', 'task_state': 'none', 'powsupport': None, 'instance_type': None, 'oper_state': 'enabled', 'host_name': u'controller-0', 'admin_state': 'unlocked', 'action': '', 'image_uuid': u'e94eb945-44f0-4c94-b657-423c4b9af222', 'ud12af6771824'}.
2019-07-19T14:54:25.716 controller-0 VIM_Thread[802041] DEBUG _vim_nfvi_events.py.304 Instance action, uuid=ce777205-36f8-49c8-a786-29cca4e action, type=live-migrate, params={'host': None, 'block-migration': False}, state=initial, reason=
2019-07-19T14:54:26.057 controller-0 VIM_Thread[802041] DEBUG _vim_nfvi_events.py.235 Instance state-change, nfvi_instance={'attached_volut': None, 'name': u'richo1', 'recovery_priority': None, 'tenant_id': '744e785d-d011-41c5-8d78-30e374c99cc9', 'avail_status': [], 'nfvi_data': {'vm_state': u'active', 'task_state': u'migrating'ation_support': None, 'instance_type': None, 'oper_state': 'enabled', 'host_name': u'controller-0', 'admin_state': 'unlocked', 'action': 'migrating', 'image_uuid': u'e94eb945-44f0-4c94-b657-4236f8-49c8-a786-29cca41e4035'}.
2019-07-19T14:54:27.633 controller-0 VIM_Thread[802041] DEBUG _vim_nfvi_events.py.235 Instance state-change, nfvi_instance={'attached_volut': None, 'name': u'richo1', 'recovery_priority': None, 'tenant_id': '744e785d-d011-41c5-8d78-30e374c99cc9', 'avail_status': [], 'nfvi_data': {'vm_state': u'active', 'task_state': 'none', 'powsupport': None, 'instance_type': None, 'oper_state': 'enabled', 'host_name': u'controller-0', 'admin_state': 'unlocked', 'action': '', 'image_uuid': u'e94eb945-44f0-4c94-b657-423c4b9af222', 'u29cca41e4035'}.
2019-07-19T14:54:27.685 controller-0 VIM_Thread[802041] ERROR Caught exception while trying to live migrate an instance, error=[OpenStack Rest-API Exception: method=POST, url=http://nova-api.openstack.svc.cluster.local:8774/v2.1/744e785dd01141c58d7830e374c99cc9/servers/ce777205-36f8-49c8-a786-29cca41e4035/action, headers={'Content-Type': 'application/json'}, body={"os-migrateLive": {"disk_over_commit": false, "block_migration": false, "host": null}}, status_code=400, reason=HTTP Error 400: Bad Request, response_headers=[('Content-Length', '184'), ('X-Compute-Request-Id', 'req-83334395-991d-4c4b-aedf-69a294a5b3f7'), ('Vary', 'OpenStack-API-Version, X-OpenStack-Nova-API-Version'), ('Openstack-Api-Version', 'compute 2.1'), ('X-Openstack-Nova-Api-Version', '2.1'), ('Date', 'Fri, 19 Jul 2019 14:54:27 GMT'), ('Content-Type', 'application/json; charset=UTF-8'), ('X-Openstack-Request-Id', 'req-83334395-991d-4c4b-aedf-69a294a5b3f7')], response_body={"badRequest": {"message": "controller-0 is not on shared storage: Shared storage live-migration requires either shared storage or boot-from-volume with no local disks.", "code": 400}}].
OpenStackRestAPIException: [OpenStack Rest-API Exception: method=POST, url=http://nova-api.openstack.svc.cluster.local:8774/v2.1/744e785dd01141c58d7830e374c99cc9/servers/ce777205-36f8-49c8-a786-29cca41e4035/action, headers={'Content-Type': 'application/json'}, body={"os-migrateLive": {"disk_over_commit": false, "block_migration": false, "host": null}}, status_code=400, reason=HTTP Error 400: Bad Request, response_headers=[('Content-Length', '184'), ('X-Compute-Request-Id', 'req-83334395-991d-4c4b-aedf-69a294a5b3f7'), ('Vary', 'OpenStack-API-Version, X-OpenStack-Nova-API-Version'), ('Openstack-Api-Version', 'compute 2.1'), ('X-Openstack-Nova-Api-Version', '2.1'), ('Date', 'Fri, 19 Jul 2019 14:54:27 GMT'), ('Content-Type', 'application/json; charset=UTF-8'), ('X-Openstack-Request-Id', 'req-83334395-991d-4c4b-aedf-69a294a5b3f7')], response_body={"badRequest": {"message": "controller-0 is not on shared storage: Shared storage live-migration requires either shared storage or boot-from-volume with no local disks.", "code": 400}}]
2019-07-19T14:54:27.688 controller-0 VIM_Thread[802041] DEBUG _instance_task_work.py.110 Live-Migrate-Instance callback for richo1, response=controller-0 is not on shared storage: shared storage live-migration requires either shared storage or boot-from-volume with no local disks'}.

Test Activity
-------------
[Regression Testing]

Revision history for this message
Ricardo Perez (richomx) wrote :
Ghada Khalil (gkhalil)
tags: added: stx.distro.other
tags: added: stx.distro.openstack
removed: stx.distro.other
Revision history for this message
Ghada Khalil (gkhalil) wrote :

This may be a configuration issue given the error msg returned by nova:
"controller-0 is not on shared storage: Shared storage live-migration requires either shared storage or boot-from-volume with no local disks." (code 400)

Assigning to the openstack team to review and provide the correct steps if applicable.

description: updated
Changed in starlingx:
assignee: nobody → Shuquan Huang (shuquan)
Revision history for this message
yong hu (yhu6) wrote :

Maybe a duplicate of https://bugs.launchpad.net/starlingx/+bug/1837759.
Even with the command line, live migration didn't work.

Changed in starlingx:
importance: Undecided → High
Revision history for this message
Ghada Khalil (gkhalil) wrote :

I asked Chris Winnicki, the reporter of https://bugs.launchpad.net/starlingx/+bug/1837759, and he said that the failure signature in this bug is different.

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Adding the stx.2.0 label since Yong Hu (distro.openstack PL) marked this bug as high priority.

tags: added: stx.2.0
Changed in starlingx:
status: New → Triaged
Revision history for this message
Gerry Kopec (gerry-kopec) wrote :

I suspect that the "Block Migration" box was not checked. When using Horizon, if the instance is booted from an image (as in this case), then "Block Migration" must be selected on the "Live Migrate" panel. If booted from a volume, then "Block Migration" must not be selected.

If live migration is run from the CLI, e.g. "nova live-migration <instance-uuid>", then no option is required. This is because more recent nova API microversions (>= 2.25) support the "auto" option, which determines the storage backing and does the appropriate steps. However, since Horizon appears to be using an old API microversion, "Block Migration" must be specified explicitly.

Suggest a re-test.
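
For example (a sketch; exact option names may vary by python-novaclient release):

   # Boot-from-image instance on local disks: request block migration explicitly
   nova live-migration --block-migrate <instance-uuid>

   # Boot-from-volume instance: plain live migration, no block migration
   nova live-migration <instance-uuid>

With API microversion >= 2.25, the second form covers both cases, since block_migration defaults to "auto".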

hutianhao27 (hutianhao)
Changed in starlingx:
assignee: Shuquan Huang (shuquan) → hutianhao27 (hutianhao)
Revision history for this message
hutianhao27 (hutianhao) wrote :

I tried to reproduce it with Duplex StarlingX, and the results are just as Gerry said. When using Horizon, if the instance is booted from an image, then "Block Migration" must be selected on the "Live Migrate" panel. If booted from a volume, then "Block Migration" must not be selected. Otherwise live migration will fail. So I think this may be an operational problem.

Revision history for this message
yong hu (yhu6) wrote :

@tianhao, from the original log, could you confirm the issue was caused by a wrong user setting?
If so, we can set this LP as invalid.

Revision history for this message
hutianhao27 (hutianhao) wrote :

This is part of the original log:

OpenStackRestAPIException: [OpenStack Rest-API Exception: method=POST, url=http://nova-api.openstack.svc.cluster.local:8774/v2.1/744e785dd01141c58d7830e374c99cc9/servers/ce777205-36f8-49c8-a786-29cca41e4035/action, headers={'Content-Type': 'application/json'}, body={"os-migrateLive": {"disk_over_commit": false, "block_migration": false, "host": null}}, status_code=400, reason=HTTP Error 400: Bad Request, response_headers=[('Content-Length', '184'), ('X-Compute-Request-Id', 'req-83334395-991d-4c4b-aedf-69a294a5b3f7'), ('Vary', 'OpenStack-API-Version, X-OpenStack-Nova-API-Version'), ('Openstack-Api-Version', 'compute 2.1'), ('X-Openstack-Nova-Api-Version', '2.1'), ('Date', 'Fri, 19 Jul 2019 14:54:27 GMT'), ('Content-Type', 'application/json; charset=UTF-8'), ('X-Openstack-Request-Id', 'req-83334395-991d-4c4b-aedf-69a294a5b3f7')], response_body={"badRequest": {"message": "controller-0 is not on shared storage: Shared storage live-migration requires either shared storage or boot-from-volume with no local disks.", "code": 400}}]

And we can find that block_migration is false, and we can see from the response_body: "controller-0 is not on shared storage: Shared storage live-migration requires either shared storage or boot-from-volume with no local disks".

And the following is part of nova-compute-compute-0-75ea0372-kpzgj_openstack_nova-compute-a8e443c58a184ebd3cd4067ab8f11bda6c1b5da9c05c7ac8f995a1933c5be260.log in our environment.

{"log":"2019-08-01 09:43:48.480 55894 ERROR oslo_messaging.rpc.server File \"//
var/lib/openstack/lib/python2.7/site-packages/nova/compute/manager.py\", line 633
47, in check_can_live_migrate_source\n","stream":"stdout","time":"2019-08-01T09::
43:48.495053539Z"}
{"log":"2019-08-01 09:43:48.480 55894 ERROR oslo_messaging.rpc.server block__
device_info)\n","stream":"stdout","time":"2019-08-01T09:43:48.495064399Z"}
{"log":"2019-08-01 09:43:48.480 55894 ERROR oslo_messaging.rpc.server \n","streaa
m":"stdout","time":"2019-08-01T09:43:48.495073449Z"}
{"log":"2019-08-01 09:43:48.480 55894 ERROR oslo_messaging.rpc.server File \"//
var/lib/openstack/lib/python2.7/site-packages/nova/virt/libvirt/driver.py\", linn
e 7317, in check_can_live_migrate_source\n","stream":"stdout","time":"2019-08-011
T09:43:48.495081458Z"}
{"log":"2019-08-01 09:43:48.480 55894 ERROR oslo_messaging.rpc.server raise
exception.InvalidLocalStorage(reason=reason, path=source)\n","stream":"stdout",""
time":"2019-08-01T09:43:48.495110138Z"}
{"log":"2019-08-01 09:43:48.480 55894 ERROR oslo_messaging.rpc.server \n","streaa
m":"stdout","time":"2019-08-01T09:43:48.495121889Z"}
{"log":"2019-08-01 09:43:48.480 55894 ERROR oslo_messaging.rpc.server InvalidLocc
alStorage: compute-1 is not on local storage: Block migration can not be used wii
th shared storage.\n","stream":"stdout","time":"2019-08-01T09:43:48.495130497Z"}
{"log":"2019-08-01 09:43:48.480 55894 ERROR oslo_messaging.rpc.server \n","streaa
m":"stdout","time":"2019-08-01T09:43:48.495139609Z"}

We can see from the log that block migration cannot be used with shared storage. So I think the issue was caused by a wrong user setting.
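
For reference, a sketch of the two os-migrateLive request bodies (pre-2.25 microversion); sending the one that does not match the instance's storage backing produces the 400 errors seen above:

   # Boot-from-image on local disks: block migration required
   {"os-migrateLive": {"disk_over_commit": false, "block_migration": true, "host": null}}

   # Boot-from-volume or shared storage: block migration must be false
   {"os-migrateLive": {"disk_over_commit": false, "block_migration": false, "host": null}}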

Revision history for this message
yong hu (yhu6) wrote :

@Ricardo,
please see the analysis from @Tianhao: "We can see from the log that block migration cannot be used with shared storage. So I think the issue was caused by a wrong user setting."

Can you re-test this case with the appropriate storage configuration?

Changed in starlingx:
assignee: hutianhao27 (hutianhao) → Ricardo Perez (richomx)
Revision history for this message
Ricardo Perez (richomx) wrote :

@Yong, I have tested this scenario again following the comments:

Block Migration option selected when the VM has been booted from an image.
Live Migration option directly (Block Migration not selected) when the VM has been booted from a volume.

Both cases were successful.

I have executed this operation back and forth several times, with the same result. So I believe you can ask for this bug to be closed.

All the experiments were performed using this StarlingX image version:

controller-0:~$ cat /etc/build.info
###
### StarlingX
### Built from master
###

OS="centos"
SW_VERSION="19.01"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="20190802T013000Z"

JOB="STX_build_master_master"
<email address hidden>"
BUILD_NUMBER="200"
BUILD_HOST="starlingx_mirror"
BUILD_DATE="2019-08-02 01:30:00 +0000"
controller-0:~$

Revision history for this message
yong hu (yhu6) wrote :

Thanks to Ricardo for the confirmation.
We can mark this LP as invalid.

Changed in starlingx:
status: Triaged → Invalid