Live migration should use the same memory over subscription logic as instance boot
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | High | Sylvain Bauza |
Bug Description
I encountered an issue when live migrating an instance to a specified target host. I expected the operation to succeed, but it failed for the reason below:
MigrationPreChe
1. My OpenStack cluster information:
1). There are two compute nodes in my cluster, and I created 4 instances(
-----------
mysql> select hypervisor_
+------
| hypervisor_hostname | vcpus | vcpus_used | running_vms | memory_mb | memory_mb_used | free_ram_mb | deleted |
+------
| hchenos1.
| hchenos2.
+------
2 rows in set (0.00 sec)
mysql>
-------
[root@hchenos ~]# nova list
+------
| ID | Name | Status | Networks |
+------
| a34f9b88-
| f6aaeff9-
| bbee57a2-
| 74fe26ec-
+------
[root@hchenos ~]#
2). I also enabled the ComputeFilter,
2. Under the above conditions, live migrating instance vm1 to hchenos2 failed:
[root@hchenos ~]# nova live-migration vm1 hchenos2
ERROR: Live migration of instance a34f9b88-
conductor log:
...
ckages/
I think the reason for the above is as follows:
the free_ram_mb for 'hchenos2' is 336M and the requested memory is 512M, so the operation fails.
free_ram_mb = memory_mb (1872) - 512(reserved_
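The arithmetic behind the strict pre-check can be sketched as follows. The total, reserved, and free values come from the report; the used value is inferred from them (1872 - 512 - 336 = 1024):

```python
# Strict free-RAM check applied by the live-migration pre-check.
# MEMORY_MB, RESERVED_MB, REQUEST_MB come from the report; USED_MB is
# inferred (1872 - 512 - 336 = 1024).
MEMORY_MB = 1872      # total RAM on hchenos2
RESERVED_MB = 512     # reserved host memory
USED_MB = 1024        # memory claimed by existing instances
REQUEST_MB = 512      # flavor 1 memory for vm1

free_ram_mb = MEMORY_MB - RESERVED_MB - USED_MB
print(free_ram_mb)                 # 336
print(free_ram_mb >= REQUEST_MB)   # False -> pre-check rejects vm1
```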
3. But booting an instance on 'hchenos2' succeeds:
[root@hchenos ~]# nova boot --image cirros-0.3.0-x86_64 --flavor 1 --availability-zone nova:hchenos2 xhu
[root@hchenos ~]# nova list
+------
| ID | Name | Status | Networks |
+------
| a34f9b88-
| f6aaeff9-
| bbee57a2-
| 74fe26ec-
| 364d1a01-
+------
[root@hchenos ~]#
mysql> select hypervisor_
+------
| hypervisor_hostname | vcpus | vcpus_used | running_vms | memory_mb | memory_mb_used | free_ram_mb | deleted |
+------
| hchenos1.
| hchenos2.
+------
2 rows in set (0.00 sec)
mysql>
So I'm very confused by the above test results: why does booting an instance on 'hchenos2' succeed, while live migrating an instance to the same host fails with "not enough memory"?
After carefully going through the nova source code (live_migrate.py: execute()), I think the following causes this issue:
1). The function '_check_
I think the free memory of host 'hchenos2' should be:
free_ram_mb = memory_mb (1872) * ram_allocation_
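For contrast, a sketch of the over-subscription-aware check the scheduler applies at boot time (RamFilter-style). The ratio of 1.5 is assumed here as the historical nova.conf default for ram_allocation_ratio; the other numbers are from the report:

```python
# Over-subscription-aware check used at boot (RamFilter-style sketch).
# RAM_ALLOCATION_RATIO = 1.5 is assumed (historical nova.conf default).
MEMORY_MB = 1872
USED_MB = 1024               # inferred from the report's free_ram_mb
REQUEST_MB = 512
RAM_ALLOCATION_RATIO = 1.5

memory_mb_limit = MEMORY_MB * RAM_ALLOCATION_RATIO
usable_ram = memory_mb_limit - USED_MB
print(usable_ram)                # 1784.0
print(usable_ram >= REQUEST_MB)  # True -> boot is scheduled
```

The same host that rejects a 512M live migration therefore happily accepts a 512M boot, which is exactly the inconsistency this bug is about.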
2). Why is only memory checked for the live migration target host, and not vcpus as well?
live_migrate.py: execute()
    if not self.destination:
        ...
    else:
        ...
def _check_
3). The VM status needs to be considered as well. For example, if an instance is powered off, it no longer consumes compute node resources on the KVM platform (unlike IBM PowerVM), but in resource_
it is still taken into account when calculating resource usage.
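The idea in point 3) can be illustrated with a small sketch; the field names here are purely illustrative, not the actual resource tracker schema:

```python
# Hypothetical illustration of point 3): on KVM a stopped instance
# could be excluded when summing memory usage.
instances = [
    {"name": "vm1", "memory_mb": 512, "power_state": "running"},
    {"name": "vm2", "memory_mb": 512, "power_state": "shutdown"},
]

# What the tracker counts today: every instance, regardless of state.
used_all = sum(i["memory_mb"] for i in instances)

# What point 3) suggests on KVM: only powered-on instances.
used_running = sum(i["memory_mb"] for i in instances
                   if i["power_state"] == "running")
print(used_all, used_running)   # 1024 512
```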
summary: |
- The destination host check for live migration is not correct
+ Live migration should use the same memory over subscription logic as instance boot |
Changed in nova: | |
status: | In Progress → Triaged |
importance: | Undecided → Medium |
Changed in nova: | |
assignee: | Jake Liu (jake-liu) → nobody |
tags: |
added: live-migrate removed: live migration |
Changed in nova: | |
status: | Triaged → Confirmed |
importance: | Medium → High |
Changed in nova: | |
status: | Confirmed → In Progress |
tags: |
added: live-migration removed: live-migrate |
Changed in nova: | |
status: | Fix Committed → Fix Released |
There is no need to check memory for live migration: if the live migration fails, nova-compute will roll back the operation.
Also, some hypervisors, such as KVM and VMware, natively support resource overcommit, so we can remove the memory check and let nova-compute handle this case.
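The "let nova-compute handle it" pattern described above can be sketched minimally; all names here are illustrative stand-ins, not actual nova code:

```python
# Sketch of the fix's rationale: skip the conductor-side memory
# pre-check and rely on rollback if the hypervisor rejects the
# migration. Names are illustrative, not actual nova code.
class MigrationError(Exception):
    pass

def start_migration(instance_mb, dest_free_mb):
    # Stand-in for the hypervisor call; only fails when the target
    # genuinely cannot fit the guest.
    if instance_mb > dest_free_mb:
        raise MigrationError("not enough memory on destination")

def live_migrate(instance_mb, dest_free_mb, log):
    try:
        start_migration(instance_mb, dest_free_mb)
        log.append("migrated")
    except MigrationError:
        log.append("rolled back")  # compute undoes any partial state
        raise

log = []
try:
    live_migrate(512, 336, log)   # the failing case from the report
except MigrationError:
    pass
print(log)   # ['rolled back']
```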