Nova compute - misleading warning message

Bug #1821536 reported by Helen Walsh on 2019-03-24
This bug affects 3 people
Affects: OpenStack Compute (nova)
Importance: Undecided
Assigned to: Brin Zhang

Bug Description

This is a minor issue, but it was logged by our QA team.
It occurs during a live migration (LM) operation.
It is probably too late for the Stein release, since the fix is only a change to a warning log message.

07/22/2018 10:18:55 AM

We are moving an instance from hostB --> hostA.
The LM operation succeeds, but the log below, taken from node hostA,
gives the wrong information:

Jul 12 15:44:52 dldv0031 nova-compute[20419]: WARNING nova.compute.resource_tracker [None req-7d104b01-064f-4e8d-b96d-7c27db48fded admin admin] Instance d459e746-ac99-448b-9e44-890fbfdcb6f0 has been moved to another host hostB(hostB). There are allocations remaining against the source host that might need to be removed: {u'resources': {u'VCPU': 1, u'MEMORY_MB': 512, u'DISK_GB': 1}}.

When the instance is moved from hostB --> hostA, why does nova-compute log
"Instance d459e746-ac99-448b-9e44-890fbfdcb6f0 has been moved to another host hostB(hostB)."?

I did a little investigating of _remove_deleted_instances_allocations in ./nova/compute/resource_tracker.py.

1. It is logged as a WARNING rather than as an INFO, which is what you would expect for a status update, and it is not logged at the end of the LM operation (it is not a final status).

2. It is part of the _remove_deleted_instances_allocations method, which handles corner cases in move, local delete, unshelve and rebuild operations. These are cases where allocations should be deleted because things did not follow the normal flow of events, in which the scheduler always creates allocations for an instance.

3. The corner case where the WARNING gets logged is described as:

# The instance has been moved to another host either via a
# migration, evacuation or unshelve in between the time when we
# ran InstanceList.get_by_host_and_node(), added those
# instances to RT.tracked_instances and the above
# Instance.get_by_uuid() call. We SHOULD attempt to remove any
# allocations that reference this compute host if the VM is in
# a stable terminal state (i.e. it isn't in a state of waiting
# for resize to confirm/revert), however if the destination
# host is an Ocata compute host, it will delete the allocation
# that contains this source compute host information anyway and
# recreate an allocation that only refers to itself. So we
# don't need to do anything in that case. Just log the
# situation here for information but don't attempt to delete or
# change the allocation.
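
To make the corner case above easier to follow, here is a minimal, self-contained sketch of the kind of check being described. It is not nova code: FakeInstance, classify_allocations and the plain dicts standing in for the database and for placement are all hypothetical names chosen for illustration, and the timeline is only one reading of the comment.

```python
from dataclasses import dataclass


@dataclass
class FakeInstance:
    """Hypothetical stand-in for an instance record (not nova's Instance)."""
    uuid: str
    host: str


def classify_allocations(this_host, tracked_uuids, allocations, lookup):
    """Classify every allocation that is held against this_host.

    tracked_uuids: snapshot of the instances this host believes it manages.
    allocations:   {instance_uuid: resources} still reported for this_host.
    lookup:        callable returning the current FakeInstance for a uuid.
    """
    results = {}
    for uuid in allocations:
        if uuid in tracked_uuids:
            results[uuid] = "tracked"          # normal case, nothing to do
            continue
        instance = lookup(uuid)                # fresh read of the record
        if instance.host != this_host:
            # Assumed equivalent of the branch that logs the warning:
            # report the situation, do not touch the allocation.
            results[uuid] = "moved-elsewhere"
        else:
            results[uuid] = "untracked-here"
    return results


# Simulated timeline on hostB (all values are illustrative).
db = {"vm-1": FakeInstance("vm-1", "hostB")}
allocations_on_host_b = {"vm-1": {"VCPU": 1, "MEMORY_MB": 512, "DISK_GB": 1}}

db["vm-1"].host = "hostA"                      # the migration flips the host
tracked_on_host_b = {u for u, i in db.items() if i.host == "hostB"}

outcome = classify_allocations("hostB", tracked_on_host_b,
                               allocations_on_host_b, db.__getitem__)
print(outcome)  # {'vm-1': 'moved-elsewhere'}
```

In this sketch the allocation against hostB outlives the host change, so the check running on hostB reports the instance as moved elsewhere without deleting anything, which matches the "just log the situation" behaviour the comment describes.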

The code condition is:

if instance.host != cn.host:
    LOG.warning("Instance %s has been moved to another host "
                "%s(%s). There are allocations remaining against "
                "the source host that might need to be removed: "
                "%s.",
                instance_uuid, instance.host, instance.node, alloc)

My conclusion is that the warning message could be reworded, perhaps to something like:

    LOG.warning("Instance %s has been temporarily moved to another host "
                "%s(%s). There are allocations remaining against "
                "the source host that might need to be removed: "
                "%s.",
                instance_uuid, instance.host, instance.node, alloc)
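
To see what an operator would actually read, the following sketch renders both message templates with Python's standard logging module, using the values from the log line quoted earlier in this report. The logger name and the ListHandler class are assumptions made for this example; only the two format strings come from the report.

```python
import logging

# Current template, copied from the condition quoted in this report.
CURRENT = ("Instance %s has been moved to another host "
           "%s(%s). There are allocations remaining against "
           "the source host that might need to be removed: "
           "%s.")
# Proposed template: identical except for the added word "temporarily".
PROPOSED = CURRENT.replace("has been moved", "has been temporarily moved", 1)


class ListHandler(logging.Handler):
    """Hypothetical handler that collects rendered messages in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        # getMessage() applies the lazy %-style arguments to the template.
        self.messages.append(record.getMessage())


log = logging.getLogger("sketch.resource_tracker")  # illustrative name
log.propagate = False
log.setLevel(logging.WARNING)
handler = ListHandler()
log.addHandler(handler)

# Values taken from the log line in the bug description.
args = ("d459e746-ac99-448b-9e44-890fbfdcb6f0", "hostB", "hostB",
        {"resources": {"VCPU": 1, "MEMORY_MB": 512, "DISK_GB": 1}})
log.warning(CURRENT, *args)
log.warning(PROPOSED, *args)

for message in handler.messages:
    print(message)
```

Both calls log at WARNING level with the same arguments, so the proposal changes the wording only. It does not address the first point above about WARNING versus INFO.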

Fix proposed to branch: master
Review: https://review.opendev.org/657916

Changed in nova:
assignee: nobody → Brin Zhang (zhangbailin)
status: New → In Progress
Brin Zhang (zhangbailin) wrote :

I am sorry, please ignore the comment above on ^%^.
This log is not emitted at the end of the LM operation (it is not a final status), so I think adding "temporarily" is suitable.
