Misleading log messages "Unexpected attempt to pop events during shutdown" on source host during live migration

Bug #1756994 reported by Matt Riedemann
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Low
Matt Riedemann
Pike
Fix Committed
Low
s10
Queens
Fix Committed
Low
s10

Bug Description

As seen here:

http://logs.openstack.org/71/551371/6/check/legacy-tempest-dsvm-multinode-live-migration/4d466b2/logs/subnode-2/screen-n-cpu.txt.gz#_Mar_19_14_25_07_798071

Mar 19 14:25:07.797329 ubuntu-xenial-rax-dfw-0003055554 nova-compute[29059]: DEBUG nova.compute.manager [None req-24e1d4e3-6f9c-4438-8059-5f7cf69566f0 service nova] [instance: 6347c3ea-894c-4cc8-a3b4-ea9bfe49033e] Received event network-vif-plugged-cd0afcf3-c654-488e-b578-6d3c45f3a5a5 {{(pid=29059) external_instance_event /opt/stack/new/nova/nova/compute/manager.py:7702}}
Mar 19 14:25:07.797762 ubuntu-xenial-rax-dfw-0003055554 nova-compute[29059]: DEBUG oslo_concurrency.lockutils [None req-24e1d4e3-6f9c-4438-8059-5f7cf69566f0 service nova] Lock "6347c3ea-894c-4cc8-a3b4-ea9bfe49033e-events" acquired by "nova.compute.manager._pop_event" :: waited 0.000s {{(pid=29059) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:273}}
Mar 19 14:25:07.798071 ubuntu-xenial-rax-dfw-0003055554 nova-compute[29059]: DEBUG nova.compute.manager [None req-24e1d4e3-6f9c-4438-8059-5f7cf69566f0 service nova] [instance: 6347c3ea-894c-4cc8-a3b4-ea9bfe49033e] Unexpected attempt to pop events during shutdown {{(pid=29059) _pop_event /opt/stack/new/nova/nova/compute/manager.py:347}}
Mar 19 14:25:07.798448 ubuntu-xenial-rax-dfw-0003055554 nova-compute[29059]: DEBUG oslo_concurrency.lockutils [None req-24e1d4e3-6f9c-4438-8059-5f7cf69566f0 service nova] Lock "6347c3ea-894c-4cc8-a3b4-ea9bfe49033e-events" released by "nova.compute.manager._pop_event" :: held 0.001s {{(pid=29059) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:285}}
Mar 19 14:25:07.798780 ubuntu-xenial-rax-dfw-0003055554 nova-compute[29059]: DEBUG nova.compute.manager [None req-24e1d4e3-6f9c-4438-8059-5f7cf69566f0 service nova] [instance: 6347c3ea-894c-4cc8-a3b4-ea9bfe49033e] No waiting events found dispatching network-vif-plugged-cd0afcf3-c654-488e-b578-6d3c45f3a5a5 {{(pid=29059) pop_instance_event /opt/stack/new/nova/nova/compute/manager.py:363}}
Mar 19 14:25:07.799222 ubuntu-xenial-rax-dfw-0003055554 nova-compute[29059]: WARNING nova.compute.manager [None req-24e1d4e3-6f9c-4438-8059-5f7cf69566f0 service nova] [instance: 6347c3ea-894c-4cc8-a3b4-ea9bfe49033e] Received unexpected event network-vif-plugged-cd0afcf3-c654-488e-b578-6d3c45f3a5a5 for instance with vm_state active and task_state migrating.

This is when the destination host has plugged VIFs during pre_live_migration, and then because the external server events are routed to the host that the instance is currently on (source host during live migration), the events are routed there and nothing is waiting on them. The message is misleading since the compute service isn't shutting down, it's just that nothing is waiting on the event, which is OK.

That comes from this code:

https://github.com/openstack/nova/blob/43c72f02f277d6f1083e5e9a4009d3d03d965368/nova/compute/manager.py#L346

That should be comparing self._events to None which indicates shutdown:

https://github.com/openstack/nova/blob/43c72f02f277d6f1083e5e9a4009d3d03d965368/nova/compute/manager.py#L398

https://github.com/openstack/nova/blob/43c72f02f277d6f1083e5e9a4009d3d03d965368/nova/compute/manager.py#L1166

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/554380

Changed in nova:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/554380
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=bbb668bbec2e2fa6b7f64b9af9504b973016058c
Submitter: Zuul
Branch: master

commit bbb668bbec2e2fa6b7f64b9af9504b973016058c
Author: Matt Riedemann <email address hidden>
Date: Mon Mar 19 19:24:32 2018 -0400

    Fix message for unexpected external event

    During live migration, we get vif plugged events
    that we're not waiting for, and if InstanceEvents._events
    is empty, pop_instance_event() will log a message about
    shutting down, but the compute service isn't actually
    shutting down. When it is, ComputeManager.cleanup_host()
    calls cancel_all_events() which sets _events to None, so
    we should only mention shutting down when _events is None.

    Change-Id: Ie868273c4b9629f9b35645df5785304ebba5d64c
    Closes-Bug: #1756994

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.0.0.0b1

This issue was fixed in the openstack/nova 18.0.0.0b1 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/589503

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/589505

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.openstack.org/589505
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=91d47b11e939bccff469b1070e072cc58b61b5df
Submitter: Zuul
Branch: stable/queens

commit 91d47b11e939bccff469b1070e072cc58b61b5df
Author: Matt Riedemann <email address hidden>
Date: Mon Mar 19 19:24:32 2018 -0400

    Fix message for unexpected external event

    During live migration, we get vif plugged events
    that we're not waiting for, and if InstanceEvents._events
    is empty, pop_instance_event() will log a message about
    shutting down, but the compute service isn't actually
    shutting down. When it is, ComputeManager.cleanup_host()
    calls cancel_all_events() which sets _events to None, so
    we should only mention shutting down when _events is None.

    Change-Id: Ie868273c4b9629f9b35645df5785304ebba5d64c
    Closes-Bug: #1756994
    (cherry picked from commit bbb668bbec2e2fa6b7f64b9af9504b973016058c)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.6

This issue was fixed in the openstack/nova 17.0.6 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/589503
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=05b945e61ab2739b39869f571379e1c95a017906
Submitter: Zuul
Branch: stable/pike

commit 05b945e61ab2739b39869f571379e1c95a017906
Author: Matt Riedemann <email address hidden>
Date: Mon Mar 19 19:24:32 2018 -0400

    Fix message for unexpected external event

    During live migration, we get vif plugged events
    that we're not waiting for, and if InstanceEvents._events
    is empty, pop_instance_event() will log a message about
    shutting down, but the compute service isn't actually
    shutting down. When it is, ComputeManager.cleanup_host()
    calls cancel_all_events() which sets _events to None, so
    we should only mention shutting down when _events is None.

    Change-Id: Ie868273c4b9629f9b35645df5785304ebba5d64c
    Closes-Bug: #1756994
    (cherry picked from commit bbb668bbec2e2fa6b7f64b9af9504b973016058c)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.1.6

This issue was fixed in the openstack/nova 16.1.6 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.