rebooted instances are shutdown by libvirt lifecycle event handling

Bug #1443186 reported by Matt Riedemann
This bug affects 4 people
Affects                   Status        Importance  Assigned to     Milestone
OpenStack Compute (nova)  Fix Released  Medium      Matt Riedemann
  Juno                    Fix Released  Undecided   Unassigned
  Kilo                    Fix Released  Undecided   Unassigned

Bug Description

This is a continuation of bug 1293480 (which spawned bug 1433049). Those were reported against Xen domains with the libvirt driver, but we have a recreate with CONF.libvirt.virt_type=kvm; see the attached logs and reference the instance with uuid 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78.

In this case, we're running a stress test that soft reboots 30 active instances at once. Because of a delay in the libvirt lifecycle event handling, they are all shut down after the reboot operation completes, and the instances go from ACTIVE to SHUTDOWN.

This was reported to me against Icehouse code, but the recreate is against Juno code with the following patch applied for better logging:

https://review.openstack.org/#/c/169782/

Snippets from the log:

2015-04-10 21:02:38.234 11195 AUDIT nova.compute.manager [req-b24d4f8d-4a10-44c8-81d7-f79f27e3a3e7 None] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] Rebooting instance

2015-04-10 21:03:47.703 11195 DEBUG nova.compute.manager [req-8219e6cf-dce8-44e7-a5c1-bf1879e155b2 None] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] Received event network-vif-unplugged-0b2c7633-a5bc-4150-86b2-c8ba58ffa785 external_instance_event /usr/lib/python2.6/site-packages/nova/compute/manager.py:6285

2015-04-10 21:03:49.299 11195 INFO nova.virt.libvirt.driver [req-b24d4f8d-4a10-44c8-81d7-f79f27e3a3e7 None] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] Instance shutdown successfully.

2015-04-10 21:03:53.251 11195 DEBUG nova.compute.manager [req-521a6bdb-172f-4c0c-9bef-855087d7dff0 None] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] Received event network-vif-plugged-0b2c7633-a5bc-4150-86b2-c8ba58ffa785 external_instance_event /usr/lib/python2.6/site-packages/nova/compute/manager.py:6285

2015-04-10 21:03:53.259 11195 INFO nova.virt.libvirt.driver [-] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] Instance running successfully.

2015-04-10 21:03:53.261 11195 INFO nova.virt.libvirt.driver [req-b24d4f8d-4a10-44c8-81d7-f79f27e3a3e7 None] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] Instance soft rebooted successfully.

**
At this point we have successfully soft rebooted the instance
**

Now we get a lifecycle event from libvirt that the instance is stopped. Since we're no longer running a task, we assume the hypervisor is correct and call the stop API:

2015-04-10 21:04:01.133 11195 DEBUG nova.virt.driver [-] Emitting event <LifecycleEvent: 1428699829.13, 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78 => Stopped> emit_event /usr/lib/python2.6/site-packages/nova/virt/driver.py:1298
2015-04-10 21:04:01.134 11195 INFO nova.compute.manager [-] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] VM Stopped (Lifecycle Event)
2015-04-10 21:04:01.245 11195 INFO nova.compute.manager [-] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] Synchronizing instance power state after lifecycle event "Stopped"; current vm_state: active, current task_state: None, current DB power_state: 1, VM power_state: 4
2015-04-10 21:04:01.334 11195 INFO nova.compute.manager [-] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] During _sync_instance_power_state the DB power_state (1) does not match the vm_power_state from the hypervisor (4). Updating power_state in the DB to match the hypervisor.
2015-04-10 21:04:01.463 11195 WARNING nova.compute.manager [-] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] Instance shutdown by itself. Calling the stop API. Current vm_state: active, current task_state: None, original DB power_state: 1, current VM power_state: 4

**
Now we get a lifecycle event from libvirt that the instance is started, but since the instance already has a task_state of 'powering-off' because of the previous stop API call from _sync_instance_power_state, we ignore it:
**

2015-04-10 21:04:02.085 11195 DEBUG nova.virt.driver [-] Emitting event <LifecycleEvent: 1428699831.45, 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78 => Started> emit_event /usr/lib/python2.6/site-packages/nova/virt/driver.py:1298
2015-04-10 21:04:02.086 11195 INFO nova.compute.manager [-] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] VM Started (Lifecycle Event)
2015-04-10 21:04:02.190 11195 INFO nova.compute.manager [-] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] Synchronizing instance power state after lifecycle event "Started"; current vm_state: active, current task_state: powering-off, current DB power_state: 4, VM power_state: 1
2015-04-10 21:04:02.414 11195 INFO nova.compute.manager [-] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] During sync_power_state the instance has a pending task (powering-off). Skip.
2015-04-10 21:04:05.152 11195 DEBUG nova.compute.manager [req-682ce732-8a99-47d7-82c2-49ebdaea5332 None] [instance: 9ad8f6c5-a5dc-4820-9ea5-fa081e74ec78] Stopping instance; current vm_state: active, current task_state: powering-off, current DB power_state: 4, current VM power_state: 1 do_stop_instance /usr/lib/python2.6/site-packages/nova/compute/manager.py:2631

**
From here on out we shut down the instance, which is why it's left in the SHUTDOWN state.
**
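
For reference, the decision being made here boils down to: no task in flight, the DB says running, the hypervisor says shutdown, therefore trust the hypervisor and call the stop API. A condensed, hypothetical sketch of that logic (the real code is _sync_instance_power_state in nova/compute/manager.py; the names and the simplified dict-based instance below are illustrative assumptions, not the actual nova code):

    # Power state constants mirroring nova.compute.power_state
    # (the logs above show DB power_state: 1, VM power_state: 4).
    RUNNING = 1
    SHUTDOWN = 4

    def sync_instance_power_state(instance, vm_power_state, stop_api):
        # A task in flight (e.g. 'powering-off') means we skip the sync;
        # this is why the later "Started" event is ignored.
        if instance['task_state'] is not None:
            return
        # No task running, so the hypervisor is assumed authoritative:
        # record its power state and call the stop API, producing the
        # "Instance shutdown by itself" warning seen above.
        if instance['vm_state'] == 'active' and vm_power_state == SHUTDOWN:
            instance['power_state'] = vm_power_state
            stop_api(instance)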

--

There was a change made in Kilo (https://review.openstack.org/#/c/112946/) that delays sending the stopped lifecycle event, waiting for a started lifecycle event that cancels the stopped event if it arrives within a 15-second window. That change was only made for Xen domains, but as we're seeing, this isn't just Xen; it also hits KVM, and it is therefore probably an issue for any libvirt domain in nova.

We made a change to include kvm in the delay logic:

        # Apply the 15-second delay to kvm as well as xen so a quick
        # STARTED event can cancel a pending STOPPED event during reboot:
        if CONF.libvirt.virt_type in ("xen", "kvm"):
            self._lifecycle_delay = 15
        else:
            self._lifecycle_delay = 0
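
The delay itself works by holding a STOPPED event for _lifecycle_delay seconds and discarding it if a STARTED event for the same instance arrives within the window. A minimal standalone sketch of that cancel-on-start pattern (illustrative only; the real implementation lives in the libvirt driver's event handling and is eventlet-based, and every name below is an assumption):

    import threading

    class LifecycleEventDelayer(object):
        # Delay STOPPED events so a quick STARTED event (as during a
        # soft reboot) can cancel them before they are emitted.

        def __init__(self, emit, delay=15):
            self._emit = emit      # callback that actually emits the event
            self._delay = delay    # seconds to hold a STOPPED event
            self._pending = {}     # instance uuid -> pending Timer

        def on_stopped(self, uuid, event):
            # Hold the STOPPED event; emit it only if nothing cancels it.
            timer = threading.Timer(self._delay, self._fire, (uuid, event))
            self._pending[uuid] = timer
            timer.start()

        def on_started(self, uuid, event):
            # A STARTED event inside the window means the "stop" was just
            # the transient phase of a reboot: cancel the pending STOPPED.
            pending = self._pending.pop(uuid, None)
            if pending is not None:
                pending.cancel()
            self._emit(event)

        def _fire(self, uuid, event):
            # The window expired with no STARTED event: the instance
            # really stopped, so emit the event now.
            self._pending.pop(uuid, None)
            self._emit(event)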

After that, only 1 out of 372 instances hit the problem, and there was a warning in the logs about the service-group _report_state looping call running late:

2015-04-12 17:13:31.330 30126 AUDIT nova.compute.manager [req-7d7976d8-6e66-4dd3-958f-f8ef2454dd8c None] [instance: 2189fcd6-8bed-4a1c-916a-1190b40ba568] Rebooting instance
2015-04-12 17:13:52.281 30126 AUDIT nova.compute.resource_tracker [-] Auditing locally available compute resources
2015-04-12 17:14:02.296 30126 AUDIT nova.compute.manager [req-9dfd29ff-9bb0-4e10-a16d-4c8f86744176 None] [instance: 4a659b53-75ad-4410-bd91-d85dafe0175f] Rebooting instance
2015-04-12 17:14:28.122 30126 AUDIT nova.compute.resource_tracker [-] Total physical ram (MB): 129026, total allocated virtual ram (MB): 12800
2015-04-12 17:14:28.124 30126 AUDIT nova.compute.resource_tracker [-] Free disk (GB): 4875
2015-04-12 17:14:28.125 30126 AUDIT nova.compute.resource_tracker [-] Total usable vcpus: 32, total allocated vcpus: 6
2015-04-12 17:14:28.126 30126 AUDIT nova.compute.resource_tracker [-] PCI stats: []
2015-04-12 17:14:38.036 30126 AUDIT nova.compute.manager [req-aa25e5f9-5083-46b8-bf29-cc16c2c44c75 None] [instance: 81688c40-3eeb-4d6c-85b3-b2b7dd66eb81] Rebooting instance
2015-04-12 17:14:59.883 30126 WARNING nova.openstack.common.loopingcall [-] task <bound method DbDriver._report_state of <nova.servicegroup.drivers.db.DbDriver object at 0x3401750>> run outlasted interval by 14.99 sec
2015-04-12 17:15:02.486 30126 AUDIT nova.compute.resource_tracker [-] Auditing locally available compute resources
2015-04-12 17:15:05.019 30126 WARNING nova.compute.manager [req-c17f5069-7540-42be-b7cf-718bf551df66 None] [instance: 81688c40-3eeb-4d6c-85b3-b2b7dd66eb81] Instance shutdown by itself. Calling the stop API. Current vm_state: active, current task_state: None, current DB power_state: 4, current VM power_state: 4
2015-04-12 17:15:20.602 30126 AUDIT nova.compute.manager [req-34a41e3b-3460-4170-b06e-6e506efc5d69 None] [instance: c9ba059c-e0a2-43e8-a7de-e3710565337f] Rebooting instance
2015-04-12 17:15:40.152 30126 AUDIT nova.compute.manager [req-2d790be7-cc9b-4256-859d-e0b521459f5d None] [instance: e996f312-4dd3-456a-9747-972ff674e9fe] Rebooting instance
2015-04-12 17:15:45.047 30126 AUDIT nova.compute.manager [req-b4ab6799-7d15-471b-b76f-cc4e0e35b359 None] [instance: ef46f5c4-daef-4b42-b5c9-1764cfead244] Rebooting instance

I'll post a change to remove the Xen-specific filtering in the lifecycle delay processing. I also already have a change posted to add a config option to disable the lifecycle event handling entirely, since it's racy and not tested at scale in the gate, so operators should be able to turn it off:

https://review.openstack.org/#/c/159275/
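
Once that change merges, operators who want to opt out entirely can set the new option (named handle_virt_lifecycle_events per the merged commit below) in nova.conf; it defaults to True, so this is an explicit opt-out:

    [DEFAULT]
    # Opt out of libvirt lifecycle event handling; power state sync then
    # relies solely on the _sync_power_states periodic task.
    handle_virt_lifecycle_events = False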

Revision history for this message
Matt Riedemann (mriedem) wrote :
tags: added: kilo-backport-potential
Changed in nova:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/172775

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/172775
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1d4d00ec1c6d59fd6df5be776c781c38a4de8e28
Submitter: Jenkins
Branch: master

commit 1d4d00ec1c6d59fd6df5be776c781c38a4de8e28
Author: Matt Riedemann <email address hidden>
Date: Sun Apr 12 15:59:52 2015 -0700

    Delay STOPPED lifecycle event for all domains, not just Xen

    Commit bd8329b34098436d18441a8129f3f20af53c2b91 added code to delay
    sending a STOPPED lifecycle event for Xen domains to try and ease a race
    during reboot where the STOPPED event is sent after the instance is
    running again after the reboot operation completed and the
    instance.task_state is None. This would trigger handling in the compute
    manager, which would assume the instance shouldn't be running and shut
    it down. Since the instance is being shut down, its task_state is
    'powering-off' when the lifecycle STARTED event comes in shortly after
    the STOPPED event, and it's ignored in _sync_instance_power_state.

    The change here is to remove the Xen restriction: since there is a
    recreate (on Icehouse and Juno) with virt_type=KVM, we can assume this
    is racy for any libvirt configuration during the reboot operation, and
    the delay should apply to all virt types. The detailed logs with the
    recreate are attached in the bug report.

    Closes-Bug: #1443186

    Change-Id: I3667786ae72f782b3664960e914c98ab94767970

Changed in nova:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/178381

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/178446

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/159275
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d09785b97a282e8538642f6f8bcdd8491197ed74
Submitter: Jenkins
Branch: master

commit d09785b97a282e8538642f6f8bcdd8491197ed74
Author: Matt Riedemann <email address hidden>
Date: Wed Feb 25 14:13:45 2015 -0800

    Add config option to disable handling virt lifecycle events

    Historically the _sync_power_states periodic task has had the potential
    for race conditions and several changes have been made to try and
    tighten up this code:

    cc5388bbe81aba635fb757e202d860aeed98f3e8
    aa1792eb4c1d10e9a192142ce7e20d37871d916a
    baabab45e0ae0e9e35872cae77eb04bdb5ee0545
    bd8329b34098436d18441a8129f3f20af53c2b91

    The handle_lifecycle_events method, which gets power state change events
    from the compute driver (currently only implemented by the libvirt
    driver) and calls _sync_instance_power_state - the same method that the
    _sync_power_states periodic task uses, except the periodic task at least
    locks while it's running - expands the scope for race problems in the
    compute manager, so cloud providers should be able to turn it off. It is
    also known to race with reboot, where rebooted instances are
    automatically shut down because a delayed lifecycle event reports the
    instance as stopped even though it's running.

    This is consistent with the view that Nova should manage its own
    and not rely on external events telling it what to do about state
    changes. For example, in _sync_instance_power_state, if the Nova
    database thinks an instance is stopped but the hypervisor says it's
    running, the compute manager issues a force-stop on the instance.

    Also, although not documented (at least from what I can find), Nova has
    historically held a stance that it does not support out-of-band
    discovery and management of instances, so allowing external events to
    change state somewhat contradicts that stance and should be at least a
    configurable deployment option.

    DocImpact: New config option "handle_virt_lifecycle_events" in the
               DEFAULT group of nova.conf. By default the value is True
               so there is no upgrade impact or change in functionality.

    Related-Bug: #1293480
    Partial-Bug: #1443186
    Partial-Bug: #1444630

    Change-Id: I26a1bc70939fb40dc38e9c5c43bf58ed1378bcc7

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/kilo)

Reviewed: https://review.openstack.org/178381
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=709f0bb5b1839a9b5c534bb93f25f6998c162190
Submitter: Jenkins
Branch: stable/kilo

commit 709f0bb5b1839a9b5c534bb93f25f6998c162190
Author: Matt Riedemann <email address hidden>
Date: Sun Apr 12 15:59:52 2015 -0700

    Delay STOPPED lifecycle event for all domains, not just Xen

    Commit bd8329b34098436d18441a8129f3f20af53c2b91 added code to delay
    sending a STOPPED lifecycle event for Xen domains to try and ease a race
    during reboot where the STOPPED event is sent after the instance is
    running again after the reboot operation completed and the
    instance.task_state is None. This would trigger handling in the compute
    manager, which would assume the instance shouldn't be running and shut
    it down. Since the instance is being shut down, its task_state is
    'powering-off' when the lifecycle STARTED event comes in shortly after
    the STOPPED event, and it's ignored in _sync_instance_power_state.

    The change here is to remove the Xen restriction: since there is a
    recreate (on Icehouse and Juno) with virt_type=KVM, we can assume this
    is racy for any libvirt configuration during the reboot operation, and
    the delay should apply to all virt types. The detailed logs with the
    recreate are attached in the bug report.

    Closes-Bug: #1443186

    Change-Id: I3667786ae72f782b3664960e914c98ab94767970
    (cherry picked from commit 1d4d00ec1c6d59fd6df5be776c781c38a4de8e28)

tags: added: in-stable-kilo
Thierry Carrez (ttx)
Changed in nova:
milestone: none → liberty-1
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/juno)

Reviewed: https://review.openstack.org/178446
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6208365d061c7abe58678222671a19ff63e24b24
Submitter: Jenkins
Branch: stable/juno

commit 6208365d061c7abe58678222671a19ff63e24b24
Author: Matt Riedemann <email address hidden>
Date: Sun Apr 12 15:59:52 2015 -0700

    Delay STOPPED lifecycle event for all domains, not just Xen

    Commit bd8329b34098436d18441a8129f3f20af53c2b91 added code to delay
    sending a STOPPED lifecycle event for Xen domains to try and ease a race
    during reboot where the STOPPED event is sent after the instance is
    running again after the reboot operation completed and the
    instance.task_state is None. This would trigger handling in the compute
    manager, which would assume the instance shouldn't be running and shut
    it down. Since the instance is being shut down, its task_state is
    'powering-off' when the lifecycle STARTED event comes in shortly after
    the STOPPED event, and it's ignored in _sync_instance_power_state.

    The change here is to remove the Xen restriction: since there is a
    recreate (on Icehouse and Juno) with virt_type=KVM, we can assume this
    is racy for any libvirt configuration during the reboot operation, and
    the delay should apply to all virt types. The detailed logs with the
    recreate are attached in the bug report.

    Conflicts:
     nova/tests/unit/virt/libvirt/test_host.py
     nova/virt/libvirt/host.py

    NOTE(mriedem): The conflicts are mainly due to a large refactor in kilo
    which pulled this code into the nova.virt.libvirt.host module. The
    tests were also moved in kilo.

    Closes-Bug: #1443186

    Change-Id: I3667786ae72f782b3664960e914c98ab94767970
    (cherry picked from commit 1d4d00ec1c6d59fd6df5be776c781c38a4de8e28)
    (cherry picked from commit c6575c6918e3ef01e4e8b828bf7bda6d80b43dc1)

tags: added: in-stable-juno
Revision history for this message
Yang Zhang (bjzyang) wrote :

Hi Matt, I have applied your fix of this bug to my Juno environment, it works well, thanks! : )

Thierry Carrez (ttx)
Changed in nova:
milestone: liberty-1 → 12.0.0