Ports not connected for recovered instances

Bug #1799163 reported by Lucian Petrut
This bug affects 1 person
Affects: compute-hyperv
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

After a while, the Failover Cluster stops retrying to bring failed instances back up. For example, if the CSV (Cluster Shared Volume) is down for more than a few minutes, the cluster groups are set to the "Failed" state and the VMs are not registered on any Hyper-V node.

The issue is that we're only handling cluster group owner changes (moved instances). If the admin fixes the underlying problem and manually brings the cluster groups back up, the instances are recreated, but we don't handle that case, so their ports are not reconnected.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to compute-hyperv (master)

Reviewed: https://review.openstack.org/611637
Committed: https://git.openstack.org/cgit/openstack/compute-hyperv/commit/?id=f6c6dbdf9da59e6cfff9e64590765c9066857530
Submitter: Zuul
Branch: master

commit f6c6dbdf9da59e6cfff9e64590765c9066857530
Author: Lucian Petrut <email address hidden>
Date: Thu Oct 18 17:23:27 2018 +0300

    Refactor instance state change event handler

    We ended up having too much logic in the event handler. This change
    trims it down, making it easier to add callbacks.

    The actions taken when receiving events are now moved out of the
    event handler, being registered as callbacks.

    Related-Bug: #1799163
    Change-Id: Ie93aa0e184c5ff9daef4e857e212bb54e9473a78
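
For readers unfamiliar with the pattern, the refactoring described in this commit message (actions registered as callbacks instead of living in the event handler) might look roughly like the sketch below. This is not the compute-hyperv code: the class name, method names, and state strings are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical callback signature: (instance_name, new_state) -> None.
StateCallback = Callable[[str, str], None]


class InstanceEventHandler:
    """Illustrative handler that only dispatches; it holds no action logic."""

    def __init__(self) -> None:
        # Maps a state name (e.g. "started") to the callbacks registered for it.
        self._callbacks: Dict[str, List[StateCallback]] = {}

    def add_callback(self, state: str, callback: StateCallback) -> None:
        """Register an action to run when an instance enters `state`."""
        self._callbacks.setdefault(state, []).append(callback)

    def emit(self, instance_name: str, state: str) -> None:
        """Invoke, in registration order, every callback for `state`."""
        for callback in self._callbacks.get(state, []):
            callback(instance_name, state)
```

With a layout like this, supporting a new reaction to a state change means registering one more callback rather than editing the handler itself.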

Changed in compute-hyperv:
status: New → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to compute-hyperv (master)

Reviewed: https://review.openstack.org/611638
Committed: https://git.openstack.org/cgit/openstack/compute-hyperv/commit/?id=d3360948f6dde1639200829a946a434f83a91f6e
Submitter: Zuul
Branch: master

commit d3360948f6dde1639200829a946a434f83a91f6e
Author: Lucian Petrut <email address hidden>
Date: Thu Oct 18 17:49:22 2018 +0300

    Retry plugging ports when clustered instances start

    After a while, the Failover Cluster will stop retrying when attempting
    to bring back up failed instances. For example, if the CSV is down
    more than a few minutes, the cluster groups will be set in "Failed"
    state, while the VMs won't be registered on any Hyper-V node.

    The issue is that we're only handling cluster group owner changes
    (moved instances). If the admin fixes the issue and manually brings
    the cluster groups back up, the instances are recreated but we aren't
    handling this, so ports won't get reconnected.

    This change will double check the ports when clustered instances
    start.

    Closes-Bug: #1799163

    Change-Id: I5caa65d7b7922dc9632b18acedaf1aedeec3fcc3
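
As an illustration of the fix described above, a callback that re-checks ports whenever a clustered instance starts could be sketched as follows. Everything here is an assumption made for the example: the class, the three injected helpers (`is_clustered`, `get_ports`, `plug_port`), and the premise that plugging an already connected port is harmless. The actual change may be structured differently.

```python
from typing import Callable, Iterable, List


class ClusterPortReconnector:
    """Illustrative only: replugs ports when a clustered instance starts."""

    def __init__(
        self,
        is_clustered: Callable[[str], bool],
        get_ports: Callable[[str], Iterable[str]],
        plug_port: Callable[[str, str], None],
    ) -> None:
        # All three helpers are hypothetical stand-ins for driver internals.
        self._is_clustered = is_clustered
        self._get_ports = get_ports
        self._plug_port = plug_port

    def on_instance_started(self, instance_name: str) -> List[str]:
        """Plug every port of a clustered instance; return the ports handled."""
        if not self._is_clustered(instance_name):
            # Non-clustered instances follow the regular spawn/plug path.
            return []
        plugged = []
        for port_id in self._get_ports(instance_name):
            # Assumed to be idempotent, so already connected ports are safe.
            self._plug_port(instance_name, port_id)
            plugged.append(port_id)
        return plugged
```

Reacting to the "started" state rather than only to owner changes is what covers the case where an admin manually brings a failed cluster group back up on the same node.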

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/compute-hyperv 9.0.0.0rc1

This issue was fixed in the openstack/compute-hyperv 9.0.0.0rc1 release candidate.
