xenapi: windows agent unreliable due to reboots

Bug #1370999 reported by John Garbutt
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: John Garbutt

Bug Description

The Windows nova-agent can now trigger a guest reboot during resetnetwork, so that the hostname is updated correctly.

In addition, the guest has always rebooted during the first stages of polling for the agent version. When that happens, we currently wait for the call to time out instead of detecting the reboot.

Either way, we need to take more care to detect reboots while talking to the agent.
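
To illustrate what detecting a reboot while talking to the agent could look like, here is a minimal sketch. It is not the nova implementation: `wait_for_agent_reply`, `RebootDetectedError`, and the `get_dom_id` / `read_reply` callables are hypothetical names, standing in for whatever looks up the guest's current dom id and reads the agent's reply. The idea is only that the polling loop compares the current dom id with the one it started with and gives up early if they differ, instead of waiting out the full timeout.

```python
import time


class RebootDetectedError(Exception):
    """Raised when the guest's dom id changes while we wait for a reply."""


def wait_for_agent_reply(get_dom_id, read_reply, dom_id, timeout=30.0,
                         interval=0.5, sleep=time.sleep,
                         clock=time.monotonic):
    """Poll for an agent reply, bailing out early if the guest reboots.

    get_dom_id and read_reply are hypothetical callables supplied by the
    caller; read_reply(dom_id) returns None until a reply is available.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        reply = read_reply(dom_id)
        if reply is not None:
            return reply
        # A changed dom id means the guest rebooted, so the reply will
        # never arrive under the old dom id; stop polling immediately.
        if get_dom_id() != dom_id:
            raise RebootDetectedError(
                "dom id changed from %r while waiting for agent" % (dom_id,))
        sleep(interval)
    raise TimeoutError("no agent reply within %s seconds" % timeout)
```

A caller would catch `RebootDetectedError` separately from `TimeoutError`, since a reboot is a reason to retry against the new dom id rather than a failure.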

Tags: xenserver
Changed in nova:
importance: Undecided → Medium
status: New → Triaged
Changed in nova:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/122100
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d65ca49990f1241a5a8afdc3ad0e1a99c878275a
Submitter: Jenkins
Branch: master

commit d65ca49990f1241a5a8afdc3ad0e1a99c878275a
Author: John Garbutt <email address hidden>
Date: Mon Jun 30 14:09:24 2014 +0100

    xenapi: deal with reboots while talking to agent

    The latest version of the agent has started changing the hostname
    through the official windows APIs, when resetnetwork is called. This
    means a reboot might happen after resetnetwork.

    A reboot happens when polling for the version. Until now we wait for the
    call timeout, before fetching the new dom id. This change ensures that
    if we spot a reboot, the plugin exits early rather than keeping polling
    the wrong dom id.

    Turns out it's best to wait for the dom_id to change, before trying
    to poll the agent again. Once the dom_id in xenapi has been updated,
    the xenstore keys are always in place.

    Trying too early leads to lots of reboot detections because we are
    retrying with the old dom_id. XenServer continues to return the old
    dom_id for a little while after the reboot.

    Closes-Bug: #1370999

    Change-Id: Id0bf5b64f2b271d162db5bbce50167ab1f665c87
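
The "wait for the dom_id to change before polling again" step described in the commit message could be sketched roughly as follows. This is an illustration rather than the committed code: `wait_for_new_dom_id` and the `get_dom_id` callable are hypothetical, and treating `None` or `-1` as "no running domain yet" is an assumption about how the lookup reports a guest that is mid-reboot.

```python
import time


def wait_for_new_dom_id(get_dom_id, old_dom_id, timeout=60.0, interval=1.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Block until the guest reports a dom id different from old_dom_id.

    get_dom_id is a hypothetical callable returning the guest's current
    dom id. Retrying the agent call before this returns would only hit
    the stale dom id again and look like yet another reboot.
    """
    deadline = clock() + timeout
    while True:
        current = get_dom_id()
        # Assumption: None or -1 means the domain is not running yet.
        if current not in (None, -1) and current != old_dom_id:
            return current
        if clock() >= deadline:
            raise TimeoutError(
                "dom id still %r after %s seconds" % (current, timeout))
        sleep(interval)
```

After a reboot is detected, the caller would invoke this with the old dom id and only then retry the agent call using the returned value.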

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → kilo-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: kilo-1 → 2015.1.0