xenapi: checking agent by default is confusing

Bug #1178223 reported by Bob Ball
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: John Garbutt

Bug Description

I've installed OpenStack (Version 2012.1.1-18) on top of a XenServer installation. Instances are created fine, but OpenStack takes a very long time to realize that the instance is up and running. In nova-compute.log I see the following messages:

2013-05-09 05:19:47 DEBUG nova.virt.xenapi_conn [-] Got exception: ['XENAPI_PLUGIN_FAILURE', 'version', 'PluginError', 'TIMEOUT: No response from agent within 30 seconds.'] from (pid=3047) _unwrap_plugin_exceptions /usr/lib/python2.7/dist-packages/nova/virt/xenapi_conn.py:612
2013-05-09 05:19:47 ERROR nova.virt.xenapi.vmops [req-80868dbf-a0ec-4e5c-ba1c-40073407a69e e2d41ace30d1492ebb24edfbf30b9089 136c4d2c8f3e4c74a0a4a4ef0e45a901] TIMEOUT: The call to version timed out. VM id=a1978a21-7598-4d5e-984b-a9ca858f7237; args={'path': '', 'dom_id': '4', 'id': '957d7359-1632-4b38-bf19-8bfd0d45aca5', 'host_uuid': 'c38359bb-7a82-ac2b-0ee6-4a6dd68c5285'}
2013-05-09 05:19:47 ERROR nova.virt.xenapi.vmops [req-80868dbf-a0ec-4e5c-ba1c-40073407a69e e2d41ace30d1492ebb24edfbf30b9089 136c4d2c8f3e4c74a0a4a4ef0e45a901] Failed to query agent version: {'message': 'TIMEOUT: No response from agent within 30 seconds.', 'returncode': 'timeout'}

These three messages repeat 5 times, after which I get the following:

2013-05-09 05:24:20 DEBUG nova.compute.manager [req-80868dbf-a0ec-4e5c-ba1c-40073407a69e e2d41ace30d1492ebb24edfbf30b9089 136c4d2c8f3e4c74a0a4a4ef0e45a901] [instance: a1978a21-7598-4d5e-984b-a9ca858f7237] Checking state from (pid=3047) _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:272
2013-05-09 05:24:20 INFO nova.virt.xenapi.vm_utils [req-80868dbf-a0ec-4e5c-ba1c-40073407a69e e2d41ace30d1492ebb24edfbf30b9089 136c4d2c8f3e4c74a0a4a4ef0e45a901] (VM_UTILS) xenserver vm state -> |Running|
2013-05-09 05:24:20 INFO nova.virt.xenapi.vm_utils [req-80868dbf-a0ec-4e5c-ba1c-40073407a69e e2d41ace30d1492ebb24edfbf30b9089 136c4d2c8f3e4c74a0a4a4ef0e45a901] (VM_UTILS) xenapi power_state -> |1|

And then the dashboard properly shows the instance as running. But the instance was up and running long before that. The instance takes about 30 seconds to reach a command prompt, but OpenStack waits about 4 minutes before marking it as 'Running'.

Tags: xenserver
Bob Ball (bob-ball)
Changed in nova:
status: New → Confirmed
tags: added: xenserver
John Garbutt (johngarbutt) wrote :

It seems friendlier to disable the agent by default.

Changed in nova:
importance: Undecided → Medium
assignee: nobody → John Garbutt (johngarbutt)
Bob Ball (bob-ball) wrote :

This can be fixed on a per-installation basis by setting "xenapi_disable_agent = True" in /etc/nova/nova.conf.

This bug tracks changing that default to True, so that the agent check is disabled unless explicitly enabled.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/28665

Changed in nova:
status: Confirmed → In Progress
John Garbutt (johngarbutt) wrote : Re: Failed to query agent version

Taking a different approach: https://review.openstack.org/#/c/28676/

summary: - Failed to query agent version
+ xenapi: checking agent by default is confusing
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/28676
Committed: http://github.com/openstack/nova/commit/0357b01c12eb6b84b5038bbf465fd3b9d4921a29
Submitter: Jenkins
Branch: master

commit 0357b01c12eb6b84b5038bbf465fd3b9d4921a29
Author: John Garbutt <email address hidden>
Date: Thu May 9 15:44:38 2013 +0100

    xenapi: make the xenapi agent optional per image

    This adds the ability to decide, per image, if xenapi should use
    the agent for servers created from that image.
    This opens up the path to using config drive or the metadata
    service with cloud-init to perform tasks like file injection.

    It uses the image properties that are copied into system metadata
    to detect if "xenapi_agent_present"="true" on the image the server
    was created from.
    If the tag is not present, it defaults to the value
    of the new conf setting "xenapi_agent_present_default".

    Because the above setting defaults to False, it means that
    the xenapi driver no longer waits for the agent by default.

    DocImpact
    fixes bug 1178223
    part of blueprint xenapi-guest-agent-cloud-init-interop
    Change-Id: Ie51a9f54e5b2e85fe4ebebb0aff975db296ba996
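
The decision described in the commit message can be sketched roughly as follows. This is an illustrative sketch only, not the code from the merged change: the function name `should_use_agent`, the `image_` prefix on the system metadata key, and the case-insensitive comparison against "true" are all assumptions made for the example.

```python
# Hypothetical sketch of the per-image agent decision; not the actual nova code.

# Assumption: image properties copied into system metadata carry an "image_" prefix.
AGENT_PRESENT_KEY = "image_xenapi_agent_present"


def should_use_agent(system_metadata, agent_present_default=False):
    """Return True if the xenapi driver should wait for the guest agent.

    system_metadata: dict of the server's system metadata.
    agent_present_default: stands in for the "xenapi_agent_present_default"
        conf setting, which the commit message says defaults to False.
    """
    value = system_metadata.get(AGENT_PRESENT_KEY)
    if value is None:
        # Tag not present on the image: fall back to the configured default.
        return agent_present_default
    # Assumption: any casing of "true" enables the agent, anything else disables it.
    return str(value).strip().lower() == "true"


if __name__ == "__main__":
    print(should_use_agent({}))
    print(should_use_agent({AGENT_PRESENT_KEY: "true"}))
```

With the default left at False, a server created from an untagged image skips the agent version check, which is the behaviour change this bug asked for.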

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → havana-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: havana-1 → 2013.2