remove single-nova-consoleauth configuration mode

Bug #1781620 reported by Chris Gregan
This bug affects 3 people
Affects                                 Status         Importance   Assigned to   Milestone
OpenStack Nova Cloud Controller Charm   Fix Released   Medium       Unassigned
OpenStack Nova Compute Charm            Invalid        Undecided    Unassigned

Bug Description

Still trying to figure out what happened here.

Deploy failed because several hooks continued to run and the deployment timed out.

image-service-relation-changed
hanode-relation-changed
quantum-network-service-relation-joined
cluster-relation-joined

Nova-compute log is full of this:
2018-07-13 04:30:36.018 1080384 WARNING nova.conductor.api [req-943a2756-beaf-408b-be2f-be274ea0e53b - - - - -] Timed out waiting for nova-conductor. Is it running? Or did this service start before nova-conductor? Reattempting establishment of nova-conductor connection...: MessagingTimeout: Timed out waiting for a reply to message ID 1a964ff0904a443195e57f69c9eb9859
2018-07-13 04:31:36.032 1080384 WARNING nova.conductor.api [req-943a2756-beaf-408b-be2f-be274ea0e53b - - - - -] Timed out waiting for nova-conductor. Is it running? Or did this service start before nova-conductor? Reattempting establishment of nova-conductor connection...: MessagingTimeout: Timed out waiting for a reply to message ID f15e70c5419a4494adc45b1296fa3e51
2018-07-13 04:32:36.046 1080384 WARNING nova.conductor.api [req-943a2756-beaf-408b-be2f-be274ea0e53b - - - - -] Timed out waiting for nova-conductor. Is it running? Or did this service start before nova-conductor? Reattempting establishment of nova-conductor connection...: MessagingTimeout: Timed out waiting for a reply to message ID 6168251176a74eb8b95e787ae9474654
2018-07-13 04:33:36.059 1080384 WARNING nova.conductor.api [req-943a2756-beaf-408b-be2f-be274ea0e53b - - - - -] Timed out waiting for nova-conductor. Is it running? Or did this service start before nova-conductor? Reattempting establishment of nova-conductor connection...: MessagingTimeout: Timed out waiting for a reply to message ID 8f5112d5fbca4e2d99bcf291fe942053
2018-07-13 04:34:36.070 1080384 WARNING nova.conductor.api [req-943a2756-beaf-408b-be2f-be274ea0e53b - - - - -] Timed out waiting for nova-conductor. Is it running? Or did this service start before nova-conductor? Reattempting establishment of nova-conductor connection...: MessagingTimeout: Timed out waiting for a reply to message ID 4dbaaf55215c43b58bbf6bb20e3d088e
2018-07-13 04:35:36.084 1080384 WARNING nova.conductor.api [req-943a2756-beaf-408b-be2f-be274ea0e53b - - - - -] Timed out waiting for nova-conductor. Is it running? Or did this service start before nova-conductor? Reattempting establishment of nova-conductor connection...: MessagingTimeout: Timed out waiting for a reply to message ID 3904dd3119d446b087f90c183e1aa7d5
2018-07-13 04:36:36.097 1080384 WARNING nova.conductor.api [req-943a2756-beaf-408b-be2f-be274ea0e53b - - - - -] Timed out waiting for nova-conductor. Is it running? Or did this service start before nova-conductor? Reattempting establishment of nova-conductor connection...: MessagingTimeout: Timed out waiting for a reply to message ID 8233a1d510134b138d77f77988931296

Chris Gregan (cgregan) wrote :
James Page (james-page) wrote :

The nova-compute messages are just due to incomplete configuration, as the hooks have not completed execution.

James Page (james-page) wrote :

I think this deploy is using:

  nova-cloud-controller: single-nova-consoleauth=True

which is the default for this configuration. However, the OCF resource that manages the nova-consoleauth process fails because messaging is not yet configured, so the resource gets marked as failed.

This appears to be blocking the entire deployment whilst pacemaker resource configuration spins.

This is a race condition, but not one I think we should be fixing. Instead, let's move the FCE builds to use nova-cloud-controller with memcached (for releases up to Queens) with this option set to false, and drop the nova-consoleauth process from Rocky onwards, where it's no longer needed.
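
The release split proposed above can be sketched roughly as follows. This is only an illustration: the function name `consoleauth_plan`, the returned keys, and the `RELEASES` list are assumptions made for the example and are not taken from the charm.

```python
# Hypothetical sketch: names and structure are illustrative, not charm code.
# OpenStack release names in chronological order (subset around Queens/Rocky).
RELEASES = ["mitaka", "newton", "ocata", "pike", "queens", "rocky", "stein"]


def consoleauth_plan(release):
    """Return the proposed nova-consoleauth handling for a release name."""
    index = RELEASES.index(release.lower())  # raises ValueError if unknown
    if index <= RELEASES.index("queens"):
        # Up to Queens: keep nova-consoleauth, disable the single-instance
        # mode, and share tokens between units through memcached.
        return {
            "run_consoleauth": True,
            "single_nova_consoleauth": False,
            "use_memcached": True,
        }
    # Rocky onwards: the nova-consoleauth process is dropped entirely.
    return {"run_consoleauth": False}
```

Under these assumptions, `consoleauth_plan("queens")` keeps the daemon with memcached-backed token sharing, while `consoleauth_plan("rocky")` reports that the daemon is not run at all.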

Changed in charm-nova-compute:
status: New → Invalid
James Page (james-page)
Changed in charm-nova-cloud-controller:
status: New → Triaged
importance: Undecided → Medium
milestone: none → 19.04
summary: - Nova Fails to come up completely.
+ remove single-nova-consoleauth configuration mode
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-nova-cloud-controller (master)

Reviewed: https://review.openstack.org/641329
Committed: https://git.openstack.org/cgit/openstack/charm-nova-cloud-controller/commit/?id=b6e314077fa352ba58c346919a1e1cd4f6593226
Submitter: Zuul
Branch: master

commit b6e314077fa352ba58c346919a1e1cd4f6593226
Author: James Page <email address hidden>
Date: Wed Mar 6 10:56:50 2019 +0000

    Drop support for single-nova-consoleauth

    Remove support for single-nova-consoleauth operation; this option
    managed a single instance of the nova-consoleauth process across
    a clustered nova-cloud-controller application using the hacluster
    charm. This proves somewhat racy on deployment as the OCF resource
    deep checks the operation of nova-consoleauth, including connectivity
    to AMQP etc. If the clustering of the service occurs before
    other principal relations have been completed, the resource will
    fail to start and the hook execution will spin, never returning.

    HA deployments should always use memcached to share tokens between
    instances of the nova-consoleauth daemon; if the 'ha' relation is
    detected, then a memcache relation is required for charm
    operation.

    To support evaluation of the memcache relation completeness,
    the memcache-specific code in InstanceConsoleContext was split out
    into a new memcache-specific class, RemoteMemcacheContext.

    Existing pacemaker resources will be deleted on upgrade; units will
    move into a blocked state until a relation is added to memcached.

    The nova-consoleauth service is resumed on upgrade to ensure that
    instances run on all nova-cloud-controller units.

    Change-Id: I2ac91b2bd92269b761befeb7563ad01cc5431151
    Closes-Bug: 1781620
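
The rule described in the commit message (once the 'ha' relation is present, a memcache relation becomes mandatory and the unit is blocked without it) might look something like the sketch below. It is not the charm's actual implementation; `assess_status`, its argument, and the status messages are hypothetical names chosen for illustration.

```python
# Hypothetical sketch: function, argument and messages are illustrative only.
def assess_status(related_interfaces):
    """Return (state, message) for the set of relation names that are joined."""
    related = set(related_interfaces)
    if "ha" in related and "memcache" not in related:
        # HA deployments must share console tokens through memcached, so
        # the unit stays blocked until the memcache relation is added.
        return ("blocked", "memcache relation required for HA operation")
    return ("active", "unit is ready")
```

For example, `assess_status({"ha"})` yields a blocked state, and adding the memcache relation, as in `assess_status({"ha", "memcache"})`, clears it.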

Changed in charm-nova-cloud-controller:
status: Triaged → Fix Committed
David Ames (thedac)
Changed in charm-nova-cloud-controller:
status: Fix Committed → Fix Released