nova-compute fails with DBNotAllowed error

Bug #1839360 reported by Dmitriy Rabotyagov
This bug affects 1 person
Affects                   Status         Importance  Assigned to     Milestone
OpenStack Compute (nova)  Fix Released   Medium      Matt Riedemann
Rocky                     Fix Committed  Medium      Matt Riedemann
Stein                     Fix Committed  Medium      Matt Riedemann

Bug Description

Description
===========

During routine operations, or while running regular tempest checks, nova-compute tries to reach the database and fails with a DBNotAllowed error:
https://logs.opendev.org/33/660333/10/check/openstack-ansible-deploy-aio_metal-ubuntu-bionic/97d8bc3/logs/host/nova-compute.service.journal-23-20-40.log.txt.gz#_Aug_06_22_51_25

Steps to reproduce
==================

This can be reproduced by deploying all nova components (api, scheduler, conductor, compute) on the same host, as in an OSA all-in-one deployment. In such a setup a single configuration file (nova.conf) is shared by all of the services.

As a solution, nova could log more helpful information about why this happens, and a description of the problem could be added to the docs.
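
For anyone checking their own deployment: the fix for this bug names the triggering combination as [upgrade_levels]/compute=auto together with an [api_database]/connection in the config that nova-compute reads. The sketch below tests a config file's text for that combination. It is only a rough check, assuming the file parses with Python's configparser (nova itself uses oslo.config), and the function name is made up for illustration.

```python
import configparser


def compute_can_hit_api_db(conf_text):
    """Return True if the config text has both settings named in this bug.

    Hypothetical helper: not part of nova or oslo.config.
    """
    # interpolation=None keeps '%' in connection URLs from being interpreted;
    # strict=False tolerates repeated sections or options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)

    level = parser.get("upgrade_levels", "compute", fallback="")
    connection = parser.get("api_database", "connection", fallback="")
    return level.strip().lower() == "auto" and bool(connection.strip())


if __name__ == "__main__":
    # Example values only; the connection URL is invented.
    shared_conf = (
        "[upgrade_levels]\n"
        "compute = auto\n"
        "[api_database]\n"
        "connection = mysql+pymysql://nova:secret@dbhost/nova_api\n"
    )
    print(compute_can_hit_api_db(shared_conf))
```

If this returns True for the file that nova-compute is started with, giving nova-compute its own config file without the [api_database] group (as devstack does with nova-cpu.conf) avoids the combination.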

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: New → In Progress
Matt Riedemann (mriedem) wrote :

https://review.opendev.org/#/q/Icddbe4760eaff30e4e13c1e8d3d5d3f489dac3c4 goes back to stable/rocky so this should go back that far as well.

Changed in nova:
importance: Undecided → Medium
tags: added: docs serviceability
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/675148
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7d7d58509d5e60ec19c6310931dc62eeff033595
Submitter: Zuul
Branch: master

commit 7d7d58509d5e60ec19c6310931dc62eeff033595
Author: Matt Riedemann <email address hidden>
Date: Wed Aug 7 12:23:15 2019 -0400

    Add useful error log when _determine_version_cap raises DBNotAllowed

    Change Icddbe4760eaff30e4e13c1e8d3d5d3f489dac3c4 was intended for the
    API service to check all cells for the minimum nova-compute service
    version when [upgrade_levels]/compute=auto.

    That worked in the gate with devstack because we don't configure
    nova-compute with access to the database and run nova-compute with
    a separate nova-cpu.conf so even if nova-compute is on the same
    host as the nova-api service, they aren't using the same config
    file (nova-api runs with nova.conf which has access to the API DB
    obviously).

    The problem is when nova-compute is configured with
    [upgrade_levels]/compute=auto and an [api_database]/connection,
    there are flows that can try to hit the API database directly
    because of the _determine_version_cap method. For example, the
    _sync_power_states periodic task trying to stop an instance,
    or even simple inter-compute communication over RPC like during
    a resize.

    This change simply catches the DBNotAllowed exception, logs a more
    useful error message, and re-raises the exception. In addition,
    the config help for the [api_database] group and "connection"
    option specifically are updated to mention they should not be set
    on the nova-compute service.

    Change-Id: Iac2911a7a305a9d14bc6dadb364998f3ecb9ce42
    Related-Bug: #1807044
    Closes-Bug: #1839360
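
The "catch, log a more useful message, re-raise" approach described in the commit message can be pictured roughly as follows. This is not the nova code: the exception class is a local stand-in, and the function names, the lookup callable, and the log wording are all assumptions made for illustration.

```python
import logging

LOG = logging.getLogger(__name__)


class DBNotAllowed(Exception):
    """Local stand-in for nova's exception of the same name."""


def lookup_minimum_compute_version():
    """Hypothetical lookup that would need direct API database access."""
    raise DBNotAllowed("nova-compute attempted direct database access")


def determine_version_cap(lookup=lookup_minimum_compute_version):
    """Run the lookup; on DBNotAllowed, log a configuration hint and re-raise."""
    try:
        return lookup()
    except DBNotAllowed:
        # The exception still propagates; the log line only tells the
        # operator what to look at in the configuration.
        LOG.error(
            "This service is configured for direct access to the API "
            "database, which is not allowed for nova-compute. Check that "
            "[api_database]/connection is not set in the configuration "
            "file used by nova-compute."
        )
        raise
```

The caller sees the same exception as before, so behaviour is unchanged; only the diagnostics improve.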

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/675714

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/stein)

Reviewed: https://review.opendev.org/675714
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=bd03723a0c1a16d67433658fb486a84bb1bddf02
Submitter: Zuul
Branch: stable/stein

commit bd03723a0c1a16d67433658fb486a84bb1bddf02
Author: Matt Riedemann <email address hidden>
Date: Wed Aug 7 12:23:15 2019 -0400

    Add useful error log when _determine_version_cap raises DBNotAllowed

    Change Icddbe4760eaff30e4e13c1e8d3d5d3f489dac3c4 was intended for the
    API service to check all cells for the minimum nova-compute service
    version when [upgrade_levels]/compute=auto.

    That worked in the gate with devstack because we don't configure
    nova-compute with access to the database and run nova-compute with
    a separate nova-cpu.conf so even if nova-compute is on the same
    host as the nova-api service, they aren't using the same config
    file (nova-api runs with nova.conf which has access to the API DB
    obviously).

    The problem is when nova-compute is configured with
    [upgrade_levels]/compute=auto and an [api_database]/connection,
    there are flows that can try to hit the API database directly
    because of the _determine_version_cap method. For example, the
    _sync_power_states periodic task trying to stop an instance,
    or even simple inter-compute communication over RPC like during
    a resize.

    This change simply catches the DBNotAllowed exception, logs a more
    useful error message, and re-raises the exception. In addition,
    the config help for the [api_database] group and "connection"
    option specifically are updated to mention they should not be set
    on the nova-compute service.

    NOTE(mriedem): The test was modified to set the LAST_VERSION
    global to None since change I48109d5e32a2e9635c240da1c77f7f6cc7e3c76d
    is not in Stein.

    Change-Id: Iac2911a7a305a9d14bc6dadb364998f3ecb9ce42
    Related-Bug: #1807044
    Closes-Bug: #1839360
    (cherry picked from commit 7d7d58509d5e60ec19c6310931dc62eeff033595)

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/679449

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 20.0.0.0rc1

This issue was fixed in the openstack/nova 20.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.0.3

This issue was fixed in the openstack/nova 19.0.3 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/rocky)

Reviewed: https://review.opendev.org/679449
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7732f0e6f3691025a06f2abfc5a57c71d83e5e72
Submitter: Zuul
Branch: stable/rocky

commit 7732f0e6f3691025a06f2abfc5a57c71d83e5e72
Author: Matt Riedemann <email address hidden>
Date: Wed Aug 7 12:23:15 2019 -0400

    Add useful error log when _determine_version_cap raises DBNotAllowed

    Change Icddbe4760eaff30e4e13c1e8d3d5d3f489dac3c4 was intended for the
    API service to check all cells for the minimum nova-compute service
    version when [upgrade_levels]/compute=auto.

    That worked in the gate with devstack because we don't configure
    nova-compute with access to the database and run nova-compute with
    a separate nova-cpu.conf so even if nova-compute is on the same
    host as the nova-api service, they aren't using the same config
    file (nova-api runs with nova.conf which has access to the API DB
    obviously).

    The problem is when nova-compute is configured with
    [upgrade_levels]/compute=auto and an [api_database]/connection,
    there are flows that can try to hit the API database directly
    because of the _determine_version_cap method. For example, the
    _sync_power_states periodic task trying to stop an instance,
    or even simple inter-compute communication over RPC like during
    a resize.

    This change simply catches the DBNotAllowed exception, logs a more
    useful error message, and re-raises the exception. In addition,
    the config help for the [api_database] group and "connection"
    option specifically are updated to mention they should not be set
    on the nova-compute service.

    Change-Id: Iac2911a7a305a9d14bc6dadb364998f3ecb9ce42
    Related-Bug: #1807044
    Closes-Bug: #1839360
    (cherry picked from commit 7d7d58509d5e60ec19c6310931dc62eeff033595)
    (cherry picked from commit bd03723a0c1a16d67433658fb486a84bb1bddf02)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.2.3

This issue was fixed in the openstack/nova 18.2.3 release.
