nova-compute fails with DBNotAllowed error

Bug #1839360 reported by Dmitriy Rabotyagov on 2019-08-07
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
OpenStack Compute (nova) | | Medium | Matt Riedemann |
Rocky | | Medium | Matt Riedemann |
Stein | | Medium | Matt Riedemann |

Bug Description

Description
===========

During routine operations, or when running regular tempest checks, nova-compute tries to reach the database and fails with a DBNotAllowed error:
https://logs.opendev.org/33/660333/10/check/openstack-ansible-deploy-aio_metal-ubuntu-bionic/97d8bc3/logs/host/nova-compute.service.journal-23-20-40.log.txt.gz#_Aug_06_22_51_25

Steps to reproduce
==================

This can be reproduced by deploying all nova components (api, scheduler, conductor, compute) on the same host, as in an OSA all-in-one deployment. In such a setup a single configuration file (nova.conf) is shared by all services.

As a solution, nova could log more helpful information about why this happens, and a description of the problem could be added to the docs.
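
To check whether a given config file has the combination that triggers this, something like the following could be used. It is only a rough sketch and not part of nova: the helper name compute_may_hit_api_db and the sample connection URL are made up for illustration, and it uses Python's configparser rather than oslo.config, so it only approximates how nova parses the file. It flags a config that sets both [upgrade_levels]/compute=auto and [api_database]/connection.

```python
import configparser


def compute_may_hit_api_db(conf_text: str) -> bool:
    """Return True if the config sets compute=auto AND an API DB connection.

    Hypothetical helper for illustration only; nova itself does not ship it.
    """
    # interpolation=None keeps '%' characters in connection URLs literal;
    # strict=False tolerates repeated options, which nova.conf may contain.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    cap = parser.get("upgrade_levels", "compute", fallback="")
    api_db = parser.get("api_database", "connection", fallback="")
    return cap.strip().lower() == "auto" and bool(api_db.strip())


# Example: one nova.conf shared by every service on an all-in-one host.
SHARED_CONF = """
[upgrade_levels]
compute = auto

[api_database]
connection = mysql+pymysql://nova:secret@dbhost/nova_api
"""

# Example: a config used only by nova-compute, without API DB access.
COMPUTE_ONLY_CONF = """
[upgrade_levels]
compute = auto
"""

print(compute_may_hit_api_db(SHARED_CONF))
print(compute_may_hit_api_db(COMPUTE_ONLY_CONF))
```

If the first config is what nova-compute runs with, giving nova-compute its own file without the [api_database] section should avoid the combination.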

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: New → In Progress
Matt Riedemann (mriedem) wrote :

https://review.opendev.org/#/q/Icddbe4760eaff30e4e13c1e8d3d5d3f489dac3c4 goes back to stable/rocky so this should go back that far as well.

Changed in nova:
importance: Undecided → Medium
tags: added: docs serviceability

Reviewed: https://review.opendev.org/675148
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7d7d58509d5e60ec19c6310931dc62eeff033595
Submitter: Zuul
Branch: master

commit 7d7d58509d5e60ec19c6310931dc62eeff033595
Author: Matt Riedemann <email address hidden>
Date: Wed Aug 7 12:23:15 2019 -0400

    Add useful error log when _determine_version_cap raises DBNotAllowed

    Change Icddbe4760eaff30e4e13c1e8d3d5d3f489dac3c4 was intended for the
    API service to check all cells for the minimum nova-compute service
    version when [upgrade_levels]/compute=auto.

    That worked in the gate with devstack because we don't configure
    nova-compute with access to the database and run nova-compute with
    a separate nova-cpu.conf so even if nova-compute is on the same
    host as the nova-api service, they aren't using the same config
    file (nova-api runs with nova.conf which has access to the API DB
    obviously).

    The problem is when nova-compute is configured with
    [upgrade_levels]/compute=auto and an [api_database]/connection,
    there are flows that can try to hit the API database directly
    because of the _determine_version_cap method. For example, the
    _sync_power_states periodic task trying to stop an instance,
    or even simple inter-compute communication over RPC like during
    a resize.

    This change simply catches the DBNotAllowed exception, logs a more
    useful error message, and re-raises the exception. In addition,
    the config help for the [api_database] group and "connection"
    option specifically are updated to mention they should not be set
    on the nova-compute service.

    Change-Id: Iac2911a7a305a9d14bc6dadb364998f3ecb9ce42
    Related-Bug: #1807044
    Closes-Bug: #1839360
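
For readers who have not looked at the patch, the catch, log and re-raise approach described in the commit message looks roughly like the sketch below. This is not the nova code: the DBNotAllowed class here is a local stand-in, and get_minimum_version_all_cells, determine_version_cap and the log text are hypothetical names and wording chosen for illustration.

```python
import logging

LOG = logging.getLogger(__name__)


class DBNotAllowed(Exception):
    """Local stand-in for the error raised when a service may not use the DB."""


def get_minimum_version_all_cells():
    # Hypothetical stand-in for the API database lookup. It always fails
    # here, to mimic what happens when the lookup runs inside nova-compute.
    raise DBNotAllowed("nova-compute")


def determine_version_cap():
    """Look up the version cap, adding a useful log line on DBNotAllowed."""
    try:
        return get_minimum_version_all_cells()
    except DBNotAllowed:
        # Illustrative message only; it points the operator at the two
        # options named in the commit message.
        LOG.error(
            "This service is configured for access to the API database but "
            "is not allowed to use it directly. Check [upgrade_levels]/compute "
            "and [api_database]/connection in the config used by nova-compute."
        )
        # Re-raise so callers see the same failure as before the change.
        raise
```

The behaviour of the caller does not change in this sketch; the only difference is that the log now says which options to look at.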

Changed in nova:
status: In Progress → Fix Released

Reviewed: https://review.opendev.org/675714
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=bd03723a0c1a16d67433658fb486a84bb1bddf02
Submitter: Zuul
Branch: stable/stein

commit bd03723a0c1a16d67433658fb486a84bb1bddf02
Author: Matt Riedemann <email address hidden>
Date: Wed Aug 7 12:23:15 2019 -0400

    Add useful error log when _determine_version_cap raises DBNotAllowed

    Change Icddbe4760eaff30e4e13c1e8d3d5d3f489dac3c4 was intended for the
    API service to check all cells for the minimum nova-compute service
    version when [upgrade_levels]/compute=auto.

    That worked in the gate with devstack because we don't configure
    nova-compute with access to the database and run nova-compute with
    a separate nova-cpu.conf so even if nova-compute is on the same
    host as the nova-api service, they aren't using the same config
    file (nova-api runs with nova.conf which has access to the API DB
    obviously).

    The problem is when nova-compute is configured with
    [upgrade_levels]/compute=auto and an [api_database]/connection,
    there are flows that can try to hit the API database directly
    because of the _determine_version_cap method. For example, the
    _sync_power_states periodic task trying to stop an instance,
    or even simple inter-compute communication over RPC like during
    a resize.

    This change simply catches the DBNotAllowed exception, logs a more
    useful error message, and re-raises the exception. In addition,
    the config help for the [api_database] group and "connection"
    option specifically are updated to mention they should not be set
    on the nova-compute service.

    NOTE(mriedem): The test was modified to set the LAST_VERSION
    global to None since change I48109d5e32a2e9635c240da1c77f7f6cc7e3c76d
    is not in Stein.

    Change-Id: Iac2911a7a305a9d14bc6dadb364998f3ecb9ce42
    Related-Bug: #1807044
    Closes-Bug: #1839360
    (cherry picked from commit 7d7d58509d5e60ec19c6310931dc62eeff033595)
