Missing exception handling mechanism in 'schedule_and_build_instances' for DBError at line 1180 of nova/conductor/manager.py

Bug #1800508 reported by Wallace Cardoso
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Opinion
Importance: Low
Assigned to: Unassigned
Milestone: none listed

Bug Description

Description
==============
If an error occurs during instance creation, the user cannot tell what happened to the VM, which remains in the 'build' state indefinitely. Normally, when an exception interrupts the VM creation workflow in the schedule_and_build_instances method, the VM should end up in the 'error' state.

Steps to reproduce
=====================
1) Create a VM;
2) Inject an out-of-range value into "schedule_and_build_instances.args.build_requests->'nova_object.data'.instance.'nova_object.data'.instance_type_id"; this is enough to cause a DBError. For instance, the value 1E+22 can be used.
3) An exception is thrown, but there seems to be no appropriate handling when this DBError occurs.

Expected result
==================
The VM is put in 'error' state

Actual result
================
The VM stays in the 'build' state indefinitely, and the user will never know (without searching the logs) what happened to the VM.

Environment
==============
Devstack/Stable/Queens.

Logs & Configs
=================
Logs attached.

Revision history for this message
Wallace Cardoso (wallacec) wrote :
description: updated
summary: Missing exception handling mechanism in 'schedule_and_build_instances'
- for DBError at line 1180
+ for DBError at line 1180 of nova/conductor/manager.py
Revision history for this message
Matt Riedemann (mriedem) wrote :

Is this just an example of how you're recreating the fault?

"""
Inject an out-of-range value into "schedule_and_build_instances.args.build_requests->'nova_object.data'.instance.'nova_object.data'.instance_type_id"; this is enough to cause a DBError. For instance, the value 1E+22 can be used.
"""

Because unless I'm missing something, that's not a real fault that can be injected externally from a user because the instance_type_id comes from the flavors.id which isn't going to be 1E+22.

tags: added: fault serviceability
Revision history for this message
Matt Riedemann (mriedem) wrote :

In other words, the DB can fail literally *anywhere* that we are writing data (or even reading for that matter) so I'm not sure how helpful this is, except to just have a blanket "try/except Exception" block when creating an instance record in the database and if that fails, set the instance to error state - not that we can't set it to error state if we can't insert the record in the DB in the first place...
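
For illustration only, the blanket try/except idea and its caveat can be sketched as below. This is not nova's code: `FakeDB`, `build_instance`, the local `DBError` class (a stand-in for a database-layer exception such as oslo.db's `DBError`), and the 32-bit range check are all hypothetical names and assumptions made up for the sketch.

```python
# Hypothetical sketch; none of these names come from nova.
class DBError(Exception):
    """Stand-in for a database-layer failure."""


class FakeDB:
    """In-memory stand-in for the table holding instance records."""

    def __init__(self):
        self.instances = {}

    def create_instance(self, uuid, instance_type_id):
        # Assumption: the column is a signed 32-bit integer, so an
        # out-of-range value such as 1E+22 is rejected.
        if not -2**31 <= instance_type_id < 2**31:
            raise DBError("instance_type_id out of range")
        self.instances[uuid] = {
            "vm_state": "building",
            "instance_type_id": instance_type_id,
        }

    def set_vm_state(self, uuid, vm_state):
        # Raises KeyError if the record was never inserted.
        self.instances[uuid]["vm_state"] = vm_state


def build_instance(db, uuid, instance_type_id):
    """Create the record; on any failure, try to mark it as 'error'."""
    try:
        db.create_instance(uuid, instance_type_id)
    except Exception:
        try:
            db.set_vm_state(uuid, "error")
        except KeyError:
            # The insert itself failed, so there is no record to update.
            return "no-record"
        return "error"
    return "building"
```

In this sketch, a valid `instance_type_id` leaves the record in `building`, while an out-of-range value fails the insert and leaves nothing to flag as `error`, which is the limitation described above.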

Revision history for this message
Matt Riedemann (mriedem) wrote :

*note that we can't

Changed in nova:
importance: Undecided → Low
status: New → Triaged
Revision history for this message
sean mooney (sean-k-mooney) wrote :

the admin could set the flavor id by hand to 1E+22 via the api when they create the flavor
but in that case the flavor create should fail for the same reason and therefore you should never
be able to boot a vm with that flavor, so yeah, I'm not following how
this could happen in practice.

Matt Riedemann (mriedem)
tags: added: fault-injection
removed: fault
Changed in nova:
status: Triaged → Opinion
Revision history for this message
Wallace Cardoso (wallacec) wrote :

It is just an example showing that any database error can cause this. Fortunately, I found it by injecting faults through the nova interface in the scenario described.

We also cannot rule out that a future change in nova will cause this same error. Why not prevent it now? This bug leaves the instance in the 'build' state until the user notices and deletes it; otherwise, the instance stays that way forever. Could this affect resource allocation (consuming resources)? The user could be anything, for instance an application or a person. It is worth discussing what the intended behavior is in that case.
