[api] API doesn't detect deployment failure

Bug #1311755 reported by Steve McLellan
This bug affects 1 person
Affects: Murano
Status: Invalid
Importance: Undecided
Assigned to: Stan Lagun
Milestone: none

Bug Description

If a failure occurs while an environment is being deployed, there's no feedback: the API keeps returning state:running, so the dashboard thinks the deployment is still in progress too.

From a _very_ brief look at the engine code it looks like errors should be handled through the RPC mechanism.

To reproduce, apply the patch below (which causes the engine to fail to load classes):

diff --git a/muranoapi/engine/package_loader.py b/muranoapi/engine/package_loader.py
index e294ecb..11658cc 100644
--- a/muranoapi/engine/package_loader.py
+++ b/muranoapi/engine/package_loader.py
@@ -64,6 +64,7 @@ class ApiPackageLoader(PackageLoader):
         self._client = self._get_murano_client(token_id, tenant_id)

     def get_package_by_class(self, name):
+        raise Exception("REMOVEME")
         filter_opts = {'class_name': name, 'limit': 1}

         try:

Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :
Changed in murano:
assignee: nobody → Timur Nurlygayanov (tnurlygayanov)
Timur Sufiev (tsufiev-x)
summary: - API doesn't detect deployment failure
+ [api] API doesn't detect deployment failure
Revision history for this message
Steve McLellan (sjmc7) wrote :

This is still an issue. The packageloader was modified such that the reproduction case doesn't cause it any more, but raising an Exception in common/engine.py still demonstrates it; the engine reports the error through the status reporter, but no error is indicated in the UI. The only Status entries in the database are 'deployment scheduled' and 'deployment finished'.
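
To illustrate the gap described above, here is a minimal sketch of an engine-side wrapper that records an explicit error entry when a deployment raises, so the Status table would hold more than 'deployment scheduled' and 'deployment finished'. All names in it (InMemoryStatusReporter, run_deployment, the "level"/"text" entry layout) are hypothetical and are not taken from the Murano code base; it only shows the shape of the idea, not how the engine is actually written.

```python
# Hypothetical sketch: none of these names come from the Murano code base.


class InMemoryStatusReporter:
    """Stand-in for the engine's status reporter; collects entries in a list."""

    def __init__(self):
        self.entries = []

    def report(self, level, text):
        self.entries.append({"level": level, "text": text})


def run_deployment(deploy, reporter):
    """Run `deploy` and always record how it ended.

    Returns True on success and False if `deploy` raised.
    """
    reporter.report("info", "deployment scheduled")
    try:
        deploy()
    except Exception as exc:
        # Record the failure explicitly instead of falling through to the
        # normal "finished" entry, so the API side has something to show.
        reporter.report("error", "deployment failed: %s" % exc)
        return False
    reporter.report("info", "deployment finished")
    return True


def failing_deploy():
    # Mirrors the reproduction patch from the bug description.
    raise Exception("REMOVEME")


reporter = InMemoryStatusReporter()
succeeded = run_deployment(failing_deploy, reporter)
```

With an entry like this stored, the API would have a concrete record to surface in the UI instead of reporting the deployment as still running or finished.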

Changed in murano:
assignee: Timur Nurlygayanov (tnurlygayanov) → nobody
status: New → Confirmed
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

This is an important bug and we should fix it, but it looks like we will only fix it in the next release.

Changed in murano:
milestone: none → juno-1
importance: Undecided → Medium
ruhe (ruhe)
Changed in murano:
importance: Medium → High
ruhe (ruhe)
Changed in murano:
assignee: nobody → Stan Lagun (slagun)
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

<sjmc7> oo, this was me
<sjmc7> we absolutely need to make sure the engine can report errors back to the API
<tnurlygayanov> yes, it is important
<sjmc7> especially since at the moment, the heat and murano records can be inconsistent
-*- ruhe raised importance to high
<sergmelikyan> I am not sure that we will fix this during juno-1. We have only 10 days left
<sjmc7> we need to at least have a good idea of how to handle it
<slagun> This was fixed. At least partially. If unhandled exception occurs during deployment API receives notification
<sjmc7> i.e. through the RPC path
<sergmelikyan> slagun, yeah, it is mentioned in comments to the issue. We need to redesign this part of the interaction between the API and the Engine a little
<sergmelikyan> slagun, what do you think - can you fix this during juno-1?
<tnurlygayanov> slagun, could you please add comments to this bug about this fix and how we can implement this in the future?
<sergmelikyan> as part of debuggability initiative?
<slagun> What exactly is left to do in that bug?
<tnurlygayanov> slagun looks like we don't know
<tnurlygayanov> and we should perform some investigation on this
<sergmelikyan> Add error state for environment? And rewrite part of the code that is responsible for sending notifications in case of exception.
<slagun> Me either. Need to think on that
<ruhe> slagun: will you please take an AI to work on this bug to figure out next steps?
<sergmelikyan> slagun, https://github.com/stackforge/murano/blob/master/murano/common/engine.py#L73
<tnurlygayanov> so, ok, I will add the comments to the description of this issue
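
One way to picture the "error state for environment" idea raised in the log is a function that derives a single environment state from the recorded status entries. This is only a sketch under stated assumptions: the state names ("pending", "deploying", "ready", "deploy failure") and the entry layout with "level" and "text" keys are invented for illustration and may not match what Murano uses.

```python
# Hypothetical sketch; state names and the entry layout are assumptions.


def derive_environment_state(status_entries):
    """Map a deployment's status entries to a single environment state."""
    if not status_entries:
        return "pending"
    # Any error entry wins, even if a "finished" entry was written afterwards.
    if any(entry["level"] == "error" for entry in status_entries):
        return "deploy failure"
    if status_entries[-1]["text"] == "deployment finished":
        return "ready"
    return "deploying"


scheduled = {"level": "info", "text": "deployment scheduled"}
finished = {"level": "info", "text": "deployment finished"}
failed = {"level": "error", "text": "deployment failed"}

state_ok = derive_environment_state([scheduled, finished])
state_failed = derive_environment_state([scheduled, failed, finished])
state_running = derive_environment_state([scheduled])
```

Letting an error entry take precedence over a later "finished" entry is a design choice in this sketch; it keeps a failed deployment from being displayed as ready.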

Revision history for this message
ruhe (ruhe) wrote :

slagun to figure out next steps

Stan Lagun (slagun)
Changed in murano:
milestone: juno-1 → juno-2
Changed in murano:
assignee: Stan Lagun (slagun) → nobody
Changed in murano:
milestone: juno-2 → juno-3
Changed in murano:
importance: High → Critical
Stan Lagun (slagun)
Changed in murano:
assignee: nobody → Stan Lagun (slagun)
Revision history for this message
Steve McLellan (sjmc7) wrote :

The attached error appears after I force the packageloader to fail by raising an exception (which could happen if a package was invalid, if there was a network issue, or for a bunch of other reasons). The Status is 'Ready', which gives no indication of the error, and I'm unable to try to deploy again even if I know something went wrong.
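
The redeploy problem could be addressed by blocking new deployments only while one is actually in progress, so that a failed environment can be retried. The sketch below assumes a hypothetical "deploy failure" state and a dict-based environment record; neither is taken from Murano.

```python
# Hypothetical sketch; state names are assumptions, not Murano's.
IN_PROGRESS_STATES = {"deploying", "deleting"}


class DeploymentInProgress(Exception):
    """Raised when a deployment is requested while another one is running."""


def start_deployment(environment):
    """Move `environment` (a dict with a "state" key) into "deploying"."""
    if environment["state"] in IN_PROGRESS_STATES:
        raise DeploymentInProgress(environment["state"])
    environment["state"] = "deploying"
    return environment


# A failed environment is allowed to deploy again.
env = {"state": "deploy failure"}
start_deployment(env)
```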

Steve McLellan (sjmc7)
Changed in murano:
status: Confirmed → Invalid
ruhe (ruhe)
Changed in murano:
importance: Critical → Undecided
milestone: juno-3 → none