Internal server error when attempting to perform an action when the cluster is down

Bug #1374321 reported by Raphaël Badin
Affects: MAAS
Status: Fix Released
Importance: High
Assigned to: Graham Binns

Bug Description

When trying to deploy a node in the UI with the cluster service down, I get an "Internal server error" page. I was expecting an error, but the one I'm getting is not really helpful: a blank page with "Internal server error" (i.e. no indication of what the problem actually is). Also, the only error I can see is in /var/log/maas/maas-django.log (it should be in /var/log/maas/maas.log).

/var/log/maas/maas-django.log:

  File "/usr/lib/python2.7/dist-packages/maasserver/clusterrpc/dhcp.py", line 183, in remove_host_maps
    for nodegroup in removal_mappings
  File "/usr/lib/python2.7/dist-packages/maasserver/clusterrpc/dhcp.py", line 183, in <dictcomp>
    for nodegroup in removal_mappings
  File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/twisted.py", line 106, in wrapper
    return func_in_reactor(*args, **kwargs).wait()
  File "/usr/lib/python2.7/dist-packages/crochet/_eventloop.py", line 219, in wait
    result.raiseException()
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 577, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/lib/python2.7/dist-packages/maasserver/rpc/regionservice.py", line 514, in cancelled
    uuid)
NoConnectionsAvailable: Unable to connect to cluster 06985604-940f-43d0-8b4c-701cc2e46c4f; no connections available.

Tags: rpc

Raphaël Badin (rvb) wrote :

I haven't tested it, but I suspect the same kind of error happens when calling methods from the API. Here again, a 5XX is expected, but the error message should contain an explanation of the failure's cause.

Changed in maas:
milestone: none → 1.7.0
Graham Binns (gmb) wrote :

This is something that we need to handle when sending power commands to nodes; do we have a nice, general way in mind of catching and handling this error? I'm playing pretty close to this code right now and would love to not find myself duplicating stuff.

Raphaël Badin (rvb) wrote :

I think using a middleware is the nicest way to catch and deal with exceptions that arise from different places in the code. We already have this in place for the API: see APIErrorsMiddleware. The same pattern can be used for non-API exceptions, although in this case we could issue a warning using Django's message framework and then reload the page.
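
To make the suggestion concrete, here is a minimal sketch of what such a middleware could look like. It is not the MAAS implementation: the class name ClusterErrorsMiddleware is hypothetical, and FakeRequest, Redirect and the local NoConnectionsAvailable are stand-ins so the example runs without Django installed. In a real Django project the message would presumably go through django.contrib.messages and the reload would be an HttpResponseRedirect; only the process_exception(request, exception) hook shape is taken from Django itself.

```python
class NoConnectionsAvailable(Exception):
    """Stand-in for the RPC exception seen in the traceback."""


class FakeRequest:
    """Hypothetical, minimal stand-in for a Django HttpRequest."""

    def __init__(self, path):
        self.path = path
        self.messages = []  # stand-in for Django's message storage


class Redirect:
    """Hypothetical stand-in for django.http.HttpResponseRedirect."""

    status_code = 302

    def __init__(self, location):
        self.location = location


class ClusterErrorsMiddleware:
    """Turn cluster connection failures into a message plus a page reload.

    Mirrors the contract of Django's process_exception() hook: return a
    response to handle the exception, or None to let it propagate.
    """

    handled_exceptions = (NoConnectionsAvailable,)

    def process_exception(self, request, exception):
        if not isinstance(exception, self.handled_exceptions):
            # Not ours: fall through to the default 500 handling.
            return None
        # Queue the explanation for display, then reload the same page.
        request.messages.append(str(exception))
        return Redirect(request.path)
```

With this shape, a view that raises NoConnectionsAvailable would send the user back to the page they were on with the exception text shown as a message, while any other exception still reaches the normal error handling.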

Gavin Panella (allenap)
Changed in maas:
assignee: nobody → Graham Binns (gmb)
status: Triaged → In Progress
Raphaël Badin (rvb) wrote :

QAed this today: works fine. Like I said on IRC, it's a bit weird to display the cluster's uuid and not the cluster's name in the message. Plus I'd prefix the error message with "Error: " but that's just me :).

Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released