Regiond crashes: sequence item 0: expected str instance, ConnectionLost found

Bug #2060839 reported by Alexander Balderson
Affects  Status   Importance  Assigned to  Milestone
MAAS     Triaged  Undecided   Unassigned
3.5      Triaged  Undecided   Unassigned

Bug Description

While adding new bare-metal nodes to an HA deployment of MAAS 3.5, regiond crashes on one of the MAAS hosts because the connection to the bare-metal host was closed unexpectedly.

The machine is being created in MAAS with:
maas root machines create hostname=beartic power_type=ipmi architecture=amd64/generic mac_addresses=BA:DB:AD:BA:DB:AD power_parameters_power_address=10.246.56.11 zone=zone3 power_parameters_power_user=Administrator power_parameters_power_pass=insecure power_parameters_power_driver=LAN_2_0 power_parameters_power_boot_type=efi power_parameters_cipher_suite_id=3 power_parameters_k_g=

This is the 6th machine being added to this MAAS; the first 5 all succeeded.

From the attached logs, on host 10.244.40.32 you can see regiond going down at the same time as the call to create the machine:

Apr 9 18:56:51 swoobat regiond[1297846]: maasserver.utils.views: [error] 500 Internal Server Error @ /MAAS/api/2.0/machines/
Apr 9 18:56:51 swoobat regiond[1297846]: Traceback (most recent call last):
Apr 9 18:56:51 swoobat regiond[1297846]: File "/usr/lib/python3/dist-packages/maasserver/utils/views.py", line 248, in handle_uncaught_exception
Apr 9 18:56:51 swoobat regiond[1297846]: raise exc from exc.__cause__
Apr 9 18:56:51 swoobat regiond[1297846]: File "/usr/lib/python3/dist-packages/maasserver/utils/views.py", line 317, in get_response
Apr 9 18:56:51 swoobat regiond[1297846]: with post_commit_hooks:
Apr 9 18:56:51 swoobat regiond[1297846]: File "/usr/lib/python3/dist-packages/maasserver/utils/orm.py", line 636, in __exit__
Apr 9 18:56:51 swoobat regiond[1297846]: self.fire()
Apr 9 18:56:51 swoobat regiond[1297846]: File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 203, in wrapper
Apr 9 18:56:51 swoobat regiond[1297846]: result = func(*args, **kwargs)
Apr 9 18:56:51 swoobat regiond[1297846]: File "/usr/lib/python3/dist-packages/maasserver/utils/asynchronous.py", line 210, in fire
Apr 9 18:56:51 swoobat regiond[1297846]: self._fire_in_reactor(hook).wait(LONGTIME)
Apr 9 18:56:51 swoobat regiond[1297846]: File "/usr/lib/python3/dist-packages/crochet/_eventloop.py", line 198, in wait
Apr 9 18:56:51 swoobat regiond[1297846]: result.raiseException()
Apr 9 18:56:51 swoobat regiond[1297846]: File "/usr/lib/python3/dist-packages/twisted/python/failure.py", line 475, in raiseException
Apr 9 18:56:51 swoobat regiond[1297846]: raise self.value.with_traceback(self.tb)
Apr 9 18:56:51 swoobat regiond[1297846]: twisted.internet.error.ConnectionLost: <unprintable ConnectionLost object>
Apr 9 18:56:51 swoobat regiond[1297846]: During handling of the above exception, another exception occurred:
Apr 9 18:56:51 swoobat regiond[1297846]: Traceback (most recent call last):
Apr 9 18:56:51 swoobat regiond[1297846]: File "/usr/lib/python3/dist-packages/maasserver/utils/views.py", line 250, in handle_uncaught_exception
Apr 9 18:56:51 swoobat regiond[1297846]: response = self.process_exception_by_middleware(exc, request)
Apr 9 18:56:51 swoobat regiond[1297846]: File "/usr/lib/python3/dist-packages/django/core/handlers/base.py", line 339, in process_exception_by_middleware
Apr 9 18:56:51 swoobat regiond[1297846]: response = middleware_method(request, exception)
Apr 9 18:56:51 swoobat regiond[1297846]: File "/usr/lib/python3/dist-packages/maasserver/middleware.py", line 195, in process_exception
Apr 9 18:56:51 swoobat regiond[1297846]: self.log_exception(exception)
Apr 9 18:56:51 swoobat regiond[1297846]: File "/usr/lib/python3/dist-packages/maasserver/middleware.py", line 205, in log_exception
Apr 9 18:56:51 swoobat regiond[1297846]: logger.error(" Exception: %s ".center(79, "#") % str(exception))
Apr 9 18:56:51 swoobat regiond[1297846]: File "/usr/lib/python3/dist-packages/twisted/internet/error.py", line 205, in __str__
Apr 9 18:56:51 swoobat regiond[1297846]: s.append(" ".join(self.args))
Apr 9 18:56:51 swoobat regiond[1297846]: TypeError: sequence item 0: expected str instance, ConnectionLost found
Apr 9 18:56:51 swoobat regiond[1297846]: regiond: [info] 127.0.0.1 POST /MAAS/api/2.0/machines/ HTTP/1.1 --> 500 INTERNAL_SERVER_ERROR (referrer: -; agent: Python-httplib2/0.20.2 (gzip))
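
For illustration, the second traceback can be reproduced outside MAAS with a stand-in exception class. This is a minimal sketch, not Twisted's actual code: `FakeConnectionLost` is a hypothetical name, and only the `" ".join(self.args)` line is taken from the traceback above. It shows that `str()` fails as soon as `args[0]` is itself an exception rather than a string.

```python
class FakeConnectionLost(Exception):
    """Hypothetical stand-in; only the join line mirrors the traceback."""

    def __str__(self):
        s = ["Connection lost"]
        if self.args:
            s.append(": ")
            # Same pattern as the failing line: assumes every arg is a str.
            s.append(" ".join(self.args))
        return "".join(s)


inner = FakeConnectionLost("peer closed the connection")
# Wrapping one exception in another puts a non-str object into args.
outer = FakeConnectionLost(inner)

try:
    str(outer)
except TypeError as err:
    message = str(err)
    print(message)
```

The "<unprintable ConnectionLost object>" text in the first traceback is consistent with this: the exception could not be converted to a string there either.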

I am unsure why the connection to the host was lost, but the error needs to be handled in a way that doesn't take down all of regiond. regiond comes back up about 30 seconds later.

The logs are attached; the test run can be found at:
https://solutions.qa.canonical.com/testruns/f45ceaa9-3fd0-4cb9-8b7e-255157f74c31/
and the maas logs can also be found at:
https://oil-jenkins.canonical.com/artifacts/f45ceaa9-3fd0-4cb9-8b7e-255157f74c31/generated/generated/maas/logs-2024-04-09-18.59.34.tgz

Jacopo Rota (r00ta)
Changed in maas:
milestone: none → 3.6.0
status: New → Triaged