MSCM power drivers throwing EOFError intermittently
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Fix Released | Medium | Newell Jensen |
1.10 | Fix Released | Medium | Newell Jensen |
1.9 | Fix Released | Medium | Newell Jensen |
Bug Description
Problem Description:
The MAAS power drivers for the HP Moonshot can no longer query the node power status.
The MAAS logs show many errors like the ones below across all of the systems.
==> /var/log/
Feb 11 17:22:11 maas-devel maas.power: [ERROR] ms10-39-mcdivitt: Failed to refresh power state:
Feb 11 17:22:13 maas-devel maas.power: [INFO] ms10-05-avaton: Power state has changed from error to off.
Feb 11 17:22:14 maas-devel maas.power: [INFO] ms10-18n2-slayton: Power state has changed from error to off.
Feb 11 17:23:40 maas-devel maas.power: [ERROR] Power state could not be queried:
Feb 11 17:23:40 maas-devel maas.power: [ERROR] ms10-01-avaton.1ss: Failed to refresh power state:
Feb 11 17:23:42 maas-devel maas.power: [INFO] ms10-18n3-slayton: Power state has changed from error to off.
Feb 11 17:27:25 maas-devel maas.power: [ERROR] Power state could not be queried:
Feb 11 17:27:25 maas-devel maas.power: [ERROR] ms10-39-mcdivitt: Failed to refresh power state:
Feb 11 17:27:26 maas-devel maas.power: [ERROR] Power state could not be queried:
Feb 11 17:27:26 maas-devel maas.power: [ERROR] ms10-18n1-slayton: Failed to refresh power state:
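To see which nodes are affected and how often, the log lines above can be tallied per node. The following is a minimal sketch that assumes the syslog-style format shown in this report; the function name `count_refresh_failures` and the regular expression are illustrative and are not part of MAAS.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpt in this report:
# "<date> <host> maas.power: [ERROR] <node>: Failed to refresh power state:"
FAILED_REFRESH = re.compile(
    r"maas\.power: \[ERROR\] (?P<node>\S+): Failed to refresh power state"
)


def count_refresh_failures(lines):
    """Return a Counter mapping node name to number of failed refreshes."""
    counts = Counter()
    for line in lines:
        match = FAILED_REFRESH.search(line)
        if match:
            counts[match.group("node")] += 1
    return counts
```

It can be fed an open log file directly, for example `count_refresh_failures(open(path))`, where `path` points at the MAAS log on the affected system.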
This does not appear to break the actual commissioning/
Failed to query node's BMC - Power state could not be queried: Thu, 11 Feb. 2016 17:27:26
Failed to query node's BMC - Power state could not be queried: Thu, 11 Feb. 2016 16:34:28
I have tried removing the host and adding it back into MAAS, and restarting maas-clusterd. This was an upgrade from MAAS 1.8.3 to 1.9, not a fresh install of 1.9.
maas:
Installed: 1.9.0+bzr4533-
Candidate: 1.9.0+bzr4533-
Version table:
*** 1.9.0+bzr4533-
500 http://
100 /var/lib/
MAAS installed from ppa:maas/stable
ii maas 1.9.0+bzr4533-
ii maas-cli 1.9.0+bzr4533-
ii maas-cluster-
ii maas-common 1.9.0+bzr4533-
ii maas-dhcp 1.9.0+bzr4533-
ii maas-dns 1.9.0+bzr4533-
ii maas-proxy 1.9.0+bzr4533-
ii maas-region-
ii maas-region-
ii python-django-maas 1.9.0+bzr4533-
ii python-maas-client 1.9.0+bzr4533-
ii python-
Related branches
- Blake Rouse (community): Approve
Diff: 1201 lines (+453/-670)
5 files modified
src/provisioningserver/drivers/hardware/mscm.py (+0/-217)
src/provisioningserver/drivers/hardware/tests/test_mscm.py (+0/-366)
src/provisioningserver/drivers/power/mscm.py (+185/-26)
src/provisioningserver/drivers/power/tests/test_mscm.py (+267/-60)
src/provisioningserver/rpc/clusterservice.py (+1/-1)
Changed in maas: | |
assignee: | nobody → Newell Jensen (newell-jensen) |
Changed in maas: | |
milestone: | none → 2.0.0 |
Changed in maas: | |
status: | Confirmed → Fix Committed |
Changed in maas: | |
status: | Fix Committed → Fix Released |
Additional Crashes from maas clusterd.log
==> /var/log/maas/maas.log <==
Feb 12 12:48:10 maas-devel maas.power: [ERROR] Power state could not be queried:
Feb 12 12:48:10 maas-devel maas.power: [ERROR] ms10-03-avaton.1ss: Failed to refresh power state:
==> /var/log/maas/clusterd.log <==
2016-02-12 12:48:10-0500 [ClusterClient,client] Failed to refresh power state.
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 423, in errback
self._startRunCallbacks(fail)
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 490, in _startRunCallbacks
self._runCallbacks()
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1155, in gotResult
_inlineCallbacks(r, g, deferred)
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1097, in _inlineCallbacks
result = result.throwExceptionIntoGenerator(g)
File "/usr/lib/python2.7/dist-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/provisioningserver/power/query.py", line 126, in get_power_state
system_id, hostname, power_type, context)
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1097, in _inlineCallbacks
result = result.throwExceptionIntoGenerator(g)
File "/usr/lib/python2.7/dist-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/provisioningserver/drivers/power/__init__.py", line 246, in query
self.power_query, system_id, context)
File "/usr/lib/python2.7/dist-packages/twisted/python/threadpool.py", line 191, in _worker
result = context.call(ctx, function, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
return func(*args,**kw)
File "/usr/lib/python2.7/dist-packages/provisioningserver/drivers/power/mscm.py", line 57, in power_query
return power_state_mscm(host, username, password, node_id)
File "/usr/lib/python2.7/dist-packages/provisioningserver/drivers/hardware/mscm.py", line 184, in power_state_mscm
power_state = mscm.get_node_power_state(node_id)
File "/usr/lib/python2.7/dist-packages/provisioningserver/drivers/hardware/mscm.py", line 143, in get_node_power_state
power_state = self._run_cli_command("show node power %s" % node_id)
File "/usr/lib/python2.7/dist-packages/provisioningserver/drivers/hardware/mscm.py", line 74, in _run_cli_command
self.host, username=self.username, password=self.password)
File "/usr/lib/python2.7/dist-packages/paramiko/client.py", line 306, in connect
t.start_client()
...
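The traceback ends inside paramiko's `connect()` call and, per the bug title, the failure surfaces intermittently as `EOFError`. One possible mitigation for a transient failure of this kind is to retry the call a bounded number of times. The sketch below is an assumption for illustration only: it is not the fix that was merged for this bug, and `retry_on_eof` is a hypothetical helper, not a MAAS or paramiko function.

```python
import time


def retry_on_eof(func, attempts=3, delay=0.0):
    """Call func() and return its result, retrying when it raises EOFError.

    The last EOFError is re-raised once `attempts` calls have failed.
    Other exception types propagate immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except EOFError:
            if attempt == attempts:
                raise
            # Give the remote end a moment before trying again.
            time.sleep(delay)
```

A caller would wrap whatever opens the SSH session and runs the query, for example `retry_on_eof(lambda: query_power(node_id), attempts=3, delay=1.0)`, where `query_power` is a placeholder for that call.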