We've hit a few failed to release errors while builds were releasing:
ubuntu@maas-trusty-back-may22:/var/log/maas$ grep -i "releasing to failed" /var/log/maas/maas.log /var/log/maas/maas.log.1
/var/log/maas/maas.log:Jul 7 13:04:58 maas-trusty-back-may22 maas.node: [INFO] hayward-11: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:04:58 maas-trusty-back-may22 maas.node: [INFO] hayward-7: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:04:58 maas-trusty-back-may22 maas.node: [INFO] hayward-36: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:04:58 maas-trusty-back-may22 maas.node: [INFO] hayward-16: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:04:59 maas-trusty-back-may22 maas.node: [INFO] hayward-04: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:17:32 maas-trusty-back-may22 maas.node: [INFO] noma.local: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:17:32 maas-trusty-back-may22 maas.node: [INFO] kobusch.local: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:17:32 maas-trusty-back-may22 maas.node: [INFO] hayward-19: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:17:32 maas-trusty-back-may22 maas.node: [INFO] hayward-13: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:17:32 maas-trusty-back-may22 maas.node: [INFO] everitt.local: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:17:32 maas-trusty-back-may22 maas.node: [INFO] hayward-15: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:17:32 maas-trusty-back-may22 maas.node: [INFO] hayward-18: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:17:32 maas-trusty-back-may22 maas.node: [INFO] emerson.local: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:17:32 maas-trusty-back-may22 maas.node: [INFO] everitt.local: Status transition from RELEASING to FAILED_RELEASING
/var/log/maas/maas.log:Jul 7 13:17:33 maas-trusty-back-may22 maas.node: [INFO] hayward-43: Status transition from RELEASING to FAILED_RELEASING
ubuntu@maas-trusty-back-may22:/var/log/maas$ head -n 1 /var/log/maas/maas.log.1
Jul 6 06:40:41 maas-trusty-back-may22 maas.thread: [INFO] 9193 queue=0, waiters=10, working=0
ubuntu@maas-trusty-back-may22:/var/log/maas$
In the clusterd.log, we see these exceptions:
2015-07-07 13:00:02+0000 [TFTP (UDP)] Datagram received from ('10.245.0.233', 2073): <RRQDatagram(filename=pxelinux.0, mode=octet, options={'tsize': '0'})>
2015-07-07 13:00:02+0000 [-] Unhandled error in Deferred:
2015-07-07 13:00:02+0000 [-] Unhandled Error
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/tftp/protocol.py", line 67, in _startSession
context, self.backend.get_reader, datagram.filename)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
return func(*args,**kw)
File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/twisted.py", line 68, in wrapper
return maybeDeferred(func, *args, **kwargs)
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 139, in maybeDeferred
result = f(*args, **kw)
File "/usr/lib/python2.7/dist-packages/provisioningserver/pserv_services/tftp.py", line 254, in get_reader
mac_address = get_remote_mac()
File "/usr/lib/python2.7/dist-packages/provisioningserver/boot/__init__.py", line 108, in get_remote_mac
return find_mac_via_arp(remote_host)
File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/network.py", line 101, in find_mac_via_arp
output = call_and_check(['ip', 'neigh'], env={'LC_ALL': 'C'})
File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/shell.py", line 142, in call_and_check
process = Popen(command, *args, stdout=PIPE, stderr=PIPE, **kwargs)
File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1223, in _execute_child
self.pid = os.fork()
exceptions.OSError: [Errno 12] Cannot allocate memory
2015-07-07 13:00:10+0000 [TFTP (UDP)] Datagram received from ('10.245.0.233', 2074): <RRQDatagram(filename=pxelinux.0, mode=octet, options={'tsize': '0'})>
2015-07-07 13:00:10+0000 [-] Unhandled error in Deferred:
2015-07-07 13:00:10+0000 [-] Unhandled Error
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/tftp/protocol.py", line 67, in _startSession
context, self.backend.get_reader, datagram.filename)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
return func(*args,**kw)
File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/twisted.py", line 68, in wrapper
return maybeDeferred(func, *args, **kwargs)
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 139, in maybeDeferred
result = f(*args, **kw)
File "/usr/lib/python2.7/dist-packages/provisioningserver/pserv_services/tftp.py", line 254, in get_reader
mac_address = get_remote_mac()
File "/usr/lib/python2.7/dist-packages/provisioningserver/boot/__init__.py", line 108, in get_remote_mac
return find_mac_via_arp(remote_host)
File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/network.py", line 101, in find_mac_via_arp
output = call_and_check(['ip', 'neigh'], env={'LC_ALL': 'C'})
File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/shell.py", line 142, in call_and_check
process = Popen(command, *args, stdout=PIPE, stderr=PIPE, **kwargs)
File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1223, in _execute_child
self.pid = os.fork()
exceptions.OSError: [Errno 12] Cannot allocate memory
and a lot more of these starting at 13:07::
2015-07-07 13:06:01+0000 [-] Stopping protocol <tftp.bootstrap.RemoteOriginReadSession instance at 0x7f9188626050>
2015-07-07 13:07:26+0000 [-] Failed to refresh power state.
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 423, in errback
self._startRunCallbacks(fail)
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 490, in _startRunCallbacks
self._runCallbacks()
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1155, in gotResult
_inlineCallbacks(r, g, deferred)
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1097, in _inlineCallbacks
result = result.throwExceptionIntoGenerator(g)
File "/usr/lib/python2.7/dist-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/provisioningserver/rpc/power.py", line 340, in get_power_state
perform_power_query, system_id, hostname, power_type, context)
File "/usr/lib/python2.7/dist-packages/twisted/python/threadpool.py", line 191, in _worker
result = context.call(ctx, function, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
return func(*args,**kw)
File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/twisted.py", line 154, in wrapper
return func(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/provisioningserver/rpc/power.py", line 313, in perform_power_query
return action.execute(power_change='query', **context)
File "/usr/lib/python2.7/dist-packages/provisioningserver/power/poweraction.py", line 139, in execute
context = self.update_context(context)
File "/usr/lib/python2.7/dist-packages/provisioningserver/power/poweraction.py", line 95, in update_context
ip_address = find_ip_via_arp(mac_address)
File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/network.py", line 77, in find_ip_via_arp
output = call_and_check(['arp', '-n']).split('\n')
File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/shell.py", line 142, in call_and_check
process = Popen(command, *args, stdout=PIPE, stderr=PIPE, **kwargs)
File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1223, in _execute_child
self.pid = os.fork()
exceptions.OSError: [Errno 12] Cannot allocate memory
2015-07-07 13:07:26+0000 [-] Failed to refresh power state.
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 423, in errback
self._startRunCallbacks(fail)
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 490, in _startRunCallbacks
self._runCallbacks()
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1155, in gotResult
_inlineCallbacks(r, g, deferred)
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1097, in _inlineCallbacks
result = result.throwExceptionIntoGenerator(g)
File "/usr/lib/python2.7/dist-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "/usr/lib/python2.7/dist-packages/provisioningserver/rpc/power.py", line 340, in get_power_state
perform_power_query, system_id, hostname, power_type, context)
File "/usr/lib/python2.7/dist-packages/twisted/python/threadpool.py", line 191, in _worker
result = context.call(ctx, function, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
return func(*args,**kw)
File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/twisted.py", line 154, in wrapper
return func(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/provisioningserver/rpc/power.py", line 313, in perform_power_query
return action.execute(power_change='query', **context)
File "/usr/lib/python2.7/dist-packages/provisioningserver/power/poweraction.py", line 139, in execute
context = self.update_context(context)
File "/usr/lib/python2.7/dist-packages/provisioningserver/power/poweraction.py", line 95, in update_context
ip_address = find_ip_via_arp(mac_address)
File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/network.py", line 77, in find_ip_via_arp
output = call_and_check(['arp', '-n']).split('\n')
File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/shell.py", line 142, in call_and_check
process = Popen(command, *args, stdout=PIPE, stderr=PIPE, **kwargs)
File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1223, in _execute_child
self.pid = os.fork()
exceptions.OSError: [Errno 12] Cannot allocate memory
This is the pmap -d output for maas processes:
https://pastebin.canonical.com/134758/
We are recreating this on a weekly basis it seems. We had another occurence yesterday. The previous one was the week before.