MaaS 2.0 beta 5 fails to assign IP address to nodes when multiple nodes go into deploying at once

Bug #1586540 reported by Bert JW Regeer
This bug affects 1 person
Affects  Status   Importance  Assigned to     Milestone
MAAS     Invalid  Undecided   Bert JW Regeer

Bug Description

While using Juju 2.0 to deploy a bundle, we hit a snag.

Our machines have the following interface setup:

bond0: auto assign from an IPv6 address pool/IPv4 address pool
bond1: auto assign from an IPv6 address pool/IPv4 address pool
bond2: auto assign from an IPv6 address pool
enp5s0f0: PXE boot/management interface

The bonds are each set up as part of a different "fabric", and each of them has a /64 (IPv6) or /25 (IPv4) assigned to it.

When Juju deploys a bundle, it attempts to have all of our nodes (~20) go into the deploying stage at the same time, and the deployment errors out while trying to assign an IP address, as evidenced by the following traceback:

2016-05-27 06:51:19 [-] Error on request (154) machine.action:
 Traceback (most recent call last):
   File "/usr/lib/python3.5/threading.py", line 862, in run
     self._target(*self._args, **self._kwargs)
   File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 904, in worker
     return target()
   File "/usr/lib/python3/dist-packages/twisted/_threads/_threadworker.py", line 46, in work
     task()
   File "/usr/lib/python3/dist-packages/twisted/_threads/_team.py", line 190, in doWork
     task()
 --- <exception caught here> ---
   File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 246, in inContext
     result = inContext.theWork()
   File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 262, in <lambda>
     inContext.theWork = lambda: context.call(ctx, func, *args, **kw)
   File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 118, in callWithContext
     return self.currentContext().callWithContext(ctx, func, *args, **kw)
   File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 81, in callWithContext
     return func(*args,**kw)
   File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 937, in callInContext
     return func(*args, **kwargs)
   File "/usr/lib/python3/dist-packages/maasserver/utils/orm.py", line 516, in call_within_transaction
     return func_outside_txn(*args, **kwargs)
   File "/usr/lib/python3/dist-packages/maasserver/utils/orm.py", line 351, in retrier
     return func(*args, **kwargs)
   File "/usr/lib/python3.5/contextlib.py", line 30, in inner
     return func(*args, **kwds)
   File "/usr/lib/python3/dist-packages/maasserver/websockets/handlers/machine.py", line 675, in action
     return action.execute(**extra_params)
   File "/usr/lib/python3/dist-packages/maasserver/node_action.py", line 320, in execute
     self.node.start(self.user)
   File "/usr/lib/python3/dist-packages/maasserver/utils/orm.py", line 500, in call_within_transaction
     return func_within_txn(*args, **kwargs)
   File "/usr/lib/python3.5/contextlib.py", line 30, in inner
     return func(*args, **kwds)
   File "/usr/lib/python3/dist-packages/maasserver/models/node.py", line 2753, in start
     return self._start(user, user_data)
   File "/usr/lib/python3/dist-packages/maasserver/utils/orm.py", line 500, in call_within_transaction
     return func_within_txn(*args, **kwargs)
   File "/usr/lib/python3.5/contextlib.py", line 30, in inner
     return func(*args, **kwds)
   File "/usr/lib/python3/dist-packages/maasserver/models/node.py", line 2822, in _start
     self.claim_auto_ips()
   File "/usr/lib/python3/dist-packages/maasserver/models/node.py", line 2388, in claim_auto_ips
     exclude_addresses=exclude_addresses)
   File "/usr/lib/python3/dist-packages/maasserver/models/interface.py", line 894, in claim_auto_ips
     auto_ip, exclude_addresses=exclude_addresses)
   File "/usr/lib/python3/dist-packages/maasserver/models/interface.py", line 915, in _claim_auto_ip
     exclude_addresses=exclude_addresses)
   File "/usr/lib/python3/dist-packages/maasserver/models/staticipaddress.py", line 214, in allocate_new
     raise make_serialization_failure()
 django.db.utils.OperationalError:

After this, the server goes into the allocated state and doesn't finish deploying.

We did not see issues when deploying 20 nodes without the extra IP configuration on the interfaces. I believe the failure may be related to MaaS having to generate IPs for all of the machines at once, with race conditions over which machine wins which IP address.
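
To illustrate the kind of race suspected here, below is a toy Python sketch. It is not MAAS code: `ToyPool`, `SerializationFailure`, `claim_with_retry`, and `deploy_many` are hypothetical names, the pool lives in memory rather than in a database, and the 10.0.0.0/25 subnet is an assumed example. Each "node" reads the first free address without locking, then tries to claim it; a node that loses the race gets a failure and retries.

```python
import ipaddress
import threading


class SerializationFailure(Exception):
    """Hypothetical stand-in for a database serialization failure."""


class ToyPool:
    """Toy in-memory address pool; an illustration only, not MAAS code."""

    def __init__(self, cidr):
        self._hosts = list(ipaddress.ip_network(cidr).hosts())
        self._claimed = set()
        self._lock = threading.Lock()

    def first_free(self):
        # Unlocked read: several callers may see the same candidate.
        for host in self._hosts:
            if host not in self._claimed:
                return host
        raise RuntimeError("pool exhausted")

    def claim(self, host):
        # Locked write: only one caller can win a given address.
        with self._lock:
            if host in self._claimed:
                raise SerializationFailure(str(host))
            self._claimed.add(host)
        return host


def claim_with_retry(pool, attempts=50):
    """Repeat the read-then-claim step when another caller wins the race."""
    for _ in range(attempts):
        try:
            return pool.claim(pool.first_free())
        except SerializationFailure:
            continue
    raise SerializationFailure("gave up after %d attempts" % attempts)


def deploy_many(cidr, count):
    """Claim `count` addresses from `cidr`, one thread per simulated node."""
    pool = ToyPool(cidr)
    results = []
    results_lock = threading.Lock()

    def worker():
        ip = claim_with_retry(pool)
        with results_lock:
            results.append(ip)

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `deploy_many("10.0.0.0/25", 20)` hands out 20 distinct addresses in this model because every lost race is retried. If the retries ran out, the failure would reach the caller, which is the situation this toy model is meant to mimic.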

Tags: cpec juju maas2.0
Bert JW Regeer (bertjwregeer) wrote :

Scratch this, it was completely unrelated to the multiple interfaces on a single node. Filing a new bug report for the actual issue.

Changed in maas:
assignee: nobody → Bert JW Regeer (bertjwregeer)
status: New → Invalid