Failed download of squashfs while enlisting a node
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
MAAS | Invalid | Undecided | Unassigned | |
Bug Description
Ubuntu version: 18.04.1 LTS
MAAS version: 2.5.0~rc2 (7433-gea48d302
Enlisting any new node hangs at:
root=squash:http://
mount_squash downloading http://
Connecting to 10.203.0.2:5248 (10.203.0.2:5248)
On the server side, strace indicates that the connection is made and the file is partially transferred before the transfer hangs and eventually times out (see attached). We've tried toggling the nginx options sendfile and tcp_nopush, but the changes have no effect.
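To reproduce the stall outside of the enlistment environment, the same HTTP download can be attempted from another host with a socket timeout, recording how many bytes arrive before the transfer stops. The following is a minimal sketch, not part of MAAS: the names `drain` and `probe` are hypothetical, and the URL has to be supplied by the caller because the one in the console output above is truncated.

```python
import urllib.request


def drain(stream, chunk_size=65536):
    """Read a file-like object to EOF.

    Returns (bytes_read, completed). completed is False when a read
    fails with an OSError, which includes socket timeouts.
    """
    total = 0
    try:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                return total, True
            total += len(chunk)
    except OSError:
        # socket.timeout is a subclass of OSError: the transfer stalled.
        return total, False


def probe(url, timeout=30):
    """Download url and report how far the transfer got.

    The timeout applies to each blocking socket operation, so a stalled
    transfer is reported instead of hanging indefinitely.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return drain(response)
```

Calling `probe()` with the squashfs URL from a second machine on the same subnet would show whether the partial transfer is specific to the booting client or also happens for an ordinary HTTP client.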
$ dpkg -l '*maas*'|cat
Desired=
| Status=
|/ Err?=(none)
||/ Name Version Architecture Description
+++-===
ii maas 2.5.0~rc2-
ii maas-cli 2.5.0~rc2-
un maas-cluster-
ii maas-common 2.5.0~rc2-
ii maas-dhcp 2.5.0~rc2-
un maas-dns <none> <none> (no description available)
ii maas-proxy 2.5.0~rc2-
ii maas-rack-
ii maas-region-api 2.5.0~rc2-
ii maas-region-
un maas-region-
un python-django-maas <none> <none> (no description available)
un python-maas-client <none> <none> (no description available)
un python-
ii python3-django-maas 2.5.0~rc2-
ii python3-maas-client 2.5.0~rc2-
ii python3-
Changed in maas: | |
milestone: | 2.5.1 → 2.5.2 |
Changed in maas: | |
milestone: | 2.5.2 → none |
Hi Adam,
Thank you for your bug report. Could you please provide a bit more information on your environment?
1. How is this MAAS configured? Is it a single region/rack, or one region with multiple racks? How was it installed? Was this an upgrade?
2. Which machine is 10.203.0.5, and which is 10.203.0.2?
3. Which client machine is failing? Do you happen to have its MAC/IP so we can look for it in the logs?
4. Can you provide a console log of the booting process?
5. Do all machines fail? Is it just one?
6. Lastly, is there any specific MTU configuration on the underlying VLAN? Is there anything else relevant about the networking? For example, the underlying network itself could be the blocker.
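For the MTU question, one quick way to collect the configured values on the rack controller is to read them from sysfs. This is only a sketch under the assumption of a Linux host exposing `/sys/class/net`; the function names are hypothetical, and it reports the locally configured interface MTU only, not the path MTU between the rack and the booting machine.

```python
from pathlib import Path


def interface_mtus(sysfs_root="/sys/class/net"):
    """Return {interface_name: mtu} for every interface under sysfs_root."""
    mtus = {}
    for iface in sorted(Path(sysfs_root).iterdir()):
        mtu_file = iface / "mtu"
        if mtu_file.is_file():
            mtus[iface.name] = int(mtu_file.read_text().strip())
    return mtus


def unexpected_mtus(mtus, expected=1500, ignore=("lo",)):
    """Return the interfaces whose MTU differs from the expected value."""
    return {
        name: mtu
        for name, mtu in mtus.items()
        if name not in ignore and mtu != expected
    }


if __name__ == "__main__":
    for name, mtu in unexpected_mtus(interface_mtus()).items():
        print(f"{name}: mtu {mtu}")
```

An interface that shows up here with a jumbo-frame MTU while the switch or the client side uses 1500 would be one candidate explanation for a transfer that starts and then stalls.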
And from looking at the logs, the only weird thing I see is this:
2018-12-06 18:30:23 provisioningserver.rackdservices.tftp: [info] ubuntu/amd64/ga-18.04/bionic/daily/boot-kernel requested by 10.203.0.218
2018-12-06 18:30:24 provisioningserver.rackdservices.tftp: [info] ubuntu/amd64/ga-18.04/bionic/daily/boot-initrd requested by 10.203.0.218
2018-12-06 18:30:25 twisted.internet.defer: [critical] Unhandled error in Deferred:
2018-12-06 18:30:25 twisted.internet.defer: [critical]
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/twisted/internet/asyncioreactor.py", line 267, in run
    self._asyncioEventloop.run_forever()
  File "/usr/lib/python3/dist-packages/twisted/internet/asyncioreactor.py", line 290, in run
    f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 459, in callback
    self._startRunCallbacks(result)
  File "/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 567, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/lib/python3/dist-packages/twisted/internet/task.py", line 865, in <lambda>
    d.addCallback(lambda ignored: callable(*args, **kw))
  File "/usr/lib/python3/dist-packages/tftp/session.py", line 274, in sendData
    self.transport.write(bytes)
  File "/usr/lib/python3/dist-packages/twisted/internet/udp.py", line 269, in write
    return self.socket.send(datagram)
builtins.BlockingIOError: [Errno 11] Resource temporarily unavailable
2018-12-06 18:30:48 provisioningser ver.rackdservic es.http: [info] /images/ ubuntu/ amd64/ga- 18.04/bionic/ daily/squashfs requested by 10.203.0.218 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\00\ 00\00\002018- 12-06 18:34:49 twisted.scripts: [info] twistd 17.9.0 (/usr/bin/python3 3.6.7) starting up. internet. asyncioreactor. AsyncioSelector Reactor. ver.utils. services: [info] Starting beaconing for interfaces: {'ens7'}
\00\00\
2018-12-06 18:34:49 twisted.scripts: [info] reactor class: twisted.
2018-12-06 18:34:49 provisioningser
Which would apparently show that something crashed and then the rackd was restar...
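To check whether critical errors and rackd restarts line up in time, the log can be scanned for `[critical]` entries and for the `twistd ... starting up` line. The sketch below assumes the `timestamp logger: [level] message` layout seen in the excerpt above; the function name is hypothetical. It searches within each line rather than anchoring at the start, because the restart line in this log is preceded by a run of NUL bytes.

```python
import re

# timestamp, logger name, level, and the rest of the message
ENTRY = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\S+): \[(\w+)\]\s?(.*)"
)


def interesting_events(lines):
    """Return (timestamp, kind, logger) for critical entries and restarts."""
    events = []
    for line in lines:
        match = ENTRY.search(line)
        if match is None:
            # Traceback lines and other continuation lines have no header.
            continue
        stamp, logger, level, message = match.groups()
        if level == "critical":
            events.append((stamp, "critical", logger))
        elif logger == "twisted.scripts" and "starting up" in message:
            events.append((stamp, "restart", logger))
    return events
```

Run against the rackd log, a `restart` event that follows a `critical` event would support the reading that the service went down and was started again, although the gap of several minutes between them in this excerpt leaves open what happened in between.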