MAAS UI reports "Deployment Failed" when deploying Windows Hyper-V from MAAS

Bug #1424846 reported by Sean Feole
This bug affects 1 person
Affects         Status   Importance  Assigned to  Milestone
MAAS            Invalid  Undecided   Unassigned
cloudbase-init  Invalid  Undecided   Unassigned

Bug Description

The Hyperscale team has started to deploy Windows images on MAAS while working on some enablement projects.

When a node configured to boot Windows is started, the MAAS UI never updates the node state from "Deploying" to "Deployed".

After a few minutes the end user can successfully RDP into the Windows host and configure the password. The MAAS UI eventually reports the node as "Failed Deployment". This does not reflect the true state of the system, since it is accessible over the network.

The MAAS version we are using is 1.8.0~alpha4+bzr3532.

Tags: hyperscale
Sean Feole (sfeole)
Changed in maas:
status: New → Confirmed
tags: added: hyperscale
Jason Hobbs (jason-hobbs) wrote :

Have a look at the cloud-init logs on the system. It may be that Windows booted but cloud-init didn't successfully finish running and setting things up.

Blake Rouse (blake-rouse) wrote :

Jason is correct. Please provide the cloudbase-init logs so we can determine why the Windows node was unable to contact the MAAS server after deployment.

I assume the Windows machine's hostname was also incorrect or not set?

The files are located in:

C:\Program Files (x86)\Cloudbase Solutions\Cloudbase-Init\logs

Changed in maas:
status: Confirmed → Incomplete
Sean Feole (sfeole) wrote :

Here is the output from the cloudbase-init unattended log:

2015-02-24 17:25:38.907 2384 DEBUG cloudbaseinit.utils.classloader [-] Loading class 'cloudbaseinit.osutils.windows.WindowsUtils' load_class C:\Program Files (x86)\Cloudbase Solutions\Cloudbase-Init\Python27\lib\site-packages\cloudbaseinit\utils\classloader.py:27
2015-02-24 17:25:39.716 2384 DEBUG cloudbaseinit.utils.classloader [-] Loading class 'cloudbaseinit.metadata.services.maasservice.MaaSHttpService' load_class C:\Program Files (x86)\Cloudbase Solutions\Cloudbase-Init\Python27\lib\site-packages\cloudbaseinit\utils\classloader.py:27
2015-02-24 17:25:39.815 2384 DEBUG cloudbaseinit.metadata.services.maasservice [-] Getting metadata from: http://10.229.32.26/MAAS/metadata/2012-03-01/meta-data/ _get_data C:\Program Files (x86)\Cloudbase Solutions\Cloudbase-Init\Python27\lib\site-packages\cloudbaseinit\metadata\services\maasservice.py:101
2015-02-24 17:25:39.861 2384 ERROR cloudbaseinit.metadata.services.maasservice [-] <urlopen error [Errno 10051] A socket operation was attempted to an unreachable network>
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice Traceback (most recent call last):
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice File "C:\Program Files (x86)\Cloudbase Solutions\Cloudbase-Init\Python27\lib\site-packages\cloudbaseinit\metadata\services\maasservice.py", line 61, in load
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice self._get_data('%s/meta-data/' % self._metadata_version)
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice File "C:\Program Files (x86)\Cloudbase Solutions\Cloudbase-Init\Python27\lib\site-packages\cloudbaseinit\metadata\services\maasservice.py", line 103, in _get_data
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice response = self._get_response(req)
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice File "C:\Program Files (x86)\Cloudbase Solutions\Cloudbase-Init\Python27\lib\site-packages\cloudbaseinit\metadata\services\maasservice.py", line 71, in _get_response
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice return request.urlopen(req)
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice File "C:\Program Files (x86)\Cloudbase Solutions\Cloudbase-Init\Python27\lib\urllib2.py", line 126, in urlopen
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice return _opener.open(url, data, timeout)
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice File "C:\Program Files (x86)\Cloudbase Solutions\Cloudbase-Init\Python27\lib\urllib2.py", line 400, in open
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice response = self._open(req, data)
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice File "C:\Program Files (x86)\Cloudbase Solutions\Cloudbase-Init\Python27\lib\urllib2.py", line 418, in _open
2015-02-24 17:25:39.861 2384 TRACE cloudbaseinit.metadata.services.maasservice '_open', req)
2015-02-...


Sean Feole (sfeole) wrote :

The same output appears to be in cloud-init.log

Changed in maas:
status: Incomplete → Confirmed
Sean Feole (sfeole) wrote :

Some more information:

By the time I viewed this log, the system had been sitting idle for at least 8 minutes. I was then able to manually open a Python prompt on the Windows host and run:

import urllib2
urllib2.urlopen("http://10.229.32.26/MAAS/metadata/2012-03-01/meta-data/")

This time we had a 401 Unauthorized message appear across the console.

After discussing this with a few of the Juju folks: could it be that the cloudbase-init service is coming up before the networking services have started, which is essentially what causes

"<urlopen error [Errno 10051] A socket operation was attempted to an unreachable network>"

Just a thought?

Blake Rouse (blake-rouse) wrote :

After some debugging, it looks like cloudbase-init is coming up before networking has completely come up on Windows. Restarting the cloudbase-init service on Windows allows it to complete successfully.
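
The race described here (the service starting before the network is usable) can be illustrated with a small reachability probe. This is only a sketch under stated assumptions, not cloudbase-init code: `wait_for_tcp` is a hypothetical helper, the timeout and interval values are arbitrary, and the host and port are simply taken from the metadata URL in the log above.

```python
import socket
import time


def wait_for_tcp(host, port, timeout=60.0, interval=2.0,
                 connect=socket.create_connection,
                 clock=time.monotonic, sleep=time.sleep):
    """Poll until a TCP connection to (host, port) succeeds.

    Hypothetical helper for illustration only. Returns True once a
    connection succeeds, or False if `timeout` seconds pass first.
    `connect`, `clock` and `sleep` are injectable to ease testing.
    """
    deadline = clock() + timeout
    while True:
        try:
            conn = connect((host, port), timeout=3.0)
        except OSError:
            # Covers "unreachable network" (WinError 10051) among others.
            if clock() >= deadline:
                return False
            sleep(interval)
        else:
            conn.close()
            return True


if __name__ == "__main__":
    # Address taken from the metadata URL in the log; port 80 is assumed.
    print(wait_for_tcp("10.229.32.26", 80, timeout=10.0))
```

A probe like this only shows that the route exists; the metadata request itself can still fail for other reasons, such as the 401 seen in the manual test.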

Gabriel Samfira (gabriel-samfira) wrote :

Just a shot in the dark here, but can you run the following command in curtin's finalize:

hwclock --systohc --utc

and see if this gets fixed?

Horacio Durán (hduran-8) wrote :

I just submitted a patch for review to cloudbase-init: https://review.openstack.org/#/c/158889/

Gabriel Samfira (gabriel-samfira) wrote :

This issue has been fixed. See comments in above merge request.

Changed in maas:
status: Confirmed → Fix Released
Changed in cloudbase-init:
status: New → Invalid
Changed in maas:
status: Fix Released → Invalid
Thawngzapum Lian (thawngv) wrote :

In which version of cloudbase-init was this issue fixed?
The source code on GitHub does not seem to contain the fix proposed in https://review.openstack.org/#/c/158889/

Adrian Vladu (avladu) wrote :

Hello,

The fix has been released as a configurable retry for every http metadata service.

There are two config options to tweak the MAAS metadata retry:
# These are the defaults
retry_count = 5
retry_count_interval = 4

Make sure though that the network drivers are properly installed on your Windows images.
Let me know if you need additional information.

Thanks,
Adrian Vladu
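
For readers unfamiliar with count/interval retries, the following sketch shows roughly what the two options quoted above control. It is an illustration, not the cloudbase-init implementation: `fetch_with_retry` is a hypothetical function, and the exact semantics here (retries counted after the first attempt, interval in seconds) are assumptions.

```python
import time


def fetch_with_retry(fetch, retry_count=5, retry_count_interval=4,
                     sleep=time.sleep):
    """Call `fetch()` and retry on network errors.

    Illustrative only: allows up to `retry_count` retries after the
    first attempt, waiting `retry_count_interval` seconds between
    attempts, then re-raises the last error.
    """
    retries = 0
    while True:
        try:
            return fetch()
        except OSError:
            # urllib.error.URLError is a subclass of OSError in Python 3.
            if retries >= retry_count:
                raise
            retries += 1
            sleep(retry_count_interval)
```

With the defaults shown, a metadata service that becomes reachable within about 20 seconds of the first attempt would be picked up without restarting the service.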
