Our MAAS environment is version 2.6.1.
Here is some context on the environment to help any engineers trying to reproduce this bug.
The environment was first brought up with full egress Internet connectivity. All machines in the node pool were initially commissioned and deployed using the default externally hosted Ubuntu repos at us.archive.ubuntu.com.
Months later, a project required this lab to be air-gapped. All egress connectivity to the Internet was blocked.
We then realized that commissioning and deploying machines are tightly coupled to the package repository, which we did not know until egress was turned off.
During our various tests to get an internal repo properly mirrored, I stumbled into a bug where machines that failed deployment and were put into rescue mode would not easily exit rescue mode. At first, I deleted the affected machines (two of them so far) to get them out of rescue mode, until I learned a couple of tricks for exiting rescue mode.
So now I have two machines that need to be added to the node pool again. This is where things get hairy. I should mention that we now have a fully replicated apt repo in our environment, an identical rsynced replica of archive.ubuntu.com.
Currently:
* MAAS is set up to use the internal apt repo, an identical replica of archive.ubuntu.com
* MAAS 2.6.1
* existing commissioned machines are deployed and released continually with no errors
* only two machines need to be re-added to MAAS, and
* those two machines fail to commission 100% of the time.
If I were to offer an observation or two on what the root cause might be:
1. either apt update is not run in the commissioning stage before apt install, or
2. the failed call to api.snapcraft.io is volatile and we need to mirror api.snapcraft.io in our air-gapped environment.
Snippet of what I've been able to capture for the failure:
{code}stateengine.go:102: state ensure error: Get https://api.snapcraft.io/api/v1/snaps/sections: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers){code}
and possibly
{code}2019-10-25T22:38:12+00:00 maas-enlisting-node cloud-init[2128]: Generating locales (this might take a while)...
2019-10-25T22:38:13+00:00 maas-enlisting-node cloud-init[2128]: en_US.UTF-8... done
2019-10-25T22:38:13+00:00 maas-enlisting-node cloud-init[2128]: Generation complete.
2019-10-25T22:38:13+00:00 maas-enlisting-node cloud-init[2128]: Cloud-init v. 18.5-45-g3554ffe8-0ubuntu1~16.04.1 running 'modules:config' at Fri, 25 Oct 2019 22:38:12 +0000. Up 64.78 seconds.
2019-10-25T22:38:13+00:00 maas-enlisting-node systemd[1]: Started Apply the settings specified in cloud-config.
2019-10-25T22:38:13+00:00 maas-enlisting-node systemd[1]: Starting Execute cloud user/final scripts...
2019-10-25T22:38:14+00:00 maas-enlisting-node cloud-init[2202]: Hit:1 http://archive.ubuntu.com/ubuntu xenial InRelease
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: Reading package lists...
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: Reading package lists...
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: Building dependency tree...
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: Reading state information...
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: Package ipmitool is not available, but is referred to by another package.
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: This may mean that the package is missing, has been obsoleted, or
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: is only available from another source
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: E: Package 'ipmitool' has no installation candidate
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: E: Unable to locate package sshpass
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: E: Unable to locate package jq
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: Cloud-init v. 18.5-45-g3554ffe8-0ubuntu1~16.04.1 running 'modules:final' at Fri, 25 Oct 2019 22:38:14 +0000. Up 66.65 seconds.
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: 2019-10-25 22:38:15,360 - util.py[WARNING]: Failed to install packages: ['python3-yaml', 'python3-oauthlib', 'freeipmi-tools', 'ipmitool', 'sshpass', 'archdetect-deb', 'jq']
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: 2019-10-25 22:38:15,370 - cc_package_update_upgrade_install.py[WARNING]: 1 failed with exceptions, re-raising the last one
2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: 2019-10-25 22:38:15,371 - util.py[WARNING]: Running module package-update-upgrade-install (<module 'cloudinit.config.cc_package_up{code}
which led to:
{code}2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: Traceback (most recent call last):
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: File "/tmp/user_data.sh.at67mX/bin/maas-ipmi-autodetect-tool", line 57, in <module>
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: main()
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: File "/tmp/user_data.sh.at67mX/bin/maas-ipmi-autodetect-tool", line 50, in main
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: if is_host_moonshot():
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: File "/tmp/user_data.sh.at67mX/bin/maas-ipmi-autodetect-tool", line 36, in is_host_moonshot
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: output = subprocess.check_output(['ipmitool', 'raw', '06', '01'])
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: File "/usr/lib/python3.5/subprocess.py", line 626, in check_output
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: **kwargs).stdout
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: File "/usr/lib/python3.5/subprocess.py", line 693, in run
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: with Popen(*popenargs, **kwargs) as process:
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: File "/usr/lib/python3.5/subprocess.py", line 947, in __init__
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: restore_signals, start_new_session)
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: File "/usr/lib/python3.5/subprocess.py", line 1551, in _execute_child
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: raise child_exception_type(errno_num, err_msg)
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: FileNotFoundError: [Errno 2] No such file or directory: 'ipmitool'
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: % Total % Received % Xferd Average Speed Time Time Time Current
2019-10-25T22:38:16+00:00 maas-enlisting-node cloud-init[2202]: Dload Upload Total Spent Left Speed
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: #015 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0#015100 988 100 638 100 350 2334 1280 --:--:-- --:--:-- --:--:-- 2336
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: /var/lib/cloud/instance/scripts/user_data.sh: line 192: jq: command not found
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: /var/lib/cloud/instance/scripts/user_data.sh: line 200: jq: command not found
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: Traceback (most recent call last):
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: File "/tmp/user_data.sh.at67mX/bin/maas-signal", line 105, in <module>
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: main()
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: File "/tmp/user_data.sh.at67mX/bin/maas-signal", line 84, in main
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: read_config(args.config, creds)
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: File "/tmp/user_data.sh.at67mX/bin/maas_api_helper.py", line 196, in read_config
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: if 'datasource' in cfg:
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: TypeError: argument of type 'NoneType' is not iterable
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: Traceback (most recent call last):
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: File "/tmp/user_data.sh.at67mX/bin/maas-run-remote-scripts", line 790, in <module>
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: main()
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: File "/tmp/user_data.sh.at67mX/bin/maas-run-remote-scripts", line 736, in main
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: read_config(args.config, creds)
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: File "/tmp/user_data.sh.at67mX/bin/maas_api_helper.py", line 196, in read_config
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: if 'datasource' in cfg:
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: TypeError: argument of type 'NoneType' is not iterable
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: 2019-10-25 22:38:17,402 - util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/user_data.sh [1]
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: 2019-10-25 22:38:17,403 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
2019-10-25T22:38:17+00:00 maas-enlisting-node cloud-init[2202]: 2019-10-25 22:38:17,404 - util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_scripts_user.py'>) failed
{code}
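For anyone triaging a similar log, the package list in the util.py "Failed to install packages" warning can be pulled out programmatically, for example to check each name against the internal mirror's package indexes. This is only a minimal Python sketch, assuming the warning keeps the format seen in the log above; `failed_packages` is a hypothetical helper name, not part of MAAS or cloud-init.

```python
import ast
import re

# Captures the Python-style list that cloud-init prints after the warning text.
FAILED = re.compile(r"Failed to install packages: (\[.*\])")


def failed_packages(line):
    """Return the package names listed in a cloud-init install warning.

    Returns an empty list when the line does not contain the warning.
    """
    match = FAILED.search(line)
    if not match:
        return []
    # The list is printed as a Python literal, so literal_eval parses it safely.
    return ast.literal_eval(match.group(1))


# Sample line copied from the log in this report.
sample_line = (
    "2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: "
    "2019-10-25 22:38:15,360 - util.py[WARNING]: Failed to install packages: "
    "['python3-yaml', 'python3-oauthlib', 'freeipmi-tools', 'ipmitool', "
    "'sshpass', 'archdetect-deb', 'jq']"
)

for name in failed_packages(sample_line):
    print(name)
```

Each printed name can then be looked up in the mirror by whatever means is available locally.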
Hi, how did you configure MAAS to point to your mirror?
Looking at the logs, the machines being deployed are still trying to reach archive.ubuntu.com.
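One quick way to confirm this from a captured log is to list the hosts that apt contacted. Below is a rough Python sketch, assuming apt's usual Hit/Get/Ign/Err progress lines as they appear in the log above; `apt_hosts` is a hypothetical helper name, not a MAAS tool.

```python
import re
from urllib.parse import urlparse

# Matches apt progress lines such as
# "Hit:1 http://archive.ubuntu.com/ubuntu xenial InRelease".
APT_LINE = re.compile(r"\b(?:Hit|Get|Ign|Err):\d+\s+(\S+)")


def apt_hosts(log_lines):
    """Return the set of hostnames found in apt progress lines."""
    hosts = set()
    for line in log_lines:
        match = APT_LINE.search(line)
        if match:
            host = urlparse(match.group(1)).hostname
            if host:
                hosts.add(host)
    return hosts


# Sample lines copied from the log in this report.
sample = [
    "2019-10-25T22:38:14+00:00 maas-enlisting-node cloud-init[2202]: "
    "Hit:1 http://archive.ubuntu.com/ubuntu xenial InRelease",
    "2019-10-25T22:38:15+00:00 maas-enlisting-node cloud-init[2202]: "
    "Reading package lists...",
]

print(apt_hosts(sample))
```

If the internal mirror's hostname is missing from the result (and archive.ubuntu.com is not being resolved to the mirror internally), the enlisting node is probably not picking up the mirror setting.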