Failed to start Initial cloud-init job (pre-networking)

Bug #1531880 reported by jean-christophe manciot
This bug affects 16 people
Affects: cloud-init
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

Linux Ubuntu-Gnome-Server 4.2.0-22-generic #27-Ubuntu SMP Thu Dec 17 22:57:08 UTC 2015 x86_64

Cloud-init 0.7.7

systemctl -l status cloud-init-local
● cloud-init-local.service - Initial cloud-init job (pre-networking)
   Loaded: loaded (/lib/systemd/system/cloud-init-local.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Thu 2016-01-07 15:34:33 CET; 6min ago
  Process: 961 ExecStart=/usr/bin/cloud-init init --local (code=exited, status=1/FAILURE)
 Main PID: 961 (code=exited, status=1/FAILURE)

Jan 07 15:34:33 Ubuntu-Gnome-Server cloud-init[961]: File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1547, in del_file
Jan 07 15:34:33 Ubuntu-Gnome-Server cloud-init[961]: raise e
Jan 07 15:34:33 Ubuntu-Gnome-Server cloud-init[961]: File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1544, in del_file
Jan 07 15:34:33 Ubuntu-Gnome-Server cloud-init[961]: os.unlink(path)
Jan 07 15:34:33 Ubuntu-Gnome-Server cloud-init[961]: IsADirectoryError: [Errno 21] Is a directory: '/var/lib/cloud/instance'
Jan 07 15:34:33 Ubuntu-Gnome-Server cloud-init[961]: ------------------------------------------------------------
Jan 07 15:34:33 Ubuntu-Gnome-Server systemd[1]: cloud-init-local.service: Main process exited, code=exited, status=1/FAILURE
Jan 07 15:34:33 Ubuntu-Gnome-Server systemd[1]: Failed to start Initial cloud-init job (pre-networking).
Jan 07 15:34:33 Ubuntu-Gnome-Server systemd[1]: cloud-init-local.service: Unit entered failed state.
Jan 07 15:34:33 Ubuntu-Gnome-Server systemd[1]: cloud-init-local.service: Failed with result 'exit-code'.

Tags: landscape
Pav Smirnoff (itwise-net) wrote :

Similar problem:

2016-06-15 13:37:43,762 - util.py[WARNING]: failed of stage init
failed run of stage init
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/bin/cloud-init", line 520, in status_wrapper
    ret = functor(name, args)
  File "/usr/bin/cloud-init", line 250, in main_init
    init.fetch(existing=existing)
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 322, in fetch
    return self._get_data_source(existing=existing)
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 241, in _get_data_source
    util.del_file(self.paths.instance_link)
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1567, in del_file
    raise e
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1564, in del_file
    os.unlink(path)
IsADirectoryError: [Errno 21] Is a directory: '/var/lib/cloud/instance'
------------------------------------------------------------

GrzegorzKoper (grzegorz-koper) wrote :

Hey guys,
I have the same problem with Mitaka OpenStack and the Xenial image from Ubuntu:

[ 6.209014] cloud-init[509]: Cloud-init v. 0.7.7 running 'init-local' at Wed, 06 Jul 2016 09:46:57 +0000. Up 6.09 seconds.
[ 6.213503] cloud-init[509]: 2016-07-06 09:46:58,013 - util.py[WARNING]: failed of stage init-local
[ 6.224405] cloud-init[509]: failed run of stage init-local
[ 6.225373] cloud-init[509]: ------------------------------------------------------------
[ 6.232045] cloud-init[509]: Traceback (most recent call last):
[ 6.232995] cloud-init[509]: File "/usr/bin/cloud-init", line 520, in status_wrapper
[ 6.236208] cloud-init[509]: ret = functor(name, args)
[ 6.237103] cloud-init[509]: File "/usr/bin/cloud-init", line 250, in main_init
[ 6.240202] cloud-init[509]: init.fetch(existing=existing)
[ 6.244195] cloud-init[509]: File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 322, in fetch
[ 6.248193] cloud-init[509]: return self._get_data_source(existing=existing)
[ 6.249240] cloud-init[509]: File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 241, in _get_data_source
[ 6.252194] cloud-init[509]: util.del_file(self.paths.instance_link)
[ 6.260191] cloud-init[509]: File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1567, in del_file
[ 6.261333] cloud-init[509]: raise e
[ 6.262113] cloud-init[509]: File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1564, in del_file
[ 6.264204] cloud-init[509]: os.unlink(path)
[ 6.268191] cloud-init[509]: IsADirectoryError: [Errno 21] Is a directory: '/var/lib/cloud/instance'
[ 6.272195] cloud-init[509]: ------------------------------------------------------------
[FAILED] Failed to start Initial cloud-init job (pre-networking).
See 'systemctl status cloud-init-local.service' for details.

Reshma (reshmaprathap) wrote :

I am also getting the same error. I deleted the directory /var/lib/cloud/instance in the image and re-deployed, but still no luck: it again created the same directory instead of a symlink. Has anyone solved this?

krath (thorsten-krause) wrote :

What worked for me was switching to the qcow2 cloud images from Ubuntu.
Also, cloud-init is very sensitive to the YAML file being correct.
This one-liner helped me identify potentially broken cloud-init configuration files:
https://liquidat.wordpress.com/2016/01/21/short-tip-verify-yaml-in-shell-via-python-one-liner/
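
For anyone who prefers a small script over a one-liner, here is a minimal Python sketch of the same kind of check. It assumes the file is cloud-config user-data (which is expected to begin with a `#cloud-config` line) and that PyYAML may or may not be installed. `check_cloud_config` is a hypothetical helper name, not part of cloud-init, and this is not the one-liner from the linked post.

```python
def check_cloud_config(text):
    """Return (problems, syntax_checked) for cloud-config user-data text.

    Hypothetical helper for illustration; not part of cloud-init.
    """
    problems = []
    lines = text.splitlines()
    # cloud-config user-data is expected to start with this header line.
    if not lines or lines[0].strip() != "#cloud-config":
        problems.append("first line is not '#cloud-config'")
    try:
        import yaml  # PyYAML, an optional third-party package
    except ImportError:
        # Without PyYAML only the header check above is performed.
        return problems, False
    try:
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        problems.append("invalid YAML: %s" % exc)
    return problems, True


# Example usage with an inline document:
# problems, checked = check_cloud_config("#cloud-config\npackages:\n  - htop\n")
```

The second return value says whether the YAML syntax was actually parsed, so a clean result without PyYAML installed is not mistaken for a full check.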

David Britton (dpb) wrote :

Hit this as well. I think I have repro steps for you:

1) Shut down the metadata service in OpenStack
2) Boot a new xenial instance
3) cloud-init will fail to contact the metadata service to get the instance-id
4) Turn the metadata service back on
5) Hard reboot the instance
6) On the next boot you will hit this error.

tags: added: landscape
Scott Moser (smoser) wrote :

If you hit this bug and are able to do so, please
attach /var/log/cloud-init.log, /var/log/cloud-init-output.log,
and a tarball of /var/lib/cloud (tar -C /var/lib -cf /tmp/var-lib-cloud.tar cloud).
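
If `tar` is not convenient, the same archive can be produced with a rough Python equivalent. This is a sketch using the standard `tarfile` module; `archive_dir` is a hypothetical helper name. By default `tarfile` stores symlinks as symlinks rather than following them, which matters here because the question is whether `instance` is a symlink or a directory.

```python
import os
import tarfile


def archive_dir(parent, name, out_path):
    """Roughly equivalent to: tar -C parent -cf out_path name"""
    with tarfile.open(out_path, "w") as tar:
        # arcname keeps the paths inside the archive relative to `parent`.
        tar.add(os.path.join(parent, name), arcname=name)
    return out_path


# Example (needs read access to /var/lib/cloud, typically as root):
# archive_dir("/var/lib", "cloud", "/tmp/var-lib-cloud.tar")
```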

Alex Le (emoc1989) wrote :

Same issue here.

cloud-init.log

Apr 26 00:25:33 [CLOUDINIT] util.py[DEBUG]: Cloud-init v. 0.7.8 running 'init-local' at Wed, 26 Apr 2017 00:25:33 +0000. Up 6.65 seconds.
Apr 26 00:25:33 [CLOUDINIT] util.py[DEBUG]: Writing to /var/log/cloud-init.log - ab: [420] 0 bytes
Apr 26 00:25:33 [CLOUDINIT] util.py[DEBUG]: Changing the ownership of /var/log/cloud-init.log to 104:4
Apr 26 00:25:33 [CLOUDINIT] util.py[DEBUG]: Attempting to remove /var/lib/cloud/instance/boot-finished
Apr 26 00:25:33 [CLOUDINIT] util.py[DEBUG]: Attempting to remove /var/lib/cloud/data/no-net
Apr 26 00:25:33 [CLOUDINIT] handlers.py[DEBUG]: start: init-local/check-cache: attempting to read from cache [check]
Apr 26 00:25:33 [CLOUDINIT] util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
Apr 26 00:25:33 [CLOUDINIT] stages.py[DEBUG]: no cache found
Apr 26 00:25:33 [CLOUDINIT] handlers.py[DEBUG]: finish: init-local/check-cache: SUCCESS: no cache found
Apr 26 00:25:33 [CLOUDINIT] util.py[DEBUG]: Attempting to remove /var/lib/cloud/instance
Apr 26 00:25:33 [CLOUDINIT] util.py[WARNING]: failed stage init-local
Apr 26 00:25:33 [CLOUDINIT] util.py[DEBUG]: failed stage init-local#012Traceback (most recent call last):#012 File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 521, in status_wrapper#012 ret = functor(name, args)#012 File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 247, in main_init#012 init.fetch(existing=existing)#012 File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 358, in fetch#012 return self._get_data_source(existing=existing)#012 File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 259, in _get_data_source#012 util.del_file(self.paths.instance_link)#012 File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1660, in del_file#012 raise e#012 File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1657, in del_file#012 os.unlink(path)#012IsADirectoryError: [Errno 21] Is a directory: '/var/lib/cloud/instance'
Apr 26 00:25:33 [CLOUDINIT] util.py[DEBUG]: Reading from /proc/uptime (quiet=False)
Apr 26 00:25:33 [CLOUDINIT] util.py[DEBUG]: Read 10 bytes from /proc/uptime
Apr 26 00:25:33 [CLOUDINIT] util.py[DEBUG]: cloud-init mode 'init' took 0.084 seconds (0.08)
Apr 26 00:25:33 [CLOUDINIT] handlers.py[DEBUG]: finish: init-local: SUCCESS: searching for local datasources

cloud-init-output.log

------------------------------------------------------------
Cloud-init v. 0.7.8 running 'init-local' at Wed, 26 Apr 2017 00:25:33 +0000. Up 6.65 seconds.
2017-04-26 00:25:33,158 - util.py[WARNING]: failed stage init-local
failed run of stage init-local
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 521, in status_wrapper
    ret = functor(name, args)
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 247, in main_init
    init.fetch(existing=existing)
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 358, in fetch
    return self._get_data_source(existing=existing)
  File "/usr/lib/python3/dist-packages/c...

Charles (bityard) wrote :

I saw this happen on a new AWS instance, created just a couple weeks ago. I started looking into this because a second EBS volume wasn't getting mounted on boot even though it's listed in /etc/fstab. The AMI ID is "ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-20170721 (ami-cd0f5cb6)".

The traceback I get is identical to those posted above. For some reason, something is creating /var/lib/cloud/instance as a directory and the cloud-init python code can't deal with that.

My workaround is to downgrade the cloud-init package. The faulty package version is "0.7.9-233-ge586fe35-0ubuntu1~16.04.1". Another very similar instance has version "0.7.9-153-g16a7302f-0ubuntu1~16.04.2" installed and that one works fine; however, I can't find it on any mirrors. The closest I could find was cloud-init_0.7.9-153-g16a7302f-0ubuntu1~16.10.1_all.deb. Once this is installed, there are no more tracebacks and all volumes are mounted.

Scott G. Miller (sgmiller) wrote :

I'm hitting this too, Xenial ami-58045e3d.

Attaching requested info...

Scott G. Miller (sgmiller) wrote :

In my case, it breaks some other systemd services we run which transitively depend on cloud-config completing.

valdemar pavesi (valdemarpavesi) wrote :

Hello,

We have a similar problem.

[ 20.005134] cloud-init[842]: Cloud-init v. 0.7.6 running 'init-local' at Wed, 14 Nov 2018 18:01:14 +0000. Up 19.50 seconds.

RRIP_1991A
[  OK  ] Started Initial cloud-init job (pre-networking).
Starting Initial cloud-init job (metadata service crawler)...
[ 21.072371] cloud-init[938]: Cloud-init v. 0.7.6 running 'init' at Wed, 14 Nov 2018 18:01:16 +0000. Up 21.05 seconds.

[ 21.088164] cloud-init[938]: 2018-11-14 18:01:16,259 - util.py[WARNING]: Route info failed: Unexpected error while running command.

[ 21.089474] cloud-init[938]: Command: ['netstat', '-rn']
[ 21.090296] cloud-init[938]: Exit code: 1
[ 21.090972] cloud-init[938]: Reason: -
[ 21.091765] cloud-init[938]: Stdout: 'Kernel IP routing table\nDestination Gateway Genmask Flags MSS Window irtt Iface\n'
[ 21.093049] cloud-init[938]: Stderr: ''
[ 21.093865] cloud-init[938]: ci-info: +++++++++++++++++Net device info++++++++++++++++++
[ 21.094892] cloud-init[938]: ci-info: ----------------------------------------
[ 21.095891] cloud-init[938]: ci-info: | Device | Up | Address | Mask | Hw-Address |
[ 21.096915] cloud-init[938]: ci-info: ----------------------------------------
[ 21.097876] cloud-init[938]: ci-info: | lo: | True | 127.0.0.1 | 255.0.0.0 | . |
[ 21.098836] cloud-init[938]: ci-info: ----------------------------------------
[ 21.099758] cloud-init[938]: ci-info: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!Route info failed!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

+++++

[ 21.088164] cloud-init[938]: 2018-11-14 18:01:16,259 - util.py[WARNING]: Route info failed: Unexpected error while running command.

++

/usr/lib/python2.7/site-packages/cloudinit/util.py

regards!
Valdemar

StevenZeng (stevenzeng) wrote :

I solved the problem. Linux was starting the services in the sequence cloud-config, cloud-final, cloud-init, cloud-init-local. While the cloud-final service was running, it created a directory named instance, which caused cloud-init to fail.
  You must correct the order of the cloud-init services, e.g. cloud-init, cloud-init-local, cloud-config, cloud-final.

Dan Watkins (oddbloke)
Changed in cloud-init:
status: New → Confirmed
rngadam (rngadam) wrote :

Encountered the same problem. What helped me work around the issue was reinitializing cloud-init as per the docs:

https://cloudinit.readthedocs.io/en/latest/topics/faq.html#faq

How can I re-run datasource detection and cloud-init?
If a user is developing a new datasource or working on debugging an issue it may be useful to re-run datasource detection and the initial setup of cloud-init.

To do this, force ds-identify to re-run, clean up any logs, and re-run cloud-init:

$ sudo DI_LOG=stderr /usr/lib/cloud-init/ds-identify --force
$ sudo cloud-init clean --logs
$ sudo cloud-init init --local
$ sudo cloud-init init

Ob Rzwo (obr2pd) wrote :

Same here, on Ubuntu 20.04 after sudo apt update && sudo apt dist-upgrade (some Python packages, IIRC).

The solution was to reboot three times (the first two attempts hung during boot) and run `sudo rmdir /var/lib/cloud/instance`.
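
Before removing anything, it can help to confirm what is actually at that path. The following is a minimal Python sketch, assuming the default /var/lib/cloud location seen in the tracebacks in this report; `describe_instance_path` is a hypothetical helper name, not a cloud-init API.

```python
import os


def describe_instance_path(path="/var/lib/cloud/instance"):
    """Report what kind of filesystem entry `path` is."""
    if os.path.islink(path):
        # The healthy state: a symlink pointing at the per-instance directory.
        return "symlink -> " + os.readlink(path)
    if os.path.isdir(path):
        # A real directory here is the state that triggers IsADirectoryError.
        return "directory (%d entries)" % len(os.listdir(path))
    if os.path.exists(path):
        return "other"
    return "missing"


# Example usage on an affected machine:
# print(describe_instance_path())
```

Note that `rmdir` only succeeds on an empty directory, so the entry count indicates whether the `rmdir` workaround can work as-is.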

/var/log/cloud-init.log
<snip>
2020-09-03 16:16:08,090 - util.py[DEBUG]: Cloud-init v. 20.2-45-g5f7825e2-0ubuntu1~20.04.1 running 'init' at Thu, 03 Sep 2020 16:16:08 +0000. Up 9.14 seconds.
2020-09-03 16:16:08,091 - main.py[DEBUG]: No kernel command line url found.
2020-09-03 16:16:08,091 - main.py[DEBUG]: Closing stdin.
2020-09-03 16:16:08,099 - util.py[DEBUG]: Writing to /var/log/cloud-init.log - ab: [644] 0 bytes
2020-09-03 16:16:08,101 - util.py[DEBUG]: Changing the ownership of /var/log/cloud-init.log to 104:4
2020-09-03 16:16:08,152 - util.py[DEBUG]: Reading from /etc/os-release (quiet=False)
2020-09-03 16:16:08,152 - util.py[DEBUG]: Read 382 bytes from /etc/os-release
2020-09-03 16:16:08,155 - util.py[DEBUG]: Running command ['ip', 'addr', 'show'] with allowed return codes [0] (shell=False, capture=True)
2020-09-03 16:16:08,175 - util.py[DEBUG]: Running command ['ip', '-o', 'route', 'list'] with allowed return codes [0] (shell=False, capture=True)
2020-09-03 16:16:08,184 - util.py[DEBUG]: Running command ['ip', '--oneline', '-6', 'route', 'list', 'table', 'all'] with allowed return codes [0, 1] (shell=False, capture=True)
2020-09-03 16:16:08,193 - main.py[DEBUG]: Checking to see if files that we need already exist from a previous run that would allow us to stop early.
2020-09-03 16:16:08,194 - main.py[DEBUG]: Execution continuing, no previous run detected that would allow us to stop early.
2020-09-03 16:16:08,194 - handlers.py[DEBUG]: start: init-network/check-cache: attempting to read from cache [trust]
2020-09-03 16:16:08,195 - util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
2020-09-03 16:16:08,195 - stages.py[DEBUG]: no cache found
2020-09-03 16:16:08,195 - handlers.py[DEBUG]: finish: init-network/check-cache: SUCCESS: no cache found
2020-09-03 16:16:08,196 - util.py[DEBUG]: Attempting to remove /var/lib/cloud/instance
2020-09-03 16:16:08,197 - util.py[WARNING]: failed stage init
2020-09-03 16:16:08,197 - util.py[DEBUG]: failed stage init
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 653, in status_wrapper
    ret = functor(name, args)
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 323, in main_init
    init.fetch(existing=existing)
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 350, in fetch
    return self._get_data_source(existing=existing)
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 251, in _get_data_source
    util.del_file(self.paths.instance_link)
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1853, in del_file
    raise e
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1850, in del_file
    os.unlink(path)
IsADirectoryError: [Errno 21] Is a directory: '/var/lib/cloud/instance'
2020-09-03 16:16:08,205 - atomic_helper.py[DEBUG]: Atomically writing to file /var/lib/cloud/data/status.json (via temporary file /var/lib/cloud/data/tmpxajv6_y2)...

Ob Rzwo (obr2pd) wrote :

Oops, that does not work. Every third reboot or so, it doesn't boot up. The solution from rngadam helped! However, MobaXterm now warns that the host key has changed.

Paride Legovini (paride) wrote :

Hi,

If you are still able to reproduce the issue, could you please run:

  cloud-init collect-logs

after the failure and attach the generated tarball to this bug report? Those logs will help us better understand what's going on. It's very likely that the failure is triggered by the metadata service being unavailable, but still cloud-init should handle it better.

Thanks!

Dan Watkins (oddbloke) wrote :

Hey folks, I believe that this has been fixed in cloud-init 20.3 (specifically this commit: https://github.com/canonical/cloud-init/commit/0755cff078d5931e1d8e151bdcb84afb92bc0f02) so I'm going to move this to Fix Released.

If you are seeing this error on an earlier version of cloud-init, it generally indicates that cloud-init failed on a _previous_ boot (because before that fix, a failed boot meant it would incorrectly create a directory where a symlink should have been, leading to the `IsADirectoryError` when the old not-actually-a-symlink is `rm`d on subsequent boots). If you can identify the cause of this earlier failure, please feel free to file a bug for it!

Thanks,

Dan
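
The failure mode described in the comment above can be recreated in a scratch directory. The Python sketch below is an illustration only and is not the change made in the linked commit: it leaves a directory where the `instance` symlink should be, shows that `os.unlink` refuses to remove it, and then uses a more tolerant removal before creating the symlink. The names `remove_link_or_empty_dir` and `simulate`, and the instance id `i-hypothetical`, are made up for this example.

```python
import os
import tempfile


def remove_link_or_empty_dir(path):
    """Remove `path` whether it is a symlink, a file, or an empty directory."""
    if os.path.islink(path) or os.path.isfile(path):
        os.unlink(path)
    elif os.path.isdir(path):
        os.rmdir(path)  # only succeeds if the directory is empty


def simulate(base):
    """Recreate the failure under `base`; return (error from unlink, link path)."""
    link = os.path.join(base, "instance")
    target = os.path.join(base, "instances", "i-hypothetical")
    os.makedirs(target)
    os.mkdir(link)  # leftover of the "failed" boot: a directory, not a symlink
    error = None
    try:
        os.unlink(link)  # the call shown at the bottom of the tracebacks
    except OSError as exc:  # IsADirectoryError on Linux
        error = exc
    remove_link_or_empty_dir(link)
    os.symlink(target, link)  # the state a successful boot is expected to leave
    return error, link


with tempfile.TemporaryDirectory() as tmp:
    err, link = simulate(tmp)
    print(type(err).__name__, os.path.islink(link))
```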

Changed in cloud-init:
status: Confirmed → Fix Released