cloud-final.service: "cloud-init modules --mode final" exit with "KeyError: 'modules-init'" after upgrade to version 18.2

Bug #1815109 reported by Antonio Romito
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
cloud-init
Fix Released
Medium
Antonio Romito

Bug Description

Description of problem:

After the upgrade of cloud-init to version 18.2 cloud-final.service do not start due to the following error and the service remains in not running state

-----
# service cloud-final status
Redirecting to /bin/systemctl status cloud-final.service
● cloud-final.service - Execute cloud user/final scripts
   Loaded: loaded (/usr/lib/systemd/system/cloud-final.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2019-02-01 13:14:31 CET; 28min ago
  Process: 21927 ExecStart=/usr/bin/cloud-init modules --mode=final (code=exited, status=1/FAILURE)
 Main PID: 21927 (code=exited, status=1/FAILURE)
-----

Version-Release number of selected component (if applicable):

Red Hat Enterprise Linux Server release 7.6 (Maipo)
cloud-init-18.2-1.el7_6.1.x86_64

How reproducible:

Steps to Reproduce:
1. [root@rhvm ~]# cloud-init modules --mode=final

Actual results:

[root@rhvm ~]# cloud-init modules --mode final
Cloud-init v. 18.2 running 'modules:final' at Wed, 06 Feb 2019 20:00:14 +0000. Up 10634.29 seconds.
Cloud-init v. 18.2 finished at Wed, 06 Feb 2019 20:00:15 +0000. Datasource DataSourceNoCloud [seed=/dev/sr0][dsmode=net]. Up 10634.40 seconds
Traceback (most recent call last):
  File "/usr/bin/cloud-init", line 9, in <module>
    load_entry_point('cloud-init==18.2', 'console_scripts', 'cloud-init')()
  File "/usr/lib/python2.7/site-packages/cloudinit/cmd/main.py", line 882, in main
    get_uptime=True, func=functor, args=(name, args))
  File "/usr/lib/python2.7/site-packages/cloudinit/util.py", line 2388, in log_time
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/cloudinit/cmd/main.py", line 679, in status_wrapper
    if v1[m]['errors']:
KeyError: 'modules-init'

Expected results:

[root@rhvm ~]# cloud-init modules --mode final
Cloud-init v. 18.2 running 'modules:final' at Wed, 06 Feb 2019 19:41:50 +0000. Up 9530.23 seconds.
Cloud-init v. 18.2 finished at Wed, 06 Feb 2019 19:41:50 +0000. Datasource DataSourceNoCloud [seed=/dev/sr0][dsmode=net]. Up 9530.34 seconds

Additional info:

This problem do not happens with previous cloud-init version:

cloud-init.x86_64 0:0.7.9-24.el7_5.1 will be updated

Related branches

summary: - cloud-init modules --mode final exit with "KeyError: 'modules-init'"
- after upgrade to version 18.2
+ cloud-final.service: cloud-init modules --mode final exit with
+ "KeyError: 'modules-init'" after upgrade to version 18.2
summary: - cloud-final.service: cloud-init modules --mode final exit with
+ cloud-final.service: "cloud-init modules --mode final" exit with
"KeyError: 'modules-init'" after upgrade to version 18.2
Revision history for this message
Ryan Harper (raharper) wrote :

Could you attach /var/log/cloud-init.log of the failure?

Changed in cloud-init:
status: New → Incomplete
Revision history for this message
Antonio Romito (antonioromito) wrote :

Hi Ryan, sorry if I didn't provided all the information to the case.

Attached you can find the /var/lg/cloud-init.log file.

As you can see from the log the package has been upgraded from version 0.7.9 to 18.2 between 2018-09-11 and 2018-11-16:

~~~
2018-09-11 16:39:32,351 - util.py[DEBUG]: Cloud-init v. 0.7.9 finished at Tue, 11 Sep 2018 20:39:32 +0000. Datasource DataSourceNoCloud [seed=/dev/sr0][dsmode=net]. Up 24.44 seconds
...
2018-11-16 17:41:35,920 - util.py[DEBUG]: Cloud-init v. 18.2 finished at Fri, 16 Nov 2018 17:41:35 +0000. Datasource DataSourceNoCloud [seed=/dev/sr0][dsmode=net]. Up 26.62 seconds
...
~~~

Furthermore as you can see from the log the "modules:init" mode has never been executed. For this reason into the file:

/var/lib/cloud/data/status.json

the json path related to "modules-init" is missing.

When the cloud-final.service execute the command "/usr/bin/cloud-init modules --mode=final" and the main.py script goes into the if statement:

~~~
...
    if mode == "modules-final":
        # write the 'finished' file
        errors = []
        for m in modes:
            if m in v1:
                if v1[m]['errors']:
                    errors.extend(v1[m].get('errors', []))
...
~~~

the for loop cycle all the "modes" ('init', 'init-local', 'modules-init', 'modules-config', 'modules-final').
When it arrives to the case where m == modules-init:

~~~
...
if v1["modules-init"]['errors']:
...
~~~

it fails with "KeyError: 'modules-init'" because the key "modules-init" does not exist in v1 array.

This is because the v1 array is populated with the data present into "status.json" file.

Revision history for this message
Antonio Romito (antonioromito) wrote :

Probably I found the reason:

In this peace of code where is checked if status variable is populated or not, in case it is it's not checked if all modes keys are present into the array, and in case a not set as "null".

~~~
...
    if status is None:
        status = {'v1': {}}
        for m in modes:
            status['v1'][m] = nullstatus.copy()
        status['v1']['datasource'] = None
    else:
        for m in modes:
            if m not in status['v1']:
               status['v1'][m] = nullstatus.copy()
...
~~~

I proposed the following fix (tested)

# diff /usr/lib/python2.7/site-packages/cloudinit/cmd/main.py.orig /usr/lib/python2.7/site-packages/cloudinit/cmd/main.py
643,644c643,646
< elif mode not in status['v1']:
< status['v1'][mode] = nullstatus.copy()
---
> else:
> for m in modes:
> if m not in status['v1']:
> status['v1'][m] = nullstatus.copy()

Revision history for this message
Antonio Romito (antonioromito) wrote :

Sorry original code in code #3 is:

~~~
...
    if status is None:
        status = {'v1': {}}
        for m in modes:
            status['v1'][m] = nullstatus.copy()
        status['v1']['datasource'] = None
    elif mode not in status['v1']:
        status['v1'][mode] = nullstatus.copy()
...
~~~

that should be modified in:

~~~
...
    if status is None:
        status = {'v1': {}}
        for m in modes:
            status['v1'][m] = nullstatus.copy()
        status['v1']['datasource'] = None
    else:
        for m in modes:
            if m not in status['v1']:
               status['v1'][m] = nullstatus.copy()
...
~~~

Changed in cloud-init:
status: Incomplete → New
Chad Smith (chad.smith)
Changed in cloud-init:
status: New → In Progress
importance: Undecided → Medium
Revision history for this message
Antonio Romito (antonioromito) wrote :

As per last suggestion of Dan Watkins I merged my last fix. Do you need some more information? Any other task to perform on my side?

Thanks,
Antonio

Revision history for this message
Server Team CI bot (server-team-bot) wrote :

This bug is fixed with commit bdd9c0ac to cloud-init on branch master.
To view that commit see the following URL:
https://git.launchpad.net/cloud-init/commit/?id=bdd9c0ac

Changed in cloud-init:
status: In Progress → Fix Committed
Revision history for this message
Chad Smith (chad.smith) wrote : Fixed in cloud-init version 19.1.

This bug is believed to be fixed in cloud-init in version 19.1. If this is still a problem for you, please make a comment and set the state back to New

Thank you.

Changed in cloud-init:
status: Fix Committed → Fix Released
Changed in cloud-init:
assignee: nobody → Antonio Romito (aromito)
Revision history for this message
James Falcon (falcojr) wrote :
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.