azure locks existing user if instance id changes

Bug #1849677 reported by Sam Eiderman on 2019-10-24
This bug affects 1 person
Affects: cloud-init
Importance: Medium
Assigned to: Sam Eiderman

Bug Description

The same bug was actually reported by someone else as a waagent bug here:
https://github.com/Azure/WALinuxAgent/issues/454

But it was closed because the original reporter did not follow up.

Cloud Provider: Azure
VM: Ubuntu 14.04 (and probably all later versions)

When provisioning a VM on Azure, cloud-init uses /dev/sr0 to find ovf-env.xml.
Since the instance is new, cc_users_groups, which runs "per instance", adds my user to the system. The user is configured with a password (not an ssh key).

Next, cloud-init copies ovf-env.xml to /var/lib/waagent/ to be used as a cache, but in the cached copy the password is replaced with REDACTED.

Note that on subsequent boots, when cloud-init loads DataSourceAzure, it reads /var/lib/waagent/ovf-env.xml, where the password is REDACTED, and therefore treats the user as having no password:
https://github.com/cloud-init/cloud-init/commit/8af1802c9971ec1f2ebac23e9b42d5b42f43afae#diff-e0eb215db26e21dbe2d98455fea68595R601
So DataSourceAzure does not set defuser["lock_passwd"] = False. The value stays True by default, and the defuser configuration now contains a directive to lock this user account.

Usually everything works and the user never gets locked, since we keep using the same instance and cc_users_groups (a per-instance action) is never invoked again. But when the instance id does change (for example, when exporting the disks to a different machine), the user gets locked by create_user() with defuser["lock_passwd"] = True.

I guess the correct logic should have been:

    if password:
        defuser['lock_passwd'] = False
        if DEF_PASSWD_REDACTION != password:
            defuser['passwd'] = encrypt_pass(password)

In this case create_user() will be invoked, add_user() will not do anything since the user exists and no locking will occur later on in create_user().
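
A minimal sketch of the difference, assuming the current check is the single combined condition implied by the description above. Here encrypt_pass is a stand-in hash, the build_defuser_* helper names are hypothetical, and the default of True is placed in the dict only to model the behavior described in this report; none of this is copied from cloud-init.

```python
import hashlib

DEF_PASSWD_REDACTION = "REDACTED"  # value found in the cached ovf-env.xml


def encrypt_pass(password):
    # Stand-in only: NOT the hashing cloud-init actually uses.
    return "sha256$" + hashlib.sha256(password.encode()).hexdigest()


def build_defuser_current(username, password):
    # Assumed current behavior: a redacted password is treated like no password.
    defuser = {"name": username, "lock_passwd": True}
    if password and DEF_PASSWD_REDACTION != password:
        defuser["lock_passwd"] = False
        defuser["passwd"] = encrypt_pass(password)
    return defuser


def build_defuser_proposed(username, password):
    # Proposed behavior: any password, redacted or not, keeps the account unlocked.
    defuser = {"name": username, "lock_passwd": True}
    if password:
        defuser["lock_passwd"] = False
        if DEF_PASSWD_REDACTION != password:
            defuser["passwd"] = encrypt_pass(password)
    return defuser
```

With the cached value "REDACTED", the first helper leaves lock_passwd at True while the second sets it to False and still omits passwd, so the existing password is not overwritten.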


Changed in cloud-init:
status: New → Triaged
Ryan Harper (raharper) wrote :

Thanks for filing the bug.

Would you be able to run cloud-init collect-logs and attach the tarball it creates?

Alternatively, if you could provide a sanitized version of /var/log/cloud-init.log; that should be sufficient to see what's going on.

Lastly, when creating a new instance with captured disks, shouldn't cloud-init find a new ovf-xml from the attached iso versus loading the previously saved xml file? Otherwise, this would set the created user's password to the redacted value rather than the original password when you created the first instance?

Changed in cloud-init:
importance: Undecided → Medium
status: Triaged → Incomplete
Sam Eiderman (sameid) wrote :

Hi,

I will attach logs soon.

The password will never be set to REDACTED. Instead, it will be as if the user did not supply a password, and the specified user will be locked:
https://github.com/cloud-init/cloud-init/commit/8af1802c9971ec1f2ebac23e9b42d5b42f43afae#diff-e0eb215db26e21dbe2d98455fea68595R601

It is true that when using the same disks on Azure, if we attach them to a new instance, new values should be copied from /dev/sr0.

But there are two scenarios where /dev/sr0 does not exist.

1. The new instance has already booted on Azure before, and the disks were then swapped. /dev/sr0 only exists on the first boot. (This can also be simulated on Azure by manually editing the instance id file, although that is not a real-world case.)

2. The disks are exported outside Azure, /dev/sr0 does not exist, DataSourceAzure still loads and finds /var/lib/waagent/ovf-env.xml.
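
Both scenarios come down to the same fallback: when the provisioning media is missing, the cached copy is the only ovf-env.xml left to read. A rough sketch of that lookup order follows. The find_ovf_env and demo names are hypothetical, and temporary directories stand in for the mounted /dev/sr0 and for /var/lib/waagent; this is not the actual DataSourceAzure code.

```python
import os
import tempfile


def find_ovf_env(candidates):
    """Return the first directory in candidates that holds ovf-env.xml, else None."""
    for path in candidates:
        if os.path.isfile(os.path.join(path, "ovf-env.xml")):
            return path
    return None


def demo():
    # 'media' models the mounted /dev/sr0, 'cache' models /var/lib/waagent.
    with tempfile.TemporaryDirectory() as media, tempfile.TemporaryDirectory() as cache:
        with open(os.path.join(cache, "ovf-env.xml"), "w") as f:
            f.write("<Environment/>")
        # No file on the media: only the cached (redacted) copy is found.
        later_boot_uses_cache = find_ovf_env([media, cache]) == cache
        with open(os.path.join(media, "ovf-env.xml"), "w") as f:
            f.write("<Environment/>")
        # File present on the media: it wins over the cache.
        first_boot_uses_media = find_ovf_env([media, cache]) == media
    return first_boot_uses_media, later_boot_uses_cache
```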

Regarding "1" - I guess if you use Azure "correctly" as you said yourself, this should not happen to you.

Regarding "2" - This happens in Ubuntu 14 but not in Ubuntu 16 due to the following commit:
https://github.com/cloud-init/cloud-init/commit/5fb49bacf7441d8d20a7b4e0e7008ca586f5ebab
which does not allow DataSourceAzure to run outside Azure; however, this was not backported to cloud-init 0.7.5, which is available for Ubuntu.

I think that by correcting the code to:

    if password:
        defuser['lock_passwd'] = False
        if DEF_PASSWD_REDACTION != password:
            defuser['passwd'] = encrypt_pass(password)

we change the resulting configuration from:

First boot:
    defuser = {
        'name': username,
        'passwd': encrypt_pass(password),
        'lock_passwd': False
    }
Subsequent boots:
    defuser = {
        'name': username,
        'lock_passwd': True
    }

to:

First boot:
    defuser = {
        'name': username,
        'passwd': encrypt_pass(password),
        'lock_passwd': False
    }
Subsequent boots:
    defuser = {
        'name': username,
        'lock_passwd': False
    }

Sam

Ryan Harper (raharper) wrote :

Thanks for the clarification; I understand the code change now.

It seems that in create_user(), even if the user already exists (as in your disk-reuse scenario), create_user() continues to make "new" user modifications. We can't really know whether the specified user already existed in the image or comes from a previously booted instance, which means that having the Azure datasource ensure the user account isn't locked is likely the best course of action here.
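
A toy model of that flow, for illustration only. FakeDistro and its methods are hypothetical stand-ins that mirror the behavior described in this thread, not cloud-init's Distro class.

```python
class FakeDistro:
    """Toy model of the create_user() flow discussed above; not cloud-init code."""

    def __init__(self, existing_users=()):
        self.users = {name: {"locked": False} for name in existing_users}
        self.calls = []

    def add_user(self, name, **kwargs):
        if name in self.users:
            self.calls.append("add_user: skipped, user exists")
            return
        self.users[name] = {"locked": False}
        self.calls.append("add_user: created")

    def lock_passwd(self, name):
        self.users[name]["locked"] = True
        self.calls.append("lock_passwd")

    def create_user(self, name, **kwargs):
        self.add_user(name, **kwargs)
        # This step runs whether or not add_user() created anything.
        if kwargs.get("lock_passwd", True):
            self.lock_passwd(name)
```

Calling create_user("sam", lock_passwd=True) on a FakeDistro that already contains "sam" skips the creation step but still locks the account, while lock_passwd=False leaves the existing account untouched.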

Changed in cloud-init:
assignee: nobody → Sam Eiderman (sameid)

This bug is fixed with commit e1b4b8c9 to cloud-init on branch master.
To view that commit see the following URL:
https://git.launchpad.net/cloud-init/commit/?id=e1b4b8c9

Changed in cloud-init:
status: Incomplete → Fix Committed