cloud-init waits for waagent on Azure CentOS 7.4 - sshd does not start

Bug #1720160 reported by Maik Stübner on 2017-09-28
This bug affects 1 person
Affects              Status   Importance  Assigned to  Milestone
cloud-init                    Undecided   Unassigned
cloud-init (CentOS)  Unknown  Unknown

Bug Description

Hello,
after updating a CentOS 7.3 VM on Azure to CentOS 7.4, you cannot connect via ssh: cloud-init tries to start waagent and the boot process hangs, so sshd is never started.

We also installed a fresh CentOS 7.4 in the Azure cloud to provide a base image template for our company, and the same problem happens in that VM.

#######
# yum info cloud-init:
Name : cloud-init
Arch : x86_64
Version : 0.7.9
Release : 9.el7.centos.2
Size : 2.1 M
Repo : installed
From repo : base

In CentOS 7.3 the cloud-init version is 0.7.5-10.el7.centos.1.
The waagent package version is 2.2.14-1.el7 in both CentOS versions, which waagent itself updates internally to 2.2.17.

#######
To debug the failure I had to install rlogin before the update:

yum remove firewalld -y
yum install rsh-server -y
systemctl enable rsh.socket
systemctl enable rlogin.socket
systemctl enable rexec.socket
echo "root:123" | chpasswd
echo "+ root" > ~/.rlogin
cat << EOF >> /etc/securetty
rsh
rexec
rlogin
EOF

reboot

yum update -y
reboot

#######
To unblock the process I connected via rlogin and killed the waagent start:

# ps -ef | grep "waagent\|cloud"
root 993 1 0 14:52 ? 00:00:02 /usr/bin/python /usr/bin/cloud-init init
root 1134 993 0 14:52 ? 00:00:00 /bin/systemctl start waagent.service
root 1337 1222 0 15:56 pts/2 00:00:00 grep --color=auto waagent\|cloud

# kill 1134

Then cloud-init does its magic, and on the next reboot sshd starts without any trouble.
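
The manual lookup above can be scripted. This is only a minimal sketch, assuming the `ps -ef` column layout shown in the listing (command starting in the eighth field); the helper name `find_waagent_start_pid` is made up for illustration and is not part of cloud-init or waagent.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` style output on stdin and print the PID
# of the "systemctl start waagent.service" process.
# Assumes the columns UID PID PPID C STIME TTY TIME CMD, as in the listing above.
find_waagent_start_pid() {
    awk '$8 ~ /(^|\/)systemctl$/ && $9 == "start" && $10 == "waagent.service" { print $2 }'
}

# Usage on the affected VM (as root, from the rlogin session):
#   pid=$(ps -ef | find_waagent_start_pid)
#   [ -n "$pid" ] && kill "$pid"
```

Matching on separate fields avoids the grep process matching itself, which the `grep "waagent\|cloud"` call above does.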

#######
To make the VM fail again you can clear the config and reboot:
yum remove cloud-init WALinuxAgent -y
rm -f /etc/waagent.con*
rm -fr /etc/cloud/
rm -fr /var/lib/cloud/
rm -fr /var/lib/waagent/
rm -fr /var/log/waagent.lo*
rm -fr /var/log/cloud-init*
yum install cloud-init WALinuxAgent -y

cp -a /etc/waagent.conf /etc/waagent.conf.rpmsave
sed -i -e "s/Provisioning.Enabled.*/Provisioning.Enabled=n/g" /etc/waagent.conf
sed -i -e "s/Provisioning.UseCloudInit.*/Provisioning.UseCloudInit=y/g" /etc/waagent.conf
sed -i -e "s/Logs.Verbose.*/Logs.Verbose=y/g" /etc/waagent.conf

cp -a /etc/cloud/cloud.cfg /etc/cloud/cloud.cfg.rpmsave
cat << EOF >> /etc/cloud/cloud.cfg

# From cloud-init docs
datasource:
  Azure:
    agent_command: [service, waagent, start]

debug:
  verbose: True

EOF

diff /etc/waagent.conf.rpmsave /etc/waagent.conf
diff /etc/cloud/cloud.cfg.rpmsave /etc/cloud/cloud.cfg

reboot

#######
I don't know why the system hangs.
Can you please review this?
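
One way to see what systemd is still waiting on while the boot is stuck is to list its queued jobs from the rlogin session set up above. This is a sketch only, assuming `systemctl list-jobs` prints its usual JOB / UNIT / TYPE / STATE columns; the helper name `waiting_jobs` is hypothetical, and the output was not captured for this report.

```shell
#!/bin/sh
# Hypothetical helper: read `systemctl list-jobs` output on stdin and print
# the unit name and job type of every job whose state is "waiting".
# Header and summary lines are skipped because their first field is not a job id.
waiting_jobs() {
    awk '$1 ~ /^[0-9]+$/ && $4 == "waiting" { print $2, $3 }'
}

# Usage while the boot hangs (as root):
#   systemctl list-jobs --no-pager | waiting_jobs
```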

Maik Stübner (mstuebner) wrote :

Hello,
after trying WALinuxAgent 2.2.18 with the following config, it works for me. cloud-init finds Azure as the datasource, and WALinuxAgent starts and reports a running state for the VM to Azure. So the ticket can be closed.
/etc/waagent.conf
    Provisioning.Enabled=n

/etc/cloud/cloud.cfg
    -> unchanged

systemctl enable waagent
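
The working configuration above could be applied with a small script. This is a minimal sketch, assuming GNU sed (as shipped with CentOS) and the `key=value` layout of /etc/waagent.conf; the helper name `set_conf_key` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a waagent.conf style file.
# An existing uncommented "KEY=..." line is rewritten; otherwise the
# setting is appended. Dots in KEY are treated as regex wildcards, which is
# harmless for the option names used here.
set_conf_key() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file"; then
        sed -i -e "s/^${key}=.*/${key}=${value}/" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage on the VM (as root):
#   cp -a /etc/waagent.conf /etc/waagent.conf.rpmsave
#   set_conf_key /etc/waagent.conf Provisioning.Enabled n
#   systemctl enable waagent
```

Unlike the unanchored sed calls in the reproduction steps, this only touches lines that begin with the key, so commented-out defaults stay as they are.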

Scott Moser (smoser) wrote :

Closing per comment from Maik.

If you find this is not working, please feel free to re-open.

Scott

Changed in cloud-init:
status: New → Fix Released