cloud-init waits for waagent on Azure CentOS 7.4 - sshd does not start
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
cloud-init | Fix Released | Undecided | Unassigned |
cloud-init (CentOS) | Unknown | Unknown | |
Bug Description
Hello,
After updating a CentOS 7.3 VM on Azure to CentOS 7.4, you cannot connect via ssh: cloud-init tries to start waagent, the boot process hangs, and sshd is never started.
We also installed a fresh CentOS 7.4 in the Azure cloud to provide a base image template for our company, and the same thing happens in that VM.
#######
# yum info cloud-init:
Name : cloud-init
Arch : x86_64
Version : 0.7.9
Release : 9.el7.centos.2
Size : 2.1 M
Repo : installed
From repo : base
In CentOS 7.3 the cloud-init version is 0.7.5-10.
waagent is package version 2.2.14-1.el7 in both CentOS versions; waagent itself then updates internally to 2.2.17.
#######
To debug the failure I had to install rlogin before the update:
yum remove firewalld -y
yum install rsh-server -y
systemctl enable rsh.socket
systemctl enable rlogin.socket
systemctl enable rexec.socket
echo "root:123" | chpasswd
echo "+ root" > ~/.rlogin
cat << EOF >> /etc/securetty
rsh
rexec
rlogin
EOF
reboot
yum update -y
reboot
#######
To unblock the process I connected via rlogin and killed the waagent start:
# ps -ef | grep "waagent\|cloud"
root 993 1 0 14:52 ? 00:00:02 /usr/bin/python /usr/bin/cloud-init init
root 1134 993 0 14:52 ? 00:00:00 /bin/systemctl start waagent.service
root 1337 1222 0 15:56 pts/2 00:00:00 grep --color=auto waagent\|cloud
# kill 1134
Then cloud-init does its magic, and on the next reboot sshd starts without any trouble.
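The manual ps/kill step above can be scripted. What follows is only a sketch, assuming the `ps -ef` column layout shown above (PID in field 2, command line starting at field 8). The function name find_waagent_start_pid is hypothetical and is not part of cloud-init or WALinuxAgent.

```shell
#!/bin/sh
# Hypothetical helper (not part of cloud-init or WALinuxAgent).
# Reads `ps -ef`-style lines on stdin and prints the PID of every
# process whose command line is "<path>/systemctl start waagent.service".
# Assumption: standard `ps -ef` layout, PID in field 2, command from field 8.
find_waagent_start_pid() {
    awk '$8 ~ /(^|\/)systemctl$/ && $9 == "start" && $10 == "waagent.service" { print $2 }'
}
```

Run as `ps -ef | find_waagent_start_pid`, this prints only the PID of the blocked systemctl call and skips both the cloud-init parent and the grep process itself. Check the PID before passing it to kill.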
#######
To make the VM fail again, you can clear the config and reboot:
yum remove cloud-init WALinuxAgent -y
rm -f /etc/waagent.con*
rm -fr /etc/cloud/
rm -fr /var/lib/cloud/
rm -fr /var/lib/waagent/
rm -fr /var/log/
rm -fr /var/log/
yum install cloud-init WALinuxAgent -y
cp -a /etc/waagent.conf /etc/waagent.
sed -i -e "s/Provisioning
sed -i -e "s/Provisioning
sed -i -e "s/Logs.
cp -a /etc/cloud/
cat << EOF >> /etc/cloud/
# From cloud-init docs
datasource:
  Azure:
    agent_command: [service, waagent, start]
debug:
  verbose: True
EOF
diff /etc/waagent.
diff /etc/cloud/
reboot
#######
I don't know why the system hangs.
Can you please review this?
WALinuxAgent-Bug: https://github.com/Azure/WALinuxAgent/issues/902