Compute node instances unable to reach metadata endpoint
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MicroStack | New | Undecided | Unassigned |
Bug Description
In a multinode setup, instances running on compute nodes fail with the error below. Instances scheduled on the control node are able to connect to http://
failed to read iid from metadata. tried 20
Because of this error, instances cannot retrieve their SSH key, which makes it impossible to install Juju. The instances are also harder to use because the SSH key is not injected into the system. Instances can still communicate with each other, though, so the network itself seems to be working fine.
Below are logs from an instance running on a compute node. The control node runs Pop!_OS 20.04 LTS. The compute nodes run focal and hirsute (Pop!_OS).
info: initramfs: up at 0.56
modprobe: module virtio_pci not found in modules.dep
modprobe: module virtio_blk not found in modules.dep
modprobe: module virtio_net not found in modules.dep
modprobe: module vfat not found in modules.dep
modprobe: module nls_cp437 not found in modules.dep
info: copying initramfs to /dev/vda1
info: initramfs loading root from /dev/vda1
info: /etc/init.
info: container: none
Starting logging: OK
modprobe: module virtio_pci not found in modules.dep
modprobe: module virtio_blk not found in modules.dep
modprobe: module virtio_net not found in modules.dep
modprobe: module vfat not found in modules.dep
modprobe: module nls_cp437 not found in modules.dep
WARN: /etc/rc3.
Initializing random number generator... [ 0.849150] random: dd urandom read with 17 bits of entropy available
done.
Starting acpid: OK
Starting network...
udhcpc (v1.23.2) started
Sending discover...
Sending select for 192.168.222.253...
Lease of 192.168.222.253 obtained, lease time 43200
route: SIOCADDRT: File exists
WARN: failed: route add -net "0.0.0.0/0" gw "192.168.222.1"
checking http://
failed 1/20: up 0.78. request failed
failed 2/20: up 12.78. request failed
failed 3/20: up 24.79. request failed
failed 4/20: up 36.80. request failed
failed 5/20: up 48.81. request failed
failed 6/20: up 60.82. request failed
failed 7/20: up 72.82. request failed
failed 8/20: up 84.83. request failed
failed 9/20: up 96.84. request failed
failed 10/20: up 108.85. request failed
failed 11/20: up 120.86. request failed
failed 12/20: up 132.86. request failed
failed 13/20: up 144.87. request failed
failed 14/20: up 156.88. request failed
failed 15/20: up 168.89. request failed
failed 16/20: up 180.90. request failed
failed 17/20: up 192.91. request failed
failed 18/20: up 204.91. request failed
failed 19/20: up 216.92. request failed
[ 230.442452] random: nonblocking pool is initialized
failed 20/20: up 228.93. request failed
failed to read iid from metadata. tried 20
failed to get instance-id of datasource
Top of dropbear init script
Starting dropbear sshd: failed to get instance-id of datasource
OK
GROWROOT: CHANGED: partition=1 start=18432 old: size=71647 end=90079 new: size=2078687,
=== system information ===
Platform: OpenStack Foundation OpenStack Nova
Container: none
Arch: x86_64
CPU(s): 1 @ 3593.250 MHz
Cores/Sockets/
Virt-type: AMD-V
RAM Size: 488MB
Disks:
NAME MAJ:MIN SIZE LABEL MOUNTPOINT
vda 253:0 1073741824
vda1 253:1 1064287744 cirros-rootfs /
vda15 253:15 8388608
=== sshd host keys ===
-----BEGIN SSH HOST KEY KEYS-----
ssh-rsa AAAAB3NzaC1yc2E
ssh-dss AAAAB3NzaC1kc3M
-----END SSH HOST KEY KEYS-----
=== network info ===
if-info: lo,up,127.0.0.1,8,,
if-info: eth0,up,
ip-route:default via 192.168.222.1 dev eth0
ip-route:
ip-route:
ip-route6:fe80::/64 dev eth0 metric 256
ip-route6:
ip-route6:ff00::/8 dev eth0 metric 256
ip-route6:
=== datasource: None None ===
=== cirros: current=0.4.0 uptime=241.67 ===
____ ____ ____
/ __/ __ ____ ____ / __ \/ __/
/ /__ / // __// __// /_/ /\ \
\___//_//_/ /_/ \____/___/
http://
/dev/root resized successfully [took 0.13s]
login as 'cirros' user. default password: 'gocubsgo'. use 'sudo' for root.
cirros login:
I had the same issue: instances got an IP address, but not their hostname or key pair. The error in the instance logs:
checking http://169.254.169.254/2009-04-04/instance-id
failed 1/20: up 0.78. request failed
failed 2/20: up 12.78. request failed
failed 3/20: up 24.79. request failed
...
It was solved by adding `--config-drive true` to the server creation command or, alternatively, by setting `force_config_drive = true` in the nova.conf file on the compute node.
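For reference, a minimal sketch of how the nova.conf setting mentioned above might look. It assumes the option belongs in the `[DEFAULT]` section, as in upstream Nova. The location of nova.conf in a MicroStack install is not given in this report, so no path is shown.

```ini
# nova.conf on the compute node (path depends on the deployment)
[DEFAULT]
# Attach a config drive to every instance so that metadata such as
# the instance ID and SSH keys can be read from a local disk instead
# of the metadata endpoint over the network.
force_config_drive = true
```

The nova-compute service will most likely need a restart before the change takes effect, and only instances created afterwards would be expected to receive a config drive. This works around the unreachable metadata endpoint rather than fixing it.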