Grenade jobs failing due to failed ping to the test instance

Bug #2007357 reported by Slawek Kaplonski
This bug affects 1 person
Affects: neutron | Status: Confirmed | Importance: High | Assigned to: Unassigned | Milestone: (not set)
yatin (yatinkarel) wrote :
Changed in neutron:
importance: Critical → High
yatin (yatinkarel) wrote :
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :
yatin (yatinkarel) wrote :

Pushed https://review.opendev.org/c/openstack/grenade/+/874417 to collect server console log if ping fails.

yatin (yatinkarel) wrote :

OK, it reproduced and we now have console logs:
https://79513b8e59d721f7ba79-3211a8aa52f1a73b6f8e95a0078f36dc.ssl.cf1.rackcdn.com/874232/1/gate/neutron-ovs-grenade-multinode/82d0db0/controller/logs/grenade.sh_log.txt

So based on this:
- Server ACTIVE at 13:49:02.205
- It waited 30 seconds for ping to succeed and timed out at 13:49:39.975
- From the console log, the DHCP and metadata requests succeeded; the public-key fetch failed (expected, as no keypair was passed):
    udhcpc: lease of 10.1.1.31 obtained, lease time 86400
    successful after 1/20 tries: up 39.70. iid=i-0000000b
    failed to get http://169.254.169.254/2009-04-04/meta-data/public-keys
- From the neutron DHCP log: 13:49:41.463183 np0033217712 dnsmasq-dhcp[95819]: DHCPACK(tap080dcbda-71) 10.1.1.31 fa:16:3e:31:64:be nova-server1

So it seems the guest VM is just slow to boot, and increasing the timeout from 30 seconds to 40+ seconds should help. I can check a few more failures to see whether it's always the same issue or a different one.
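
For reference, the timeline above can be checked with a small calculation. This is only a sketch: the three timestamps are copied from this comment, while the helper name `seconds_between` is made up for illustration and all timestamps are assumed to fall on the same day.

```python
from datetime import datetime

# Timestamp layout used in the logs quoted above (time of day only).
FMT = "%H:%M:%S.%f"


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two same-day HH:MM:SS.fraction timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Values copied from the comment above.
server_active = "13:49:02.205"
ping_timed_out = "13:49:39.975"
dhcpack_logged = "13:49:41.463183"

# Time from ACTIVE until the ping check gave up.
print(round(seconds_between(server_active, ping_timed_out), 3))   # 37.77
# How long after the ping timeout the DHCPACK was logged.
print(round(seconds_between(ping_timed_out, dhcpack_logged), 3))  # 1.488
```

So in this run the DHCPACK was logged roughly 1.5 seconds after the ping check had already given up, which is consistent with a slow guest boot rather than a networking failure.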

yatin (yatinkarel) wrote :

Checked a few more failures:
- [1][2][3] all show the same issue as comment #5 (info: /etc/init.d/rc.sysinit: up at 33.36)
info: /etc/init.d/rc.sysinit: up at 40.04
info: /etc/init.d/rc.sysinit: up at 40.58
info: /etc/init.d/rc.sysinit: up at 35.83
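
As a rough illustration, the `up at` values above can be compared against candidate ping timeouts. This is a sketch only: the function name `sysinit_uptimes` is hypothetical, and guest uptime is counted from guest boot rather than from the moment the server went ACTIVE, so it is only an approximate proxy for how long the ping check would have to wait.

```python
import re

# Matches the cirros console lines quoted above and captures the uptime value.
UP_RE = re.compile(r"rc\.sysinit: up at (\d+\.\d+)")


def sysinit_uptimes(console_text: str) -> list:
    """Return the guest uptimes (seconds) reported by rc.sysinit lines."""
    return [float(value) for value in UP_RE.findall(console_text)]


# Lines copied from the comment above.
console_lines = """\
info: /etc/init.d/rc.sysinit: up at 33.36
info: /etc/init.d/rc.sysinit: up at 40.04
info: /etc/init.d/rc.sysinit: up at 40.58
info: /etc/init.d/rc.sysinit: up at 35.83
"""

uptimes = sysinit_uptimes(console_lines)
for timeout in (30, 40, 60):
    # Count guests that only reached rc.sysinit after the timeout.
    late = [u for u in uptimes if u > timeout]
    print(timeout, len(late))
# prints:
# 30 4
# 40 2
# 60 0
```

On these samples a 40 second timeout would still be marginal for two of the four values, while 60 seconds leaves headroom.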

[4] is slightly different: server ACTIVE at 15:42:16.096 and ping timed out at 15:42:53.729
info: /etc/init.d/rc.sysinit: up at 14.80
udhcpc: sending select for 10.1.3.219

No more messages after the line above in the console log, which was captured at 15:42:56.518, even though the DHCPACK was logged more than 15 seconds earlier:

Feb 22 15:42:38.531241 np0033220059 dnsmasq-dhcp[93999]: DHCPACK(tap060de802-ca) 10.1.3.219 fa:16:3e:2c:6d:b1 nova-server1

[1] https://0fe40d3c8f5f65d37e16-935e188dfa3a1a49945ee8787d93930a.ssl.cf5.rackcdn.com/871104/6/check/manila-grenade/f4ff59f/controller/logs/grenade.sh_log.txt
[2] https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_9d5/874811/1/check/grenade/9d5bd12/controller/logs/grenade.sh_log.txt
[3] https://81b633e2c5fe858f8400-d324a81a71d524d51ede3dc5aee27774.ssl.cf5.rackcdn.com/850499/21/check/grenade-skip-level/c068cb6/controller/logs/grenade.sh_log.txt
[4] https://cf3dfb14062b562daedb-2ff8f9467c39881ef83876a3ff9f514b.ssl.cf1.rackcdn.com/873699/4/check/neutron-ovs-grenade-multinode/c5f681f/controller/logs/grenade.sh_log.txt

yatin (yatinkarel) wrote :

Pushed https://review.opendev.org/c/openstack/grenade/+/874822 to bump ping timeout to 60 seconds.
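
To illustrate why the larger timeout helps, here is a minimal sketch of a poll-until-reachable loop. It is not the grenade implementation (grenade is shell); `wait_for_ping`, `FakeClock`, `demo`, and the 40 second boot time are all hypothetical, chosen to mirror the guest uptimes seen in the console logs. A real probe would run something like a single ping against the instance IP.

```python
import time


def wait_for_ping(probe, timeout, interval=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Call probe() every `interval` seconds until it returns True
    or `timeout` seconds have elapsed. Returns True on success."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


class FakeClock:
    """Simulated clock so the demo runs instantly (illustration only)."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


def demo(timeout, reachable_at=40.0):
    """Simulate a guest that starts answering ping after `reachable_at` seconds."""
    fake = FakeClock()
    return wait_for_ping(lambda: fake.now >= reachable_at, timeout,
                         clock=fake.time, sleep=fake.sleep)


print(demo(30))  # False: the guest is not up yet when the check gives up
print(demo(60))  # True: the guest answers at 40 s, well inside the timeout
```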
