Comment 13 for bug 1680167

Oliver Walsh (owalsh) wrote :

Hmmm, not 100% sure the infrastructure is the culprit...

Looking at the logs here:
http://logs.openstack.org/56/471956/11/check/gate-tripleo-ci-centos-7-containers-multinode-upgrades-nv/2258e1b/

traceroute succeeds when the job begins:
http://logs.openstack.org/56/471956/11/check/gate-tripleo-ci-centos-7-containers-multinode-upgrades-nv/2258e1b/console.html#_2017-06-28_13_06_29_028535
2017-06-28 13:06:29.028535 | traceroute to git.openstack.org (104.130.246.128), 30 hops max, 60 byte packets
2017-06-28 13:06:34.035102 | 1 15.184.64.1 9.740 ms 0.747 ms 0.718 ms
...

but ping fails from the controller around 1 hour later: http://logs.openstack.org/56/471956/11/check/gate-tripleo-ci-centos-7-containers-multinode-upgrades-nv/2258e1b/logs/subnode-2/var/log/messages.txt.gz#_Jun_28_14_54_25

Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:54:25,155] (heat-config) [INFO] {"deploy_stdout": "Trying to ping 192.168.24.10 for local network 192.168.24.0/24.\nPing to 192.168.24.10 succeeded.\nSUCCESS\nTrying to ping default gateway 15.184.64.1...Ping to 15.184.64.1 failed. Retrying...\nPing to 15.184.64.1 failed. Retrying...\nPing to 15.184.64.1 failed. Retrying...\nPing to 15.184.64.1 failed. Retrying...\nPing to 15.184.64.1 failed. Retrying...\nPing to 15.184.64.1 failed. Retrying...\nPing to 15.184.64.1 failed. Retrying...\nPing to 15.184.64.1 failed. Retrying...\nPing to 15.184.64.1 failed. Retrying...\nPing to 15.184.64.1 failed. Retrying...\nFAILURE\n15.184.64.1 is not pingable.\n", "deploy_stderr": "", "deploy_status_code": 1}
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:54:25,158] (heat-config) [DEBUG] [2017-06-28 14:04:24,048] (heat-config) [INFO] ping_test_ips=192.168.24.10 192.168.24.10 192.168.24.10 192.168.24.10 192.168.24.10 192.168.24.10
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:04:24,048] (heat-config) [INFO] validate_fqdn=False
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:04:24,048] (heat-config) [INFO] deploy_server_id=2012629a-13ca-43e2-9a4f-2818ecc705d9
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:04:24,048] (heat-config) [INFO] deploy_action=CREATE
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:04:24,049] (heat-config) [INFO] deploy_stack_id=overcloud-ControllerAllNodesValidationDeployment-sjlbjpxkkq3z/fb669aab-6a19-4f56-b156-3c6352b4a928
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:04:24,049] (heat-config) [INFO] deploy_resource_name=0
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:04:24,049] (heat-config) [INFO] deploy_signal_transport=CFN_SIGNAL
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:04:24,049] (heat-config) [INFO] deploy_signal_id=http://192.168.24.1:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3A1a6779e490614d8f8aaee69b08761314%3Astacks%2Fovercloud-ControllerAllNodesValidationDeployment-sjlbjpxkkq3z%2Ffb669aab-6a19-4f56-b156-3c6352b4a928%2Fresources%2F0?Timestamp=2017-06-28T14%3A04%3A18Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=df9ee8c74b564c4986e751999f78e24d&SignatureVersion=2&Signature=tGkZxInloRVRy1Gu%2BPdA9jb%2FrEUc43y8yLKO1l%2BnXLs%3D
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:04:24,049] (heat-config) [INFO] deploy_signal_verb=POST
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:04:24,049] (heat-config) [DEBUG] Running /var/lib/heat-config/heat-config-script/f109f3ec-80e5-40a2-aa28-2be7541fe69f
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:54:25,145] (heat-config) [INFO] Trying to ping 192.168.24.10 for local network 192.168.24.0/24.
Jun 28 14:54:25 localhost os-collect-config: Ping to 192.168.24.10 succeeded.
Jun 28 14:54:25 localhost os-collect-config: SUCCESS
Jun 28 14:54:25 localhost os-collect-config: Trying to ping default gateway 15.184.64.1...Ping to 15.184.64.1 failed. Retrying...
Jun 28 14:54:25 localhost os-collect-config: Ping to 15.184.64.1 failed. Retrying...
Jun 28 14:54:25 localhost os-collect-config: Ping to 15.184.64.1 failed. Retrying...
Jun 28 14:54:25 localhost os-collect-config: Ping to 15.184.64.1 failed. Retrying...
Jun 28 14:54:25 localhost os-collect-config: Ping to 15.184.64.1 failed. Retrying...
Jun 28 14:54:25 localhost os-collect-config: Ping to 15.184.64.1 failed. Retrying...
Jun 28 14:54:25 localhost os-collect-config: Ping to 15.184.64.1 failed. Retrying...
Jun 28 14:54:25 localhost os-collect-config: Ping to 15.184.64.1 failed. Retrying...
Jun 28 14:54:25 localhost os-collect-config: Ping to 15.184.64.1 failed. Retrying...
Jun 28 14:54:25 localhost os-collect-config: Ping to 15.184.64.1 failed. Retrying...
Jun 28 14:54:25 localhost os-collect-config: FAILURE
Jun 28 14:54:25 localhost os-collect-config: 15.184.64.1 is not pingable.
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:54:25,146] (heat-config) [DEBUG]
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:54:25,146] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-script/f109f3ec-80e5-40a2-aa28-2be7541fe69f. [1]
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:54:25,158] (heat-config) [INFO] Completed /usr/libexec/heat-config/hooks/script
Jun 28 14:54:25 localhost os-collect-config: [2017-06-28 14:54:25,159] (heat-config) [DEBUG] Running heat-config-notify /var/lib/heat-config/deployed/f109f3ec-80e5-40a2-aa28-2be7541fe69f.json < /var/lib/heat-config/deployed/f109f3ec-80e5-40a2-aa28-2be7541fe69f.notify.json

Looks like ping -w is failing quickly rather than waiting out its deadline. The ping docs say it will do that when it gets an error notification from the network.
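One way to check the "fails quickly" theory would be to time each ping attempt and compare it against the -w deadline. The following is only a rough sketch, not the actual validation script: the function names, the 10 second deadline, and the "less than half the deadline" threshold are all assumptions for illustration. The 10 retries match the number of "Retrying..." lines in the log above.

```python
import subprocess
import time


def ping_with_retries(host, retries=10, deadline=10,
                      run=subprocess.call, clock=time.monotonic):
    """Ping `host` up to `retries` times and time every failed attempt.

    Hypothetical helper: `run` and `clock` are injectable so the logic can
    be exercised without touching the network.
    Returns (succeeded, list_of_failed_attempt_durations_in_seconds).
    """
    failed_durations = []
    for _ in range(retries):
        start = clock()
        # -w: overall deadline in seconds, -c 1: stop after one reply
        rc = run(["ping", "-w", str(deadline), "-c", "1", host])
        elapsed = clock() - start
        if rc == 0:
            return True, failed_durations
        failed_durations.append(elapsed)
    return False, failed_durations


def failed_fast(failed_durations, deadline=10):
    """True if every failed attempt returned well before the deadline.

    The deadline / 2 cut-off is an arbitrary assumption; a failure that
    returns this early suggests ping got a network error instead of
    waiting for replies.
    """
    return bool(failed_durations) and all(
        d < deadline / 2 for d in failed_durations
    )


if __name__ == "__main__":
    ok, durations = ping_with_retries("15.184.64.1")
    print("reachable:", ok)
    print("failed attempt durations:", durations)
    print("failed fast:", failed_fast(durations))
```

If all ten attempts come back in a fraction of a second, that would point at a local routing or interface problem on the node rather than packets being dropped somewhere upstream.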