As seen in http://logs.openstack.org/46/444746/4/check-tripleo/gate-tripleo-ci-centos-7-ovb-updates/186b755/console.html
Deployment started at approximately 16:55 and the entire job timed out at ~18:40, 105 minutes later by my math. The deploy process never timed out the way it should have, so we get no logs to debug the hang.
Here's a snippet of the log in case it expires:
2017-03-22 16:54:09.780248 | Started Mistral Workflow tripleo.deployment.v1.deploy_plan. Execution ID: f801016d-2d9e-4451-b409-1dc19f31ad7b
2017-03-22 16:54:49.497060 | 2017-03-22 16:54:37.662874 [overcloud]: CREATE_IN_PROGRESS Stack CREATE started
2017-03-22 16:54:49.497493 | 2017-03-22 16:54:37.753079 [overcloud.MysqlRootPassword]: CREATE_IN_PROGRESS state changed
...
2017-03-22 16:55:52.409300 | 2017-03-22 16:55:50.186774 [overcloud.ControllerServiceChain.ServiceChain.83]: CREATE_IN_PROGRESS state changed
2017-03-22 16:55:52.532207 | 2017-03-22 16:55:50.178450 [overcloud.ControllerServiceChain.ServiceChain.44]: CREATE_IN_PROGRESS Stack CREATE started
2017-03-22 16:55:52.536552 | 2017-03-22 16:55:50.258160 [overcloud.ControllerServiceChain.ServiceChain.3.CephBase]: CREATE_IN_PROGRESS Stack CREATE started
2017-03-22 18:39:59.020136 | /home/jenkins/workspace/gate-tripleo-ci-centos-7-ovb-updates/devstack-gate/functions.sh: line 1074: 15643 Killed timeout -s 9 ${REMAINING_TIME}m bash -c "source $WORKSPACE/devstack-gate/functions.sh && $cmd"
If heat-engine OOM'd, is it possible that the stack timeout would not take effect? I've heard reports that 8 GB underclouds are no longer sufficient in some developer environments, so it's possible we're hitting the same problem in CI now.
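For reference, the length of the hang can be checked from the console timestamps in the log snippet. This is a minimal Python sketch, assuming the leftmost timestamp on each line is the console's own clock. The helper and variable names are made up for illustration.

```python
from datetime import datetime

# Format of the leftmost (console) timestamp on each log line.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def minutes_between(start: str, end: str) -> float:
    """Return the elapsed time between two console timestamps, in minutes."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


# Timestamps copied from the log snippet in this report.
workflow_start = "2017-03-22 16:54:09.780248"   # Mistral workflow started
last_heat_event = "2017-03-22 16:55:52.536552"  # last stack event printed
job_killed = "2017-03-22 18:39:59.020136"       # devstack-gate timeout kill

hang_minutes = minutes_between(last_heat_event, job_killed)
total_minutes = minutes_between(workflow_start, job_killed)

print(f"no output for {hang_minutes:.1f} minutes before the kill")
print(f"{total_minutes:.1f} minutes from workflow start to the kill")
```

That puts the silent period at about 104 minutes, and just under 106 minutes from workflow start to the kill, which is consistent with the rough 105 minute figure.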