[System Tests] Provisioning failed on compute node. Timeout reached. Error: Heartbeat read failed from 'stomp://mcollective
Bug #1423487 reported by
Alexander Kurenyshev
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Invalid | Undecided | Unassigned |
Bug Description
Steps to reproduce:
1. Create cluster in HA mode with 1 controller
2. Add 1 node with controller role
3. Add 1 node with compute role
4. Add 1 node with cinder role
5. Deploy the cluster
Expected behaviour:
Deploy is successful
Actual behaviour:
Provisioning failed. Timeout is exceeded.
In the mcollective log on the compute node:
06:24:48.959181 #1226] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' Heartbeat read failed from 'stomp:
That on_hbread_fail error is not the problem; the mcollective server on node-1 successfully executed tasks after the heartbeat read error:
2015-02-18T06:25:56.365667+00:00 debug: 06:25:55.966214 #1226] DEBUG -- : rabbitmq.rb:66:in `on_hbfire' Publishing heartbeat to stomp://mcollective@10.109.26.2:61613: send_fire, curt1424240755.96599last_sleep30.4995858669281
2015-02-18T06:26:17.948403+00:00 debug: 06:26:17.467105 #1226] DEBUG -- : rabbitmq.rb:64:in `on_hbfire' Received heartbeat from stomp://mcollective@10.109.26.2:61613: receive_fire, curt1424240777.46687
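For anyone triaging a similar snapshot, here is a rough sketch of how the same check could be done mechanically: scan the mcollective log and see whether a "Received heartbeat" entry shows up after the last on_hbread_fail. The regular expression and the helper name heartbeat_recovered are assumptions based only on the excerpts quoted in this report, not on any official log format, and the sample messages are shortened.

```python
import re

# Matches the mcollective part of a log line, for example:
#   06:24:48.959181 #1226] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' ...
# The pattern is an assumption derived from the excerpts in this report.
ENTRY = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+) #(?P<pid>\d+)\] +"
    r"(?P<level>[A-Z]+) -- : \S+:in `(?P<callback>\w+)'"
)


def heartbeat_recovered(lines):
    """Return True if a heartbeat is received after the last read failure.

    Returns False if no read failure is logged at all, or if nothing is
    received after the most recent failure.
    """
    seen_failure = False
    recovered = False
    for line in lines:
        match = ENTRY.search(line)
        if match is None:
            continue  # not an mcollective entry in the assumed format
        callback = match.group("callback")
        if callback == "on_hbread_fail":
            # A new failure resets any earlier recovery.
            seen_failure = True
            recovered = False
        elif seen_failure and callback == "on_hbfire" and "Received heartbeat" in line:
            recovered = True
    return recovered


# Entries shortened from the ones quoted in this report.
sample = [
    "06:24:48.959181 #1226] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' Heartbeat read failed",
    "06:25:55.966214 #1226] DEBUG -- : rabbitmq.rb:66:in `on_hbfire' Publishing heartbeat",
    "06:26:17.467105 #1226] DEBUG -- : rabbitmq.rb:64:in `on_hbfire' Received heartbeat",
]

print(heartbeat_recovered(sample))
```

If this returns True, the connection to the broker was re-established after the read failure, so the failure itself is unlikely to be the cause of a provisioning timeout.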
I can see a normal shutdown process for node-1 in the bootstrap logs. So, according to the diagnostic snapshot, node-1 did not come back up from the reboot at the start of the provisioning stage for some reason (fuel-snapshot-2015-02-19_09-36-23/node-1.test.domain.local/commands/ has None in all commands as well).
So it looks like the problem was on the host system level. Marking this bug as invalid.