p_heat-engine cannot be started by pacemaker once heat-engine returned an error
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Invalid | High | Bogdan Dobrelya |
Bug Description
Scenario:
1. Create cluster
2. Add 1 controller node
3. Deploy the cluster
4. Add 2 controller nodes
5. Deploy changes
6. Check crm status
Expected result: all resources are functional.
Actual result: 'p_heat-engine' failed to start on the primary controller.
According to the logs, 'p_heat-engine' started successfully during steps 2 and 3.
But after step 5, the heat-engine service tried to start while 'rabbitmq' was not yet running:
================ (primary controller) node-2: /var/log/
2014-12-02 09:29:51.853 3187 ERROR oslo.messaging.
...
A few minutes later 'rabbitmq' was started, but pacemaker still could not start 'heat-engine':
=============== (primary controller) node-2 $ crm status
...
Failed actions:
p_heat-
, queued=60001ms, exec=0ms
): unknown error
...
================ (primary controller) node-2: /var/log/
<29>Dec 2 09:29:50 node-2 pengine[2657]: notice: LogActions: Start p_heat-engine:0 (node-2)
<28>Dec 2 09:30:49 node-2 lrmd[2655]: warning: child_timeout_
<28>Dec 2 09:30:49 node-2 lrmd[2655]: warning: operation_finished: p_heat-
<28>Dec 2 09:30:49 node-2 pengine[2657]: warning: unpack_rsc_op: Processing failed op start for p_heat-engine:0 on node-2: unknown error (1)
...
<28>Dec 2 09:33:20 node-2 pengine[2657]: warning: unpack_rsc_op: Processing failed op start for p_heat-engine:0 on node-2: unknown error (1)
<28>Dec 2 09:34:55 node-2 pengine[2657]: warning: unpack_rsc_op: Processing failed op start for p_heat-engine:0 on node-2: unknown error (1)
<28>Dec 2 09:34:55 node-2 pengine[2657]: warning: unpack_rsc_op: Processing failed op start for p_heat-engine:0 on node-2: unknown error (1)
<28>Dec 2 09:35:32 node-2 pengine[2657]: warning: unpack_rsc_op: Processing failed op start for p_heat-engine:0 on node-2: unknown error (1)
<28>Dec 2 09:37:34 node-2 pengine[2657]: warning: unpack_rsc_op: Processing failed op start for p_heat-engine:0 on node-2: unknown error (1)
<28>Dec 2 09:37:34 node-2 pengine[2657]: warning: unpack_rsc_op: Processing failed op start for p_heat-engine:0 on node-2: unknown error (1)
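To see how often pacemaker re-reports the same failed start, the pengine warnings can be tallied per resource and node. The following is a minimal sketch: the regular expression is modelled only on the `unpack_rsc_op` lines quoted above, and the names `FAILED_OP` and `count_failed_ops` are hypothetical helpers, not part of Pacemaker. Other Pacemaker versions may word the message differently.

```python
import re
from collections import Counter

# Pattern modelled on the pengine warnings quoted in this report.
# Assumption: the message wording is the same in the log being parsed.
FAILED_OP = re.compile(
    r"unpack_rsc_op: Processing failed op (?P<op>\w+) "
    r"for (?P<rsc>\S+) on (?P<node>\S+): (?P<reason>.+?) \((?P<rc>\d+)\)"
)


def count_failed_ops(lines):
    """Count failed-operation warnings per (resource, node, operation)."""
    counts = Counter()
    for line in lines:
        match = FAILED_OP.search(line)
        if match:
            key = (match.group("rsc"), match.group("node"), match.group("op"))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Two lines copied from the log excerpt above, plus one non-matching line.
    sample = [
        "<29>Dec 2 09:29:50 node-2 pengine[2657]: notice: LogActions: "
        "Start p_heat-engine:0 (node-2)",
        "<28>Dec 2 09:30:49 node-2 pengine[2657]: warning: unpack_rsc_op: "
        "Processing failed op start for p_heat-engine:0 on node-2: "
        "unknown error (1)",
        "<28>Dec 2 09:33:20 node-2 pengine[2657]: warning: unpack_rsc_op: "
        "Processing failed op start for p_heat-engine:0 on node-2: "
        "unknown error (1)",
    ]
    for (rsc, node, op), n in count_failed_ops(sample).items():
        print(f"{rsc} on {node}: {n} failed '{op}' warning(s)")
```

Note that each warning here is pengine re-processing the recorded failure, so a growing count does not by itself show that new start attempts were made.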
These errors continued until I ran:
crm_resource --resource p_heat-engine --cleanup
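If several resources end up in this state, the same cleanup can be scripted. The sketch below builds one `crm_resource --resource <name> --cleanup` invocation per failed resource, using only the flags shown in the command above. The function names and the `dry_run` switch are hypothetical, and stripping the `:0` clone-instance suffix is an assumption based on the resource ids seen in the logs.

```python
import subprocess


def cleanup_commands(resource_ids):
    """Build one cleanup command per distinct resource.

    Assumption: ids such as 'p_heat-engine:0' refer to clone instances,
    and cleanup is issued against the base name 'p_heat-engine'.
    """
    seen = []
    for rsc in resource_ids:
        base = rsc.split(":", 1)[0]
        if base not in seen:
            seen.append(base)
    return [["crm_resource", "--resource", name, "--cleanup"] for name in seen]


def run_cleanup(resource_ids, dry_run=True):
    """Print the commands and, unless dry_run is set, execute them."""
    commands = cleanup_commands(resource_ids)
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Raises CalledProcessError if crm_resource exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default, so nothing is executed on the cluster.
    run_cleanup(["p_heat-engine:0", "p_heat-engine:0"])
```

Running it with `dry_run=True` first makes it easy to review what would be cleaned up before touching the cluster.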
Changed in fuel:
importance: Undecided → High
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Bogdan Dobrelya (bogdando)
According to the logs snapshot, the final state of heat-engine and rabbitmq was OK: p_rabbitmq-server [p_rabbitmq-server]
Last updated: Tue Dec 2 10:00:05 2014
Last change: Tue Dec 2 09:59:37 2014 via crm_attribute on node-2
...
Clone Set: clone_p_heat-engine [p_heat-engine]
Started: [ node-2 node-4 node-5 ]
Master/Slave Set: master_
Masters: [ node-2 ]
Slaves: [ node-4 node-5 ]
Are you sure this bug is valid?