[8.0] Pacemaker may be destroyed on one of the controllers after corosync restart
Bug #1576749 reported by Vadim Rovachev
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Invalid | Medium | Oleksiy Molchanov |
8.0.x | Won't Fix | Medium | Fuel Sustaining |
Mitaka | Won't Fix | Medium | Fuel Sustaining |
Bug Description
Detailed bug description:
On swarm test:
https:/
we kill and restart corosync on the controllers 500 times in a row, but sometimes this test fails.
Failure moment in the test logs:
http://
Failure moment on node-1 (the failing node):
http://
Failed job:
https:/
All snapshots are attached.
Reproducibility:
Sometimes (intermittent)
https:/
Changed in fuel: | |
assignee: | nobody → Fuel Sustaining (fuel-sustaining-team) |
tags: | added: area-library |
tags: | added: ha |
The logs show that node-1 actually did return to the cluster; this is also visible on node-2 (the node from which the tests run 'pcs status nodes'). The pcs_status command logs in the diagnostic snapshot show that node-1 was in the cluster. However, the test logs indicate that node-1 was offline for 20 seconds, so the test failed.
I tried to reproduce it but did not manage to, so I am marking this as Incomplete until we have an environment to revert.
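For reference, the kind of check discussed above (poll 'pcs status nodes' until a node is reported online, giving up after 20 seconds) could look roughly like the sketch below. This is not the actual test code from the swarm suite. The function names, the `fetch_status` callable and the sample output layout (a "Pacemaker Nodes:" header followed by indented "Online:" / "Offline:" lines) are assumptions made for illustration, and the real output format may differ between pcs versions.

```python
import time


def parse_online_nodes(pcs_output, section="Pacemaker Nodes:"):
    """Return the set of node names listed as Online in the given section.

    Assumes unindented section headers and indented "Online: a b c" lines.
    """
    online = set()
    in_section = False
    for line in pcs_output.splitlines():
        stripped = line.strip()
        if not line.startswith((" ", "\t")) and stripped.endswith(":"):
            # Unindented header, e.g. "Corosync Nodes:" or "Pacemaker Nodes:".
            in_section = stripped == section
            continue
        if in_section and stripped.startswith("Online:"):
            online.update(stripped[len("Online:"):].split())
    return online


def wait_node_online(node, fetch_status, timeout=20.0, interval=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll fetch_status() until `node` is online or `timeout` seconds pass.

    fetch_status is a hypothetical callable returning the text printed by
    'pcs status nodes'. Returns True if the node came online in time.
    """
    deadline = clock() + timeout
    while True:
        if node in parse_online_nodes(fetch_status()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In a real environment `fetch_status` might wrap something like `subprocess.check_output(["pcs", "status", "nodes"], text=True)` executed on a controller. Keeping it injectable makes it possible to replay captured output from a diagnostic snapshot and compare what the test saw with what the cluster reported.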