Instance HA needs a few tweaks for BM computes that take a long time to reboot

Bug #1797041 reported by Michele Baldessari
Affects   Status         Importance   Assigned to          Milestone
tripleo   Fix Released   High         Michele Baldessari

Bug Description

In an Instance HA (IHA) environment, a compute node might not recover automatically because its nova-compute service stays stuck in a force-down state:
+ echo 'Running command: '\''/var/lib/nova/instanceha/check-run-nova-compute '\'''
+ exec /var/lib/nova/instanceha/check-run-nova-compute
Running command: '/var/lib/nova/instanceha/check-run-nova-compute '
Waiting for fence-down flag to be cleared
Waiting for fence-down flag to be cleared
Waiting for fence-down flag to be cleared

This can happen when a compute node takes too long to reboot after a crash: pacemaker gives up on it and never clears the force-down flag.
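
The waiting behaviour seen in the log can be pictured as a polling loop. The following is only a minimal sketch under stated assumptions, not the actual check-run-nova-compute script and not the fix proposed for this bug: `wait_for_flag_clear`, `is_forced_down`, `max_attempts` and `interval` are hypothetical names introduced for illustration, and the flag state is simulated rather than read from Nova.

```python
import time
from typing import Callable


def wait_for_flag_clear(
    is_forced_down: Callable[[], bool],  # hypothetical probe; True while the flag is set
    max_attempts: int,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the flag clears; return False after max_attempts polls."""
    for _ in range(max_attempts):
        if not is_forced_down():
            return True
        print("Waiting for fence-down flag to be cleared")
        sleep(interval)
    return False


# Simulated node (assumption): the flag is cleared on the third poll.
states = iter([True, True, False])
cleared = wait_for_flag_clear(
    lambda: next(states), max_attempts=5, interval=0, sleep=lambda _: None
)
```

If nothing ever clears the flag, as described in this report, an unbounded version of such a loop would print the waiting message forever. The bounded version above returns False instead, so a caller could at least report the failure.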

Michele Baldessari (michele) wrote :

puppet-pacemaker part of the fix: https://review.openstack.org/609351

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/609367

Changed in tripleo:
milestone: stein-1 → stein-2
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by Michele Baldessari on branch: master
Review: https://review.openstack.org/609367

Changed in tripleo:
milestone: stein-2 → stein-3
Changed in tripleo:
milestone: stein-3 → stein-rc1
Changed in tripleo:
milestone: stein-rc1 → train-1
Changed in tripleo:
milestone: train-1 → train-2
Changed in tripleo:
milestone: train-2 → train-3
Changed in tripleo:
milestone: train-3 → ussuri-1
Changed in tripleo:
milestone: ussuri-1 → ussuri-2
Changed in tripleo:
status: Triaged → Fix Released