live-migration fails with two different CPU families in same availability-zone

Bug #1673547 reported by Drew Freiberger
This bug affects 7 people
Affects                        Status   Importance  Assigned to  Milestone
OpenStack Nova Compute Charm   Triaged  Medium      Unassigned

Bug Description

Using xenial rev 135 of the nova-compute charm (nova v13.1.2), live migration from HP gen9 servers to pre-existing HP gen8 servers fails due to incompatible CPU flags.

nova-compute charm options were enable-live-migration: True, and cpu-mode and cpu-model were at charm defaults.

Worked around by setting cpu-mode: custom and cpu-model: <output of virsh cpu-baseline all_node_<cpu>_features.xml>.

Recommendation: when enable-live-migration is set to True in the charm, the peer relation between nova-compute units on different CPU generations should auto-configure cpu-model to the lowest common denominator CPU, based on the CPU features of all related nova-compute units. The operator should not have to compute the cpu-model from cpu-baseline and set it in the charm by hand. Perhaps this could be a new cpu-mode of "auto-custom" or the like.
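
The "lowest common denominator" idea can be sketched as a set intersection over each unit's CPU flags. This is only an illustration, not how the charm behaves: parse_flags and common_flags are hypothetical helper names, and the input is assumed to be /proc/cpuinfo text gathered from each unit. A flag intersection on its own does not pick a named libvirt CPU model, which is the part virsh cpu-baseline handles.

```python
def parse_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def common_flags(flag_sets):
    """Return the flags present on every unit (empty set if no units are given)."""
    flag_sets = list(flag_sets)
    if not flag_sets:
        return set()
    return set.intersection(*flag_sets)


# Hypothetical, shortened cpuinfo excerpts for a gen8 and a gen9 unit.
gen8 = parse_flags("flags\t\t: fpu vme sse2 avx")
gen9 = parse_flags("flags\t\t: fpu vme sse2 avx adx rdseed smap")
shared = common_flags([gen8, gen9])
print(sorted(shared))
```

In this sketch the gen9-only flags (adx, rdseed, smap) drop out of the result, which is the property a guest CPU definition needs in order to run on both generations.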

The following flags exist on the newly added nova-compute units but not on the previously existing nova-compute units:
> 3dnowprefetch
> adx
> cqm_mbm_local
> cqm_mbm_total
> hle
> intel_pt
> rdseed
> rtm
> smap
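
A list like the one above can be reproduced by comparing the flag sets of an old and a new unit. A minimal sketch, assuming the flags have already been collected as whitespace-separated strings (for example from /proc/cpuinfo). The name flags_only_on_new is hypothetical, and the sample strings are abbreviated stand-ins rather than the real flag lists from these servers.

```python
def flags_only_on_new(old_flags, new_flags):
    """Return, sorted, the flags in new_flags that are absent from old_flags."""
    return sorted(set(new_flags.split()) - set(old_flags.split()))


# Abbreviated, illustrative flag strings (assumed, not the real host output).
old_unit = "fpu sse2 avx"
new_unit = "fpu sse2 avx adx hle rdseed rtm smap"

for flag in flags_only_on_new(old_unit, new_unit):
    print(">", flag)
```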

tags: added: canonical-bootstack
James Page (james-page) wrote :

It would be nice if OpenStack could figure this out across a deployment, and do something sane.

Changed in charm-nova-compute:
status: New → Triaged
importance: Undecided → Medium
milestone: none → 18.02
Ryan Beisner (1chb1n)
Changed in charm-nova-compute:
milestone: 18.02 → 18.05
do3meli (d-info-e) wrote :

I even see this behavior when live migrating instances from one HPE G9 to another HPE G9 with the same CPU model but different CPU flags. After applying HPE SPP 2018.3, the upgraded G9 appears to have one additional CPU flag: spec_ctrl.

I like the idea of having an "auto_custom" cpu-mode, but wonder if this bug tracker is the right place. Shouldn't this be more of an upstream bug?

David Ames (thedac)
Changed in charm-nova-compute:
milestone: 18.05 → 18.08
James Page (james-page)
Changed in charm-nova-compute:
milestone: 18.08 → 18.11
David Ames (thedac)
Changed in charm-nova-compute:
milestone: 18.11 → 19.04
David Ames (thedac)
Changed in charm-nova-compute:
milestone: 19.04 → 19.07
David Ames (thedac)
Changed in charm-nova-compute:
milestone: 19.07 → 19.10
David Ames (thedac)
Changed in charm-nova-compute:
milestone: 19.10 → 20.01
James Page (james-page)
Changed in charm-nova-compute:
milestone: 20.01 → 20.05
David Ames (thedac)
Changed in charm-nova-compute:
milestone: 20.05 → 20.08
James Page (james-page)
Changed in charm-nova-compute:
milestone: 20.08 → none