AIO-SX: overrides for vcpu_pin_set not updated after Platform cpu assignment update and unlock

Bug #1825056 reported by Wendy Mitchell
This bug affects 1 person
Affects    Status     Importance  Assigned to  Milestone
StarlingX  Won't Fix  Low         Jim Gauld

Bug Description

Brief Description
-----------------
overrides for vcpu_pin_set not updated after Platform cpu assignment update and unlock

Severity
--------
standard

Steps to Reproduce
------------------

1. Launch an instance (single-host system) with a flavor that has 4 vCPUs and a dedicated CPU policy.
2. Lock the host.
3. Update the CPU allocation, increasing the number of platform CPUs.
4. Unlock the host and confirm that the instance does not overlap the platform CPU allocation.

Expected Behavior
------------------
Nova should be aware of the change to the CPU assignment and follow the new application CPU allocation (that is, the vcpu_pin_set override should be updated).

Actual Behavior
----------------
The host unlocked successfully.

2019-04-16 20:02:50.381 922 INFO sysinv.api.controllers.v1.host [-] controller-0 _handle_unlock_action
...
2019-04-16 20:19:02.186 908 INFO sysinv.api.controllers.v1.host [-] controller-0 apply ihost_val {'vim_progress_status': 'services-enabled'}

System inventory reports the platform CPU assignment updated as expected (see the listing after unlock below).
However, nova ignores the change: vcpu_pin_set is not updated, and the instance starts up apparently ignoring the new allocation.

/opt/platform/helm/19.01/openstack-nova.yaml
overrides:
        nova_compute:
          hosts:
          - conf:
              nova:
                DEFAULT:
                  my_ip: 192.168.206.3
                  shared_pcpu_map: '""'
                  vcpu_pin_set: '"4-7,12-15"'

[wrsroot@controller-0 ~(keystone_admin)]$ kubectl exec -it -n openstack nova-compute-controller-0-a762cb46-2rj5w -c nova-compute cat /etc/nova/nova.conf | grep vcpu_
vcpu_pin_set = "4-7,12-15"

System inventory reports the following after unlock:
Platform Processor 0 : 0-4,8-12
vSwitch Processor 0 : 5-6,13-14
Applications Processor 0 : 7,15
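
The mismatch between the nova override and the inventory can be checked mechanically. The following is a minimal sketch, not part of StarlingX or nova: `parse_cpu_list` is a hypothetical helper that handles only plain ranges and single values, and the input strings are copied from the output above.

```python
def parse_cpu_list(spec):
    """Expand a CPU list string such as '4-7,12-15' into a set of ints.

    Hypothetical helper: handles only comma-separated values and ranges,
    and tolerates the extra double quotes seen in the helm override.
    """
    cpus = set()
    for part in spec.strip().strip('"').split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


# Values copied from this report.
configured = parse_cpu_list('"4-7,12-15"')   # vcpu_pin_set in nova.conf
platform = parse_cpu_list("0-4,8-12")        # Platform, Processor 0
vswitch = parse_cpu_list("5-6,13-14")        # vSwitch, Processor 0
applications = parse_cpu_list("7,15")        # Applications, Processor 0

# CPUs nova still believes it may pin to, but which inventory no longer
# assigns to applications.
stale = configured - applications

print("stale CPUs in vcpu_pin_set:", sorted(stale))
print("overlap with platform:", sorted(configured & platform))
print("overlap with vSwitch:", sorted(configured & vswitch))
```

For the values in this report, the stale set is CPUs 4-6 and 12-14: everything in vcpu_pin_set except the two CPUs (7 and 15) that inventory now assigns to applications.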

$ sudo virsh dumpxml instance-00000014
<cputune>
  <shares>4096</shares>
  <vcpupin vcpu='0' cpuset='14'/>
  <vcpupin vcpu='1' cpuset='6'/>
  <vcpupin vcpu='2' cpuset='5'/>
  <vcpupin vcpu='3' cpuset='13'/>
  <emulatorpin cpuset='5-6,13-14'/>
</cputune>
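
The pinning shown above can be compared against the reserved CPUs with a short script. This is a sketch under stated assumptions: the XML is the `<cputune>` excerpt from this report rather than a live `virsh dumpxml` call, and `parse_cpu_list` and `pinned_cpus` are hypothetical helper names, not nova or libvirt APIs.

```python
import xml.etree.ElementTree as ET

# <cputune> excerpt copied from the virsh dumpxml output in this report.
CPUTUNE_XML = """
<cputune>
  <shares>4096</shares>
  <vcpupin vcpu='0' cpuset='14'/>
  <vcpupin vcpu='1' cpuset='6'/>
  <vcpupin vcpu='2' cpuset='5'/>
  <vcpupin vcpu='3' cpuset='13'/>
  <emulatorpin cpuset='5-6,13-14'/>
</cputune>
"""


def parse_cpu_list(spec):
    """Expand '5-6,13-14' into {5, 6, 13, 14} (plain ranges only)."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pinned_cpus(xml_text):
    """Map each guest vCPU index to the set of host CPUs it is pinned to."""
    root = ET.fromstring(xml_text)
    return {
        int(pin.get("vcpu")): parse_cpu_list(pin.get("cpuset"))
        for pin in root.iter("vcpupin")
    }


# Post-unlock assignment reported by system inventory.
platform = parse_cpu_list("0-4,8-12")
vswitch = parse_cpu_list("5-6,13-14")
applications = parse_cpu_list("7,15")

pins = pinned_cpus(CPUTUNE_XML)
used = set().union(*pins.values())

# Host CPUs the instance occupies that are reserved for other functions.
conflicts = used & (platform | vswitch)

print("instance pinned to:", sorted(used))
print("conflicts with platform/vSwitch:", sorted(conflicts))
print("outside application CPUs:", sorted(used - applications))
```

With the data in this report, all four vCPUs land on CPUs 5, 6, 13 and 14, which inventory assigns to vSwitch after the unlock, and none of them fall on the application CPUs 7 and 15.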

Reproducibility
---------------
yes

System Configuration
--------------------
simplex
(Lab: sm-2)

Branch/Pull Time/Commit
--------------------
BUILD_ID="20190415T233001Z"
Job: STX_build_master_master

description: updated
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → High
status: New → Triaged
assignee: nobody → Jim Gauld (jgauld)
tags: added: stx.2.0 stx.containers stx.retestneeded
Ghada Khalil (gkhalil) wrote :

Marking as release gating; high priority - related to new container capability to pin pods to cpus

Jim Gauld (jgauld) wrote :

This is an AIO Simplex day-one issue. If we reconfigure the platform cores while an instance is running, there is nowhere to evacuate it to. No scheduling operation takes place (e.g., spawn, resize, migrate, evacuate), so the instance pinning becomes invalid.

We will need SME consensus on how to address this. For example, should we prevent CPU configuration on AIO Simplex when there are instances?

Ghada Khalil (gkhalil) wrote :

Lowering priority to medium given that the issue is specific to AIO-SX

summary: - overrides for vcpu_pin_set not updated after Platform cpu assignment
- update and unlock
+ AIO-SX: overrides for vcpu_pin_set not updated after Platform cpu
+ assignment update and unlock
Changed in starlingx:
importance: High → Medium
Frank Miller (sensfan22) wrote :

After reviewing with the containers TL, removed the stx.2.0 gate and replaced it with stx.3.0 for the following reasons:
1) This issue existed in previous releases.
2) This is an AIO-SX issue only.
3) The workaround is to not have any VMs running on the AIO when the user changes the number of platform cores.

tags: added: stx.3.0
removed: stx.2.0
Frank Miller (sensfan22) wrote :

This is not a high-priority issue and, for the reasons given in the comment above, can be moved to stx.4.0.

tags: added: stx.4.0
removed: stx.3.0
Frank Miller (sensfan22) wrote :

Changed the priority to low and removed the stx.4.0 tag as this issue has a manual workaround and is a low runner scenario.

tags: removed: stx.4.0
Changed in starlingx:
importance: Medium → Low
Ramaswamy Subramanian (rsubrama) wrote :

No progress on this bug for more than 2 years. Candidate for closure.

If there is no update, this issue is targeted to be closed as 'Won't Fix' by March 8th.

Ramaswamy Subramanian (rsubrama) wrote :

Changing the status to 'Won't Fix' as there is no activity.

Changed in starlingx:
status: Triaged → Won't Fix