First unlock after changing cpu assignment does not update the affinity of platform cores

Bug #1823213 reported by Wendy Mitchell
Affects    Status        Importance  Assigned to  Milestone
StarlingX  Fix Released  Medium      Jim Gauld

Bug Description

Brief Description
-----------------
After changing the cpu assignment and unlocking the host for the first time, the affinity of the platform cores is not as expected.

Severity
--------
standard

Steps to Reproduce
------------------
1. Lock the host (e.g. controller-1)
2. Edit the cpu assignment for the platform so that processor 1 also has platform cores assigned:

Function      Processor    Logical Cores
Platform      Processor 0  0,2
              Processor 1  1,3
vSwitch       Processor 0  4,6
Shared
Applications  Processor 0  8,10,12,14,16,18,20,22,24,26
              Processor 1  5,7,9,11,13,15,17,19,21,23,25,27

3. Unlock the host
4. Confirm the new assignment

[wrsroot@controller-0 log(keystone_admin)]$ system host-cpu-list controller-1
+--------------------------------------+-------+-----------+-------+--------+-------------------------------------------+-------------------+
| uuid | log_c | processor | phy_c | thread | processor_model | assigned_function |
| | ore | | ore | | | |
+--------------------------------------+-------+-----------+-------+--------+-------------------------------------------+-------------------+
| 16446cb8-a8a6-43cd-9da6-24c8c92cdce7 | 0 | 0 | 0 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | Platform |
| cf0ccd99-b5d5-42a5-b5da-fddff37ac63f | 1 | 1 | 0 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | Platform |
| 5699a8bd-ecef-4ba0-9523-db3cfd005b7f | 2 | 0 | 1 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | Platform |
| cacef69f-9e9f-46df-8703-1e7dc60e7e18 | 3 | 1 | 1 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | Platform |
| 79ce6b16-592e-4f8a-af44-0fc138d3d80c | 4 | 0 | 2 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | vSwitch |
| 0ad7b84e-d280-4af3-8ed8-f153048bca94 | 5 | 1 | 2 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | Applications |
| f246d8cc-e5e0-4e25-934f-d44940088d7e | 6 | 0 | 3 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | vSwitch |
| 24e1e037-ad34-4f39-a6f1-43fda2ffc532 | 7 | 1 | 3 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | Applications |
| a77f76d3-3f9c-4a2b-9e46-10d8ba616466 | 8 | 0 | 4 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | Applications |
| 6dfd5e29-09a5-414a-9a6a-c2dd3a4cdfa4 | 9 | 1 | 4 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | Applications |
| 784cd99b-a073-4c37-8faf-d3087b453530 | 10 | 0 | 5 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | Applications |
| 779f4520-f9bf-4a86-8a3f-d6dbde77f893 | 11 | 1 | 5 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | Applications |
| eb8e7f4f-2bb1-4a6d-91ca-ec6b22b6e7b0 | 12 | 0 | 6 | 0 | Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz | Applications |
.

5. Confirm the affinity of the platform cores

See the log and output in the attached affinity.txt file.
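
Steps 4 and 5 can be cross-checked with a short script. The following is a minimal Python sketch, not part of any StarlingX tool: it assumes the `system host-cpu-list` output has been saved as text with the column order shown above, and the helper names `cores_by_function` and `affinity_mismatch` are hypothetical. The live check relies on `os.sched_getaffinity`, which is only available on Linux.

```python
import os
from collections import defaultdict


def cores_by_function(table_text):
    """Group logical core numbers by assigned_function.

    Assumes the column order shown in the report: uuid, log_core,
    processor, phy_core, thread, processor_model, assigned_function.
    """
    cores = defaultdict(list)
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Data rows have 7 cells and a numeric logical core in column 2;
        # borders and the wrapped header rows are skipped.
        if len(cells) != 7 or not cells[1].isdigit():
            continue
        cores[cells[6]].append(int(cells[1]))
    return {func: sorted(ids) for func, ids in cores.items()}


def affinity_mismatch(platform_cores, pid=0, allowed=None):
    """Compare a task's allowed CPUs with the expected platform cores.

    Returns (missing, extra): platform cores the task may not run on,
    and non-platform cores it is still allowed to run on.
    If `allowed` is not given, it is read for `pid` (0 = this process).
    """
    if allowed is None:
        allowed = os.sched_getaffinity(pid)  # Linux only
    expected = set(platform_cores)
    allowed = set(allowed)
    return sorted(expected - allowed), sorted(allowed - expected)
```

For the rows listed in step 4, `cores_by_function(...)["Platform"]` gives `[0, 1, 2, 3]`. A task still restricted to cores 0 and 2 would then show `[1, 3]` as missing.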

Expected Behavior
------------------
Expected the affinity to include 1,3 (as well as 0,2) for the platform, but it only appears to be 0,2:
"Reaffining tasks to platform cores (0,2)..."

Actual Behavior
----------------

platform.log (controller-1)
2019-04-04T17:34:47.000 controller-1 affine_tasks: info Starting affine_tasks. Reaffining tasks to platform cores...
2019-04-04T17:34:48.000 controller-1 affine_tasks: info Tasks re-affining done.

daemon.log (controller-1)

2019-04-04T17:34:47.795 controller-1 systemd[1]: info Starting StarlingX Affine Tasks...

2019-04-04T17:34:47.860 controller-1 affine-tasks.sh[16930]: info affine_tasks: Starting affine_tasks. Reaffining tasks to platform cores...

2019-04-04T17:34:47.908 controller-1 affine-tasks.sh[16930]: info /etc/init.d/affine-tasks.sh[1]: TASKAFFINITY: Reaffining tasks to platform cores (0,2)...

2019-04-04T17:34:48.034 controller-1 affine-tasks.sh[16930]: info taskset: failed to get pid 16791's affinity: No such process

2019-04-04T17:34:48.037 controller-1 affine-tasks.sh[16930]: info taskset: failed to get pid 16815's affinity: No such process

2019-04-04T17:34:48.045 controller-1 affine-tasks.sh[16930]: info taskset: failed to get pid 17144's affinity: No such process

2019-04-04T17:34:48.047 controller-1 affine-tasks.sh[16930]: info taskset: failed to get pid 17145's affinity: No such process

2019-04-04T17:34:48.049 controller-1 affine-tasks.sh[16930]: info taskset: failed to get pid 17146's affinity: No such process

2019-04-04T17:34:48.051 controller-1 affine-tasks.sh[16930]: info /etc/init.d/affine-tasks.sh[1]: TASKAFFINITY: 46 tasks were reaffined to platform cores.
2019-04-04T17:34:48.052 controller-1 affine-tasks.sh[16930]: info affine_tasks: Tasks re-affining done.

2019-04-04T17:34:48.058 controller-1 systemd[1]: info Started StarlingX Affine Tasks.
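
The mismatch can also be read directly from the TASKAFFINITY line in daemon.log. This is a rough Python sketch under stated assumptions: the message format is taken from the excerpt above, the function names are hypothetical, and handling of range notation such as `0-3` is a guess at what the script might print on other hardware.

```python
import re

# Matches the core list in a line such as:
# "TASKAFFINITY: Reaffining tasks to platform cores (0,2)..."
_CORES_RE = re.compile(r"Reaffining tasks to platform cores \(([\d,\-]+)\)")


def reaffined_cores(log_text):
    """Return the cores named in the first matching log line, or None."""
    match = _CORES_RE.search(log_text)
    if match is None:
        return None
    cores = set()
    for part in match.group(1).split(","):
        if "-" in part:
            # Assumed range notation, e.g. "0-3"
            lo, hi = part.split("-")
            cores.update(range(int(lo), int(hi) + 1))
        else:
            cores.add(int(part))
    return cores


def missing_platform_cores(log_text, expected):
    """Cores expected in the platform set that the log does not mention."""
    actual = reaffined_cores(log_text) or set()
    return sorted(set(expected) - actual)
```

Given the log line at 17:34:47.908 and the expected platform cores 0 to 3, `missing_platform_cores` returns `[1, 3]`, which matches the reported problem.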

Reproducibility
---------------
Seen on the first unlock of the controller after changing the platform cpu assignment

System Configuration
--------------------
e.g. duplex

Branch/Pull Time/Commit
--------------------
BUILD_ID="20190401T233000Z"
JOB="STX_build_master_master"

Last Pass
---------

Timestamp/Logs
--------------
see ~17:34:47

Test Activity
-------------
[Platform pinning Feature Testing]

Wendy Mitchell (wmitchellwr) wrote :
Ghada Khalil (gkhalil)
summary: - First unlock after changing cpu assignment, unexpectedly does not update
- the affinity of platform cores
+ First unlock after changing cpu assignment does not update the affinity
+ of platform cores
Ghada Khalil (gkhalil) wrote :

Marking as release gating; issue with cpu affinity

Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
assignee: nobody → Jim Gauld (jgauld)
tags: added: stx.2.0 stx.config
Jim Gauld (jgauld) wrote :

I believe this issue will be solved when the following is merged: https://review.openstack.org/#/c/648511/

This feature also solves the following (which is the reported issue):
- sysinv puppet generation of platform_cpu_list did not have all threads
- worker_reserved.conf was not generated on unlock, so that file, which gets sourced by various things, was not current during early init
- sysinv puppet cpulist-to-ranges inline code was incorrect; so for certain hardware (with funky logical cpu enumeration patterns), the platform_cpu_list is wrong, with commas stripped out
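
To illustrate what a cpulist-to-ranges conversion is expected to produce, here is a minimal Python sketch. It is not the sysinv code and not the fix in the review above; the function name `cpulist_to_ranges` is hypothetical. The point is that non-contiguous cpus must stay separated by commas, and only contiguous runs collapse to a range.

```python
def cpulist_to_ranges(cpus):
    """Format logical cpu ids as a compact range string, e.g. '0-3,8'."""
    ordered = sorted(set(cpus))
    if not ordered:
        return ""
    ranges = []
    start = prev = ordered[0]
    for cpu in ordered[1:]:
        if cpu == prev + 1:
            # Still inside a contiguous run; extend it.
            prev = cpu
            continue
        # Gap found: close the current run and start a new one.
        ranges.append((start, prev))
        start = prev = cpu
    ranges.append((start, prev))
    return ",".join(
        str(lo) if lo == hi else f"{lo}-{hi}" for lo, hi in ranges
    )
```

With the assignment from this report, `cpulist_to_ranges([0, 2, 1, 3])` gives `"0-3"`, while the stale set `[0, 2]` gives `"0,2"` with the comma preserved.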

Ghada Khalil (gkhalil)
tags: added: stx.retestneeded
Jim Gauld (jgauld)
Changed in starlingx:
status: Triaged → Fix Released
Frank Miller (sensfan22)
Changed in starlingx:
assignee: Jim Gauld (jgauld) → Wendy Mitchell (wmitchellwr)
Wendy Mitchell (wmitchellwr) wrote :

verified

### StarlingX
### Built from master
###

OS="centos"
SW_VERSION="19.01"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="20190501T013000Z"

JOB="STX_build_master_master"
<email address hidden>"
BUILD_NUMBER="86"
BUILD_HOST="starlingx_mirror"
BUILD_DATE="2019-05-01 01:30:00 +0000"

tags: removed: stx.retestneeded
Ghada Khalil (gkhalil) wrote :

Re-assigning to Jim Gauld; bugs are left assigned to the developer who addressed them

Changed in starlingx:
assignee: Wendy Mitchell (wmitchellwr) → Jim Gauld (jgauld)
Wendy Mitchell (wmitchellwr) wrote :

This was verified fixed per comment on May 1st.
