hugepages not allocated after unlock

Bug #1830549 reported by Brent Rowsell
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Tao Liu

Bug Description

Brief Description
-----------------
This was on AIO-SX. With the controller locked, I configured 10 1G hugepages per NUMA node.
After the unlock was complete, the pages were still showing as pending.
I then locked the server and, unexpectedly, the pages showed as allocated.
After the second unlock the pages were still allocated.

Severity
--------
Major

Steps to Reproduce
------------------
See above

Expected Behavior
------------------
Pages are allocated

Actual Behavior
----------------
See above

Reproducibility
---------------
Not sure

System Configuration
--------------------
One node (AIO-SX), but likely applicable to other configs.

Branch/Pull Time/Commit
-----------------------
BUILD_DATE="2019-05-24 17:42:34 -0400"

Last Pass
---------
Don't know

Timestamp/Logs
--------------
Logs attached

Test Activity
-------------
Other

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Marking as release gating; robustness issues with huge page allocation

summary: - AIO-SX: hugepages not allocated after unlock
+ hugepages not allocated after unlock
tags: added: stx.2.0 stx.config
Changed in starlingx:
importance: Undecided → High
status: New → Triaged
description: updated
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Tao Liu (tliu88)
Revision history for this message
Brent Rowsell (brent-rowsell) wrote :

logs

Tao Liu (tliu88)
Changed in starlingx:
status: Triaged → In Progress
Revision history for this message
Tao Liu (tliu88) wrote :

The issue here is just that the clearing of the pending fields was delayed.

The huge pages are allocated after the worker manifest is applied, and the pending fields are cleared after the conductor receives the first memory update from the agent after the unlock. In this case, the first memory update was delayed for the following reasons.

The sysinv agent normally starts 5 or 6 minutes after the host is unlocked/rebooted and sends an inventory update at startup. If the worker manifest apply has not completed by that time (determined by checking the .worker_config_complete flag file), the memory update is skipped, because the huge pages are allocated via the puppet manifest.
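
A rough sketch of the startup check described above (illustrative only; the flag path, argument, and function names here are assumptions, not the actual sysinv code):

    import os

    # Assumed path of the flag dropped once the worker manifest has applied.
    WORKER_CONFIG_COMPLETE_FLAG = '/var/run/.worker_config_complete'

    def initial_inventory_update(send_memory_report):
        # Huge pages are allocated by the puppet worker manifest, so a memory
        # report taken before the manifest completes would be incomplete.
        if not os.path.exists(WORKER_CONFIG_COMPLETE_FLAG):
            return False  # skip the memory update for now
        send_memory_report()
        return True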

After that, the memory update is triggered by the periodic audit. The audit runs every minute, but the memory & LLDP reports are only sent every 5 audit intervals (agent throttling implemented in a previous release for big-lab performance issues). The audit throttling results in a 5-minute delay after the first inventory report is sent, which adds up to around 11 minutes to clear the pending fields after reboot.
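
A minimal sketch of the audit throttling described above, with hypothetical names (the real sysinv agent constants and structure differ):

    AUDIT_INTERVAL_SECONDS = 60       # "the audit runs every minute"
    MEMORY_REPORT_AUDIT_COUNT = 5     # memory & LLDP reports every 5 audits

    class AgentAudit(object):
        def __init__(self, send_memory_report):
            self._send_memory_report = send_memory_report
            self._audit_count = 0

        def run_audit(self):
            # Called once per audit interval; the memory report only goes out
            # every MEMORY_REPORT_AUDIT_COUNT passes, which is the source of
            # the extra ~5 minute delay before the pending fields clear.
            self._audit_count += 1
            if self._audit_count % MEMORY_REPORT_AUDIT_COUNT == 0:
                self._send_memory_report()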

In this scenario, the second lock was less than 10 minutes after the first unlock. Once the host was locked, the sysinv-conductor would not clear the pending fields.

2019-05-26T17:37:15.000 controller-0 -sh: info HISTORY: PID=617529 UID=1875 system host-lock controller-0
2019-05-26T17:37:19.000 controller-0 -sh: info HISTORY: PID=617529 UID=1875 system host-memory-modify controller-0 0 -1G 10 -f application
2019-05-26T17:37:28.000 controller-0 -sh: info HISTORY: PID=617529 UID=1875 system host-memory-modify controller-0 1 -1G 10 -f application
2019-05-26T17:37:37.000 controller-0 -sh: info HISTORY: PID=617529 UID=1875 system host-unlock controller-0
2019-05-26T17:47:01.000 controller-0 -sh: info HISTORY: PID=126741 UID=1875 system host-lock controller-0

Revision history for this message
Ghada Khalil (gkhalil) wrote :

This seems to be specific to All-in-one systems.

Revision history for this message
Tao Liu (tliu88) wrote :

The pending fields take longer to clear on AIO because the worker manifest is applied after the controller manifest has been applied. When sysinv-agent starts, it sends the host inventory update; if the worker manifest apply has not completed by then, the memory report is skipped. After the initial report, the memory update is triggered by the periodic audit. The audit runs every minute, but the memory report is only sent every 5 audit intervals. As a result, the huge page settings could still show as pending in the CLI/GUI after around 10 minutes (although the huge pages have already been allocated in Linux once the manifest is applied).

After talking to John, we decided to send the memory report more frequently, i.e. at the normal audit interval (every other minute). This change will cause the pending fields to be cleared about one minute after the worker manifest is applied.
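
Roughly, the change amounts to taking the memory report out of the throttled group; a before/after sketch under assumed names (not the actual sysinv constants):

    THROTTLED_REPORT_AUDIT_COUNT = 5   # old: memory report every 5 audits

    def should_send_memory_report(audit_count, throttled=False):
        # Old behaviour: the memory report piggybacked on the 5-audit
        # throttle added for large-lab performance.
        if throttled:
            return audit_count % THROTTLED_REPORT_AUDIT_COUNT == 0
        # New behaviour: send the memory report at the normal audit cadence,
        # so the pending fields clear shortly after the worker manifest is
        # applied.
        return True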

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/671354

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/671354
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=1c24b40ac9ede4836070d3c6e960f4c29efc3afa
Submitter: Zuul
Branch: master

commit 1c24b40ac9ede4836070d3c6e960f4c29efc3afa
Author: Tao Liu <email address hidden>
Date: Wed Jul 17 15:50:08 2019 -0400

    Fix the huge pages fields showing as pending

    The user expects the pending fields to be cleared after
    the huge page modification and unlock are complete.
    When sysinv-agent starts, it sends the host inventory
    update. If the worker manifest apply has not completed
    by then, the memory report is skipped. After the initial
    report, the memory update is triggered by the periodic
    audit. The audit runs every minute, but the memory
    report is only sent every 5 audit intervals. As a
    result, the huge page settings could still show as
    pending on the CLI/GUI after around 10 minutes (more
    noticeable on AIO), which led the user to believe
    something went wrong with the huge page allocation
    (although the huge pages have been allocated in Linux
    once the manifest is applied).

    After talking with John Kung, we decided to send the
    memory report at the normal audit interval (every other
    minute).

    Closes-Bug: 1830549

    Change-Id: Idf5067648031168078d99a5d84c8368cbd400508
    Signed-off-by: Tao Liu <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released