StarlingX

Bug #1830549
Comment #3

Tao Liu (tliu88) wrote on 2019-05-30:

The only issue here is that clearing the pending fields was delayed.

The huge pages are allocated after the worker manifest is applied, and the pending fields are cleared after the conductor receives the first memory update from the agent after unlock. In this case, the first memory update was delayed for the following reasons.

The sysinv agent normally starts 5 or 6 minutes after the host is unlocked/rebooted, and it sends an inventory update at startup. If the worker manifest apply has not completed by this time (determined by checking the .worker_config_complete flag file), the memory update is skipped, because the huge pages are allocated via the puppet manifest.
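The flag-file gate described above can be sketched roughly as follows. This is only an illustration, not the actual sysinv agent code: the function name and the `flag_dir` parameter are hypothetical, and only the `.worker_config_complete` file name comes from the description above.

```python
import os
import tempfile

WORKER_CONFIG_COMPLETE = ".worker_config_complete"  # flag file name from the comment


def should_send_memory_update(flag_dir):
    """Return True only once the worker manifest has been applied.

    flag_dir is a hypothetical parameter; the real location of the flag
    file is not stated in this comment.
    """
    return os.path.exists(os.path.join(flag_dir, WORKER_CONFIG_COMPLETE))


# Small demonstration using a temporary directory in place of the real one.
with tempfile.TemporaryDirectory() as demo_dir:
    before_manifest = should_send_memory_update(demo_dir)  # flag not present yet
    open(os.path.join(demo_dir, WORKER_CONFIG_COMPLETE), "w").close()
    after_manifest = should_send_memory_update(demo_dir)   # flag now present
```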

After that, the memory update is triggered by the periodic audit. The audit runs every minute, but the memory and LLDP reports are sent only every 5 audit intervals (agent update throttling, implemented in a previous release to address performance issues in large labs). The audit throttling results in a 5-minute delay after the first inventory report is sent, so it takes around 11 minutes in total after reboot for the pending fields to be cleared.
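The timing can be worked through with a rough model. This is a sketch under stated assumptions, not the agent's real logic: the function and constant names are made up for illustration, and it assumes the worker manifest finishes before the first throttled report is due.

```python
AUDIT_INTERVAL_MIN = 1   # the audit runs every minute
THROTTLE_INTERVALS = 5   # memory and LLDP reports go out every 5 audit intervals


def estimate_first_memory_update_min(agent_start_min, manifest_done_at_start):
    """Estimate minutes after unlock until the first memory update is sent.

    Hypothetical model: if the manifest is already applied when the agent
    starts, the startup inventory update carries the memory report.
    Otherwise the report waits for the next throttled audit.
    """
    if manifest_done_at_start:
        return agent_start_min
    return agent_start_min + THROTTLE_INTERVALS * AUDIT_INTERVAL_MIN


# Agent starts 6 minutes after unlock, manifest not yet applied.
delayed_case = estimate_first_memory_update_min(6, manifest_done_at_start=False)
print(delayed_case)  # 11
```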

In this scenario, the second lock was issued less than 10 minutes after the first unlock (see the command history below). Once the host was locked, the sysinv-conductor would not clear the pending fields.

2019-05-26T17:37:15.000 controller-0 -sh: info HISTORY: PID=617529 UID=1875 system host-lock controller-0
2019-05-26T17:37:19.000 controller-0 -sh: info HISTORY: PID=617529 UID=1875 system host-memory-modify controller-0 0 -1G 10 -f application
2019-05-26T17:37:28.000 controller-0 -sh: info HISTORY: PID=617529 UID=1875 system host-memory-modify controller-0 1 -1G 10 -f application
2019-05-26T17:37:37.000 controller-0 -sh: info HISTORY: PID=617529 UID=1875 system host-unlock controller-0
2019-05-26T17:47:01.000 controller-0 -sh: info HISTORY: PID=126741 UID=1875 system host-lock controller-0
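As a quick check of the gap between the unlock and the second lock, the two timestamps from the history above can be subtracted. The variable names are only for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%f"

# Timestamps copied from the host-unlock and second host-lock history lines.
unlock = datetime.strptime("2019-05-26T17:37:37.000", FMT)
second_lock = datetime.strptime("2019-05-26T17:47:01.000", FMT)

gap_seconds = int((second_lock - unlock).total_seconds())
minutes, seconds = divmod(gap_seconds, 60)
print(f"{minutes} min {seconds} s")  # 9 min 24 s
```

The gap is shorter than the roughly 11 minutes needed for the first memory update, so the host was locked again before the pending fields could be cleared.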