drbd resize needs to run after puppet manifest

Bug #1894003 reported by John Kung
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: John Kung

Bug Description

Brief Description
-----------------
drbd resize operations need to be run after the filesystem runtime manifest is applied.

An unrelated runtime operation may trigger the drbd resize operation before the filesystem runtime manifest has finished applying.

Severity
--------
Major: rare issue; however, it requires manual intervention to align filesystem sizes.

Steps to Reproduce
------------------
system controllerfs-modify dockerdistribution=32
system modify -p (or any other command that applies a runtime manifest, e.g. system dns-modify), issued before the first manifest has finished applying.

Expected Behavior
------------------
The sizes reported by drbd-overview and lvs for the updated filesystem should match.

Actual Behavior
----------------
The manifest apply resizes the logical volume (as shown by lvs); however, drbd-overview does not reflect
the increased size.

Reproducibility
---------------
Intermittent.
Potential race condition when other runtime configurations are being applied at the same time.
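
The race can be illustrated with a small simulation. This is a minimal sketch, not the sysinv implementation: the `Conductor` class, its methods, and the `FS_RUNTIME_CLASS` marker are hypothetical names introduced for illustration. Only the two keystone/firewall class names are taken from the logs in this report.

```python
# Hypothetical marker for the filesystem runtime manifest; not a real class name.
FS_RUNTIME_CLASS = "example::filesystem::runtime"


class Conductor:
    """Toy model of the resize trigger; all names are illustrative only."""

    def __init__(self, fixed):
        self.fixed = fixed            # True models the behaviour after the fix
        self.resize_pending = False
        self.lv_gib = 16
        self.drbd_gib = 16

    def request_fs_resize(self):
        # controllerfs-modify: record that a drbd resize will be needed.
        self.resize_pending = True

    def manifest_applied(self, classes, new_lv_gib=None):
        # The filesystem manifest is what actually grows the logical volume.
        if FS_RUNTIME_CLASS in classes and new_lv_gib is not None:
            self.lv_gib = new_lv_gib
        if not self.resize_pending:
            return
        # Buggy model: any applied config triggers the one-shot resize.
        # Fixed model: only the filesystem manifest does.
        if self.fixed and FS_RUNTIME_CLASS not in classes:
            return
        self.drbd_gib = self.lv_gib   # stands in for "drbdadm resize all"
        self.resize_pending = False


def run(fixed):
    c = Conductor(fixed)
    c.request_fs_resize()
    # An unrelated runtime manifest completes first.
    c.manifest_applied(["openstack::keystone::endpoint::runtime",
                        "platform::firewall::runtime"])
    # The filesystem manifest completes later and grows the lv to 32.
    c.manifest_applied([FS_RUNTIME_CLASS], new_lv_gib=32)
    return c.lv_gib, c.drbd_gib
```

Under these assumptions, `run(False)` returns `(32, 16)`, the mismatch described in this report, while `run(True)` returns `(32, 32)`.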

System Configuration
--------------------
duplex controllers

Branch/Pull Time/Commit
-----------------------
2020-08-05

Last Pass
---------
Intermittent issue

Timestamp/Logs
--------------

1. A request to increase docker-distribution to 32GB (from 16GB):

sysinv 2020-07-29 02:22:35.713 103995 INFO sysinv.api.hooks.auditor [req-535d537b-6757-46bb-9400-e0743d57e3b3 8289e83bf8d64048b397b0d3841afea2 397cf48491ba4966a0e867a1222c84a6] fd00:4888::8e22:765f:6121:eb47 "PUT /v1/isystems/e8e08c1e-9022-4e1f-81c4-b7207fd01103/controller_fs/update_many HTTP/1.0" status: 200 len: 0 time: 6.9437251091 POST: [[{u'path': u'/size', u'value': 32, u'op': u'replace'}, {u'path': u'/name', u'value': u'docker-distribution', u'op': u'replace'}]] host:[2607:f160:10:9232:ce:406:0:3000]:6385 user: admin tenant: admin domain: Default

This sets the condition requiring drbd resize.

2. However, another runtime manifest was applied, and the condition currently used to trigger the drbd resize (config up to date) was met:

sysinv 2020-07-29 02:26:13.052 102841 INFO sysinv.conductor.manager [-] https config applied {'classes': ['openstack::keystone::endpoint::runtime', 'platform::firewall::runtime'], 'force': False, 'personalities': ['controller'], 'host_uuids': [u'12178647-157e-45e3-821a-73baae1689a9']}

sysinv 2020-07-29 02:26:13.052 102841 INFO sysinv.conductor.manager [-] admin endpoint config applied {'classes': ['openstack::keystone::endpoint::runtime', 'platform::firewall::runtime'], 'force': False, 'personalities': ['controller'], 'host_uuids':

sysinv 2020-07-29 02:26:13.094 102841 INFO sysinv.conductor.manager [-] Performed drbdadm resize all

Note: this operation was performed before the filesystem manifest was applied, and it was not run again.

sysinv 2020-07-29 02:26:13.189 102841 INFO sysinv.conductor.manager [-] drbd-overview: pgsql-20.0, platform-9.8, extension-0.96875, dc-vault-0, etcd-4.8, dockerdistribution-16.0

sysinv 2020-07-29 02:26:13.190 102841 INFO sysinv.conductor.manager [-] lvdisplay: pgsql-20.0, platform-10.0, extension-1.0, dc-vault-0, etcd-5.0, dockerdistribution-16.0

sysinv 2020-07-29 02:26:13.250 102841 INFO sysinv.conductor.manager [-]

This 'drbdadm resize all' operation was observed only once in the logs, but it needs to run after the puppet manifest has applied the lv size change.
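
The two size reports logged at 02:26:13 can be compared mechanically. The following is a minimal sketch, not the check used by sysinv: the function names and the 10% tolerance are assumptions, the latter chosen only because the logged drbd sizes sit slightly below the lv sizes even when nothing is wrong.

```python
def parse_sizes(payload):
    """Turn 'pgsql-20.0, dc-vault-0' into {'pgsql': 20.0, 'dc-vault': 0.0}."""
    sizes = {}
    for item in payload.split(","):
        # Split on the last '-' so names such as 'dc-vault' stay intact.
        name, _, size = item.strip().rpartition("-")
        sizes[name] = float(size)
    return sizes


def lagging_drbd(drbd, lv, rel_tol=0.10):
    """Names whose drbd size trails the lv size by more than rel_tol (assumed)."""
    return sorted(
        name for name, lv_size in lv.items()
        if lv_size - drbd.get(name, 0.0) > rel_tol * lv_size
    )


# Payloads copied from the sysinv log lines above.
DRBD_LOG = ("pgsql-20.0, platform-9.8, extension-0.96875, dc-vault-0, "
            "etcd-4.8, dockerdistribution-16.0")
LV_LOG = ("pgsql-20.0, platform-10.0, extension-1.0, dc-vault-0, "
          "etcd-5.0, dockerdistribution-16.0")

drbd = parse_sizes(DRBD_LOG)
lv = parse_sizes(LV_LOG)

# At 02:26:13 the lv is still 16, so nothing appears to need resizing.
before = lagging_drbd(drbd, lv)

# Once puppet grows the lv to 32, drbd is left behind.
after = lagging_drbd(drbd, dict(lv, dockerdistribution=32.0))
```

Here `before` is an empty list and `after` is `['dockerdistribution']`, which is why a resize that runs ahead of the lv change has nothing to do.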

3. The puppet runtime manifest then applies the docker-distribution lv size change:

2020-07-29T02:28:23.732 Notice: 2020-07-29 02:28:23 +0000 /Stage[main]/Platform::Drbd::Dockerdistribution/Platform::Drbd::Filesystem[drbd-dockerdistribution]/Logical_volume[dockerdistribution-lv]/size: size changed '16G' to '32G'
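
The ordering problem is visible from the timestamps alone. A small sketch, assuming the sysinv and puppet logs use the same clock and timezone (the puppet line shows +0000; the sysinv line does not state one):

```python
from datetime import datetime

# Timestamp of "Performed drbdadm resize all" in the sysinv log.
resize_at = datetime.strptime("2020-07-29 02:26:13.094", "%Y-%m-%d %H:%M:%S.%f")

# Timestamp of the puppet "size changed '16G' to '32G'" notice.
lv_changed_at = datetime.strptime("2020-07-29T02:28:23.732", "%Y-%m-%dT%H:%M:%S.%f")

# Positive gap means the drbd resize ran before the lv was grown.
gap_seconds = (lv_changed_at - resize_at).total_seconds()
resize_ran_too_early = resize_at < lv_changed_at
```

Under that assumption the drbd resize ran roughly 130 seconds before the logical volume was grown.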

Test Activity
-------------
Regression Testing

Workaround
----------
For docker-distribution: fsck /dev/drbd8 (or, for duplex: drbdadm resize all; resize2fs /dev/<drbd>)
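
The duplex workaround could be wrapped as below. This is only a sketch: the helper names are hypothetical, the commands are the ones quoted above, and the report does not say which host they should be run on. It defaults to a dry run that prints the commands instead of executing them.

```python
import subprocess


def workaround_commands(drbd_device):
    """Return the duplex workaround from this report as argv lists."""
    return [
        ["drbdadm", "resize", "all"],   # grow drbd to match the resized lv
        ["resize2fs", drbd_device],     # then grow the filesystem on top of it
    ]


def apply_workaround(drbd_device, dry_run=True):
    for argv in workaround_commands(drbd_device):
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops at the first failing command.
            subprocess.run(argv, check=True)
```

For docker-distribution, `apply_workaround("/dev/drbd8")` prints the two commands; pass `dry_run=False` only on a system where running them is intended.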

Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → John Kung (john-kung)
importance: Undecided → Medium
tags: added: stx.5.0 stx.config stx.storage
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/749596
Committed: https://git.openstack.org/cgit/starlingx/stx-puppet/commit/?id=cfa70d82c1af1927721beda62173b65c77acc5e1
Submitter: Zuul
Branch: master

commit cfa70d82c1af1927721beda62173b65c77acc5e1
Author: John Kung <email address hidden>
Date: Wed Sep 2 17:41:04 2020 -0400

    Update drbd resize operation to run after puppet manifest

    There is a potential race condition in triggering the filesystem
    resize check. The resize check is triggered on the start of
    a controller filesystem drbd resize operation, however, if another
    configuration is applied before the resize check may be early,
    in which case the lv would be resized but not drbd.

    In the drbd manifest update to trigger the drbd filesystem resize
    check after the lv filesystem have been resized.

    Change-Id: I97038d0cc9d57ab639f85c2282677364b44249c0
    Closes-Bug: 1894003
    Signed-off-by: John Kung <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/749600
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=bfdefeec1b397ad5c71b69034c3c70198fbe99b6
Submitter: Zuul
Branch: master

commit bfdefeec1b397ad5c71b69034c3c70198fbe99b6
Author: John Kung <email address hidden>
Date: Wed Sep 2 17:52:22 2020 -0400

    Update drbd resize operation to run after puppet manifest

    There is a potential race condition in triggering the filesystem
    resize check. The resize check is triggered on the start of
    a controller filesystem drbd resize operation, however, if another
    configuration is applied before the resize check may be early,
    in which case the lv would be resized but not drbd.

    In the drbd manifest update to trigger the drbd filesystem resize
    check after the lv filesystem have been resized.

    Change-Id: Idd0629f77c414a79cc1969ff1110d5dbd3548cda
    Closes-Bug: 1894003
    Depends-On: https://review.opendev.org/#/c/749596/
    Signed-off-by: John Kung <email address hidden>

OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (f/centos8)

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/762919
