STX: lab setup, drbd_fs_resizing time out

Bug #1812682 reported by Peng Peng
Affects: StarlingX
Status: Fix Released
Importance: Critical
Assigned to: Wei Zhou

Bug Description

Brief Description
-----------------
This issue occurs only during lab_setup on regular and storage systems. After running "system controllerfs-modify", the database filesystem state stays at "drbd_fs_resizing_in_progress" and the resize never completes.

Severity
--------
Critical

Steps to Reproduce
------------------
system controllerfs-modify backup=70 img-conversions=30 database=20

Expected Behaviour
------------------
drbd_fs_resizing completes

Actual Behaviour
----------------
The database filesystem state remains drbd_fs_resizing_in_progress

Reproducibility
---------------
Reproducible: 100% on regular and storage systems

System Configuration
--------------------
Multi-node system
Dedicated storage

Branch/Pull Time/Commit
-----------------------
master as of 2019-01-17_20-18-00

Timestamp/Logs
--------------
(2019-01-18 08:02:22) [INFO] [MainThread] Waiting for lab_setup.sh to complete
(2019-01-18 08:26:00) [INFO] [MainThread] echo [$?]
(2019-01-18 08:26:01) [INFO] [MainThread] Match: [1]

 RUNNING: system controllerfs-modify backup=70 img-conversions=30 database=20
+--------------------------------------+-----------------+-------------+--------------------+------------+------------------------------+
| UUID | FS Name | Size in GiB | Logical Volume | Replicated | State |
+--------------------------------------+-----------------+-------------+--------------------+------------+------------------------------+
| 8c1aaf52-0e57-4c44-9943-2a5fd38ce301 | backup | 70 | backup-lv | False | None |
| 32a2199d-5857-41b8-8220-e149b5ce1278 | database | 20 | pgsql-lv | True | drbd_fs_resizing_in_progress |
| 145a0f92-cc6b-413e-9abd-f6ebbbdd2829 | extension | 1 | extension-lv | True | None |
| 58bdbb52-5b4a-4bd5-a7ea-3ffb744ad87d | glance | 10 | cgcs-lv | True | None |
| 7ca596ff-5de8-437b-8ba1-129549574be8 | gnocchi | 5 | gnocchi-lv | False | None |
| 9598eb24-db67-4d55-9ed8-6425dac489d8 | img-conversions | 30 | img-conversions-lv | False | None |
| ab8282dc-5921-40cb-9321-504c2ac9e903 | scratch | 8 | scratch-lv | False | None |
+--------------------------------------+-----------------+-------------+--------------------+------------+------------------------------+

.Waiting for resizing to be done for filesystem database
Timed out waiting for drbd resizing
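
The wait that times out above can be pictured as a poll-until-clear loop with a deadline. The following is only a minimal sketch of that pattern, not the code in lab_setup.sh: the `get_state` callable, the function name, and the timeout and interval values are hypothetical stand-ins for however the state column of the table above is read.

```python
import time
from typing import Callable, Optional


def wait_for_resize(get_state: Callable[[], Optional[str]],
                    timeout_s: float = 600.0,
                    interval_s: float = 10.0) -> bool:
    """Poll a filesystem state until it clears or the deadline passes.

    get_state is a hypothetical callable returning the current state of
    the filesystem, e.g. "drbd_fs_resizing_in_progress", or None / "None"
    once the resize has finished.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state in (None, "None"):
            return True          # resize finished
        if time.monotonic() >= deadline:
            return False         # corresponds to "Timed out waiting for drbd resizing"
        time.sleep(interval_s)


if __name__ == "__main__":
    # Simulated state source: two polls in progress, then cleared.
    states = iter(["drbd_fs_resizing_in_progress",
                   "drbd_fs_resizing_in_progress",
                   "None"])
    print(wait_for_resize(lambda: next(states), timeout_s=5, interval_s=0))
```

In the failure reported here the state never clears, so a loop of this shape always reaches its deadline regardless of how long the timeout is.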

Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Wei Zhou (wzhou007)
Ghada Khalil (gkhalil) wrote :

Marking as release gating; issue causing install failures and blocking automation runs

Changed in starlingx:
importance: Undecided → Critical
status: New → Triaged
tags: added: stx.2019.05 stx.config
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-config (master)

Fix proposed to branch: master
Review: https://review.openstack.org/632150

Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-config (master)

Reviewed: https://review.openstack.org/632150
Committed: https://git.openstack.org/cgit/openstack/stx-config/commit/?id=a20de5f64c36f13766e299143e8a056a826947fb
Submitter: Zuul
Branch: master

commit a20de5f64c36f13766e299143e8a056a826947fb
Author: Wei Zhou <email address hidden>
Date: Mon Jan 21 14:14:37 2019 -0500

    DRBD Filesystem Resizing Stuck

    During controller filesystem resizing, the drbd filesystem sizes are
    not rounded correctly. This causes the resizing procedure to get stuck.

    Change-Id: Ie105714db8fd98e90c82e5c1ec72b1a1d75b8604
    Closes-Bug: 1812682
    Signed-off-by: Wei Zhou <email address hidden>
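
The commit message does not show the code change itself. As a purely illustrative sketch of how a rounding mismatch can make a completion check never succeed (this is an assumption, not the actual stx-config logic), consider a check that converts a reported device size to whole GiB before comparing it with the requested size. All function names and the reported size below are hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def size_gib_truncated(size_bytes: int) -> int:
    """Convert bytes to whole GiB by truncation (any fraction is dropped)."""
    return size_bytes // GIB


def size_gib_rounded(size_bytes: int) -> int:
    """Convert bytes to the nearest whole GiB."""
    return (size_bytes + GIB // 2) // GIB


def resize_done(requested_gib: int, reported_bytes: int, to_gib) -> bool:
    """Hypothetical completion check: has the device reached the requested size?"""
    return to_gib(reported_bytes) >= requested_gib


if __name__ == "__main__":
    # Hypothetical device that reports slightly less than the requested
    # 20 GiB, for example because a few MiB are reserved for metadata.
    reported = 20 * GIB - 4 * MIB
    print(resize_done(20, reported, size_gib_truncated))  # False: 19 < 20
    print(resize_done(20, reported, size_gib_rounded))    # True: rounds to 20
```

With the truncating conversion the check stays false on every poll, which would leave a state such as drbd_fs_resizing_in_progress set indefinitely.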

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-config (f/stein)

Reviewed: https://review.openstack.org/632809
Committed: https://git.openstack.org/cgit/openstack/stx-config/commit/?id=021d825d4df666e723b6538ecde22ed951152e11
Submitter: Zuul
Branch: f/stein

commit 6287c90d49bc8d68d935abaa28255e0bb59ddc01
Author: Daniel Badea <email address hidden>
Date: Wed Jan 23 15:12:17 2019 +0000

    fail to apply manifests when management ip is missing

    Commit to fix bug #1790159 causes retry handler to fail
    because of mismatched function arguments. Remove 'self'
    from retry handler and fix error message formatting.

    Change-Id: Iedeb41451acd0f32b944b49d45f0c4b30a79ebc2
    Closes-Bug: #1805678

commit 8976f659fb26a91949b927f3e33d11ca8f3a2232
Author: Teresa Ho <email address hidden>
Date: Wed Jan 23 09:14:03 2019 -0500

    Fix missing default route on worker nodes

    On the worker and storage nodes, the mgmt interface should be
    set to dhcp and the cluster host alias should be set to use static IP.
    With mgmt and cluster host interface sharing the same interface,
    the mgmt alias interface was incorrectly set to static instead of
    dhcp which causes the default route to be removed during host unlock.
    This commit is to set the address method of the alias interfaces
    correctly.

    Story: 2004273
    Task: 27826

    Change-Id: I6deee76a5ea25e7753bf0f53c499922bc5d66ec6
    Signed-off-by: Teresa Ho <email address hidden>

commit a0be71beaa137e9116ad55e55d0e5ab88db87e95
Author: Gerry Kopec <email address hidden>
Date: Wed Jan 9 07:18:29 2019 -0500

    Update nova helm overrides for cold migration

    Adds generation of public and private rsa ssh keys in nova overrides.
    These will be used by nova helm charts (see dependent commit) to fill
    appropriate files in all nova-compute pods in cluster. ssh keys are
    stored in sysinv db to maintain consistency.

    Also need to provide subnet used for ssh which will be cluster host
    network per recent commit (If6b918665131f01bc62687fbdc7978c5c103e3b7).

    Story: 2003909
    Task: 28925
    Depends-On: Id789ba051cec019e8b7564c713cf1b5296ecf9f6
    Change-Id: I13aa90b1204e698846d4402048b3ca7f544da551
    Signed-off-by: Gerry Kopec <email address hidden>

commit 74baed87deef6c5565581b71875aa44d07271663
Author: Steven Webster <email address hidden>
Date: Mon Dec 17 15:41:54 2018 -0500

    Enable configurable vswitch memory

    Currently, a DPDK enabled vswitch makes use of a fixed 1G hugepage to
    enable an optimized datapath.

    In the case of OVS-DPDK, this can cause an issue when changing the
    MTU of one or more interfaces, as a separate mempool is allocated
    for each size. If the minimal mempool size(s) cannot fit into the
    1G page, DPDK memory initialization will fail.

    This commit allows an operator to configure the amount of hugepage
    memory allocated to each socket on a host, which can enable
    jumboframe support for OVS-DPDK.

    The system memory command has been modified to accept vswitch
    hugepage configuration via the function flag. ie:

    system host-memory-modify -f vswitch -1G 4 <worker_name> ...

tags: added: in-f-stein
Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05