Emergency mode observed during Debian ISO installation

Bug #1979105 reported by Bob Church
This bug affects 1 person
Affects    Status        Importance  Assigned to  Milestone
StarlingX  Fix Released  Critical    Bob Church

Bug Description

Brief Description
-----------------
Occasionally, after ISO installation, the system halts at an emergency mode prompt instead of reaching the login prompt.

Severity
--------
Critical when it occurs.

Steps to Reproduce
------------------
Install the latest Debian ISO in a H/W lab.

Expected Behavior
------------------
Successful ISO boot/install/reboot/login prompt

Actual Behavior
----------------
ISO boot/install/reboot/emergency prompt

Reproducibility
---------------
Seen randomly

System Configuration
--------------------
AIO-SX

Branch/Pull Time/Commit
-----------------------
Numerous builds over the last 60 days; the issue occurs randomly

Last Pass
---------
N/A

Timestamp/Logs
--------------
N/A

Test Activity
-------------
Feature Testing/Designer Testing

Workaround
----------
Install the ISO again; a second attempt typically works

Tags: stx.7.0 stx
Bob Church (rchurch) wrote :

Will be proposing a potential fix. It needs validation.

description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to integ (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/integ/+/846453

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config-files (master)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tools (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/tools/+/846455

OpenStack Infra (hudson-openstack) wrote : Fix proposed to root (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/root/+/846456

OpenStack Infra (hudson-openstack) wrote : Fix merged to integ (master)

Reviewed: https://review.opendev.org/c/starlingx/integ/+/846453
Committed: https://opendev.org/starlingx/integ/commit/52537418e65bf68e0dcd86be8dd323f604df0f54
Submitter: "Zuul (22348)"
Branch: master

commit 52537418e65bf68e0dcd86be8dd323f604df0f54
Author: Robert Church <email address hidden>
Date: Fri Jun 17 21:59:33 2022 -0400

    Debian: Add lvm2 upstream fix for event_activation

    Historically, with CentOS, we had issues related to LV activation
    occurring in a non-deterministic way and causing an assortment of
    provisioning issues. More deterministic LV activation was achieved by
    setting use_lvmetad = 0 in /etc/lvm/lvm.conf

    In the migration to Debian, a much more recent version of the lvm2
    package is in use. In the intervening versions of lvm2, lvmetad was
    removed making use_lvmetad and the associated behavior now obsolete.

    In some current random testing scenarios, emergency mode is seen when
    booting the Debian ISO. Reviewing the systemd dump of services
    initiating when emergency mode occurs, it is observed that LV activation
    is occurring at a different time and order vs. a successful boot.

    The Debian lvm2 version provides the configuration parameter
    global/event_activation which, when set to 0, will change LV activation
    behavior when a PV appears. No noticeable change was observed when this
    variable is set.

    The current upstream version from Debian Bullseye is missing an lvm2
    upstream patch that should address this issue.

    Patch the Debian lvm2 version with this upstream patch to enable testing
    with this enabled.

    Test Plan:
    PASS - Build ISO, install/provision in AIO-SX virtual/hardware labs
    PASS - Perform numerous reboot cycles and observe no issues
    PASS - Test on H/W setup that has shown emergency mode behavior to
           confirm that this version and associated config file change
           resolved emergency mode PV/LV activation issues

    Change-Id: If22e446126f33c2155bd70988ed9b0444d230730
    Partial-Bug: #1979105
    Signed-off-by: Robert Church <email address hidden>
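
The commit message above describes comparing the order in which services start on a failing boot against a successful one. A minimal sketch of that kind of comparison follows, assuming the start order of each boot has already been extracted into a list. The unit names are hypothetical placeholders, not taken from the actual systemd dump.

```python
from itertools import zip_longest


def first_order_divergence(good_boot, bad_boot):
    """Return (index, good_unit, bad_unit) for the first position where the
    two start orders differ, or None when the orders are identical."""
    for index, (good, bad) in enumerate(zip_longest(good_boot, bad_boot)):
        if good != bad:
            return index, good, bad
    return None


# Hypothetical start orders; real data would come from the boot logs.
GOOD_BOOT = ["udev-settle.example", "lv-activation.example", "local-fs.example"]
BAD_BOOT = ["udev-settle.example", "local-fs.example", "lv-activation.example"]

print(first_order_divergence(GOOD_BOOT, BAD_BOOT))
```

In this made-up data the LV activation step runs after the local filesystem step on the failing boot, which is the kind of reordering the commit message reports.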

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to config-files (master)

Reviewed: https://review.opendev.org/c/starlingx/config-files/+/846454
Committed: https://opendev.org/starlingx/config-files/commit/191e65f85286589bd0529f6dea32046698691b83
Submitter: "Zuul (22348)"
Branch: master

commit 191e65f85286589bd0529f6dea32046698691b83
Author: Robert Church <email address hidden>
Date: Mon Jun 6 03:27:40 2022 -0400

    Debian: Prevent automatic LV event activation

    Historically, with CentOS, we had issues related to LV activation
    occurring in a non-deterministic way and causing an assortment of
    provisioning issues. More deterministic LV activation was achieved by
    setting use_lvmetad = 0 in /etc/lvm/lvm.conf

    In the migration to Debian, a much more recent version of the lvm2
    package is in use. In one of the intervening versions of lvm2, lvmetad
    was removed making use_lvmetad and the associated behavior now obsolete.

    In some current random testing scenarios, emergency mode is seen when
    booting the Debian ISO. Reviewing the systemd dump of services
    initiating when emergency mode occurs, it is observed that LV activation
    is occurring at a different time vs. a successful boot.

    The Debian lvm2 version provides the configuration parameter
    global/event_activation which when set to 0 will change LV activation
    behavior so that when a PV appears pvscan is not called.

    Test Plan:
    PASS - Build ISO, install/provision in AIO-SX virtual/hardware labs
    PASS - Perform numerous reboot cycles and observe no issues
    PASS - Test on H/W setup that has shown emergency mode behavior to
           confirm that this version and associated config file change
           resolved emergency mode PV/LV activation issues

    Signed-off-by: Robert Church <email address hidden>
    Depends-On: https://review.opendev.org/c/starlingx/integ/+/846453
    Closes-Bug: #1979105
    Change-Id: Ibae56a1da6135ea2f0a1fac76f1ce505a95588f5
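
The config-files change above turns on setting global/event_activation to 0 in /etc/lvm/lvm.conf. As a rough illustration, the sketch below reads that value out of lvm.conf-style text. The helper name `read_lvm_setting` and the sample excerpt are assumptions made for this example; the excerpt is not the file shipped by StarlingX, and the parser handles only simple `section { key = value }` layouts with `#` comments.

```python
import re


def read_lvm_setting(conf_text, section, key):
    """Return the raw value of section/key from lvm.conf-style text, or None.

    Simplified: assumes one 'name {' or 'key = value' per line and treats
    everything after '#' as a comment.
    """
    current = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        opened = re.fullmatch(r"(\w+)\s*\{", line)
        if opened:
            current = opened.group(1)
            continue
        if line == "}":
            current = None
            continue
        assigned = re.fullmatch(r"(\w+)\s*=\s*(.+)", line)
        if assigned and current == section and assigned.group(1) == key:
            return assigned.group(2).strip()
    return None


# Hypothetical excerpt for illustration only.
SAMPLE = """
global {
    # event_activation = 1
    event_activation = 0
}
"""

print("event_activation =", read_lvm_setting(SAMPLE, "global", "event_activation"))
```

A check like this could be run against the installed configuration file to confirm the setting is present before a reboot test cycle.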

Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Bob Church (rchurch)
OpenStack Infra (hudson-openstack) wrote : Fix merged to tools (master)

Reviewed: https://review.opendev.org/c/starlingx/tools/+/846455
Committed: https://opendev.org/starlingx/tools/commit/f1eac6566c4bdda4c5771d0e0ac81547fc8c4e9e
Submitter: "Zuul (22348)"
Branch: master

commit f1eac6566c4bdda4c5771d0e0ac81547fc8c4e9e
Author: Robert Church <email address hidden>
Date: Fri Jun 17 22:26:47 2022 -0400

    Debian: Add support for patching the LVM2 package

    This will enable patching the current upstream Debian LVM2 packages.

    This is needed to include an LVM2 upstream patch not contained in the
    Debian package version around event_activation which will control when
    LVs are activated when PVs appear.

    Test Plan:
    PASS - Build ISO, install/provision in AIO-SX virtual/hardware labs

    Change-Id: I2dae8e0766611a3864aab00fe478044e50f7b04f
    Depends-On: https://review.opendev.org/c/starlingx/integ/+/846453
    Partial-Bug: #1979105
    Signed-off-by: Robert Church <email address hidden>

OpenStack Infra (hudson-openstack) wrote : Fix merged to root (master)

Reviewed: https://review.opendev.org/c/starlingx/root/+/846456
Committed: https://opendev.org/starlingx/root/commit/a3afc6b45b674e8b10e46eaec15e614def169d1d
Submitter: "Zuul (22348)"
Branch: master

commit a3afc6b45b674e8b10e46eaec15e614def169d1d
Author: Robert Church <email address hidden>
Date: Fri Jun 17 21:44:24 2022 -0400

    Debian: Add lvm2 circular dependency resolver

    Now that we are patching the lvm2 package to pull in an upstream fix, a
    circular dependency was seen when attempting to build all the packages.
    Update circular_dep.conf to avoid the build failure.

    Test Plan:
    PASS - Build ISO, install/provision in AIO-SX virtual/hardware labs

    Change-Id: I3594e0d841ff6c32d757cf00801dcd769537b939
    Depends-On: https://review.opendev.org/c/starlingx/tools/+/846455
    Partial-Bug: #1979105
    Signed-off-by: Robert Church <email address hidden>
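
The commit above mentions a circular dependency that surfaced when building all packages. The format of circular_dep.conf is not shown in this report, so the sketch below does not model it. It only illustrates, with hypothetical package names, how a cycle in a build-dependency graph can be detected with a depth-first search.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of names, or None if acyclic.

    deps maps each package to the packages it depends on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state = {}
    path = []

    def visit(pkg):
        state[pkg] = GREY
        path.append(pkg)
        for dep in deps.get(pkg, ()):
            dep_state = state.get(dep, WHITE)
            if dep_state == GREY:
                # dep is already on the current path, so this edge closes a cycle
                return path[path.index(dep):] + [dep]
            if dep_state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[pkg] = BLACK
        return None

    for pkg in deps:
        if state.get(pkg, WHITE) == WHITE:
            found = visit(pkg)
            if found:
                return found
    return None


# Hypothetical package graph for illustration only.
DEPS = {"pkg-a": ["pkg-b"], "pkg-b": ["pkg-c"], "pkg-c": ["pkg-a"], "pkg-d": []}
print(find_cycle(DEPS))
```

Breaking any one edge in the reported cycle makes the graph buildable in dependency order again.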
