ceph runtime config is updated incorrectly with new uuid

Bug #1829090 reported by Bin Qian
This bug affects 2 people
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Ovidiu Poncea
Milestone: (none listed)

Bug Description

In a few places in conductor/manager.py, the runtime configuration update uses a new config uuid instead of the uuid that was set as the target uuid. The new uuid is generated randomly, which causes two issues:
1. If the update requires a reboot but the new uuid does not carry the reboot required bit, the config will look up to date even though the system is not.
2. If the update does not require a reboot but the new uuid carries the reboot required bit, the node becomes "config changed reboot required" until a reboot occurs.

The code snippet looks like:
        if utils.is_host_simplex_controller(active_controller):
            new_uuid = config_uuid
        else:
            new_uuid = str(uuid.uuid4())

        self._config_apply_runtime_manifest(context,
                                            config_uuid=new_uuid,
                                            config_dict=config_dict)
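
To illustrate why a random uuid is a problem here, the following is a minimal sketch, not the sysinv code. It assumes the reboot required flag is stored in the most significant bit of the 128-bit config uuid; the report only says the uuid "comes with" such a bit, so the bit position and all helper names below are hypothetical.

```python
import uuid

# Assumption: the reboot-required flag is the most significant bit of the
# 128-bit config uuid. The bug report does not state the actual position.
REBOOT_REQUIRED_BIT = 1 << 127


def is_reboot_required(config_uuid: str) -> bool:
    """Return True if the (assumed) flag bit is set in the uuid."""
    return bool(uuid.UUID(config_uuid).int & REBOOT_REQUIRED_BIT)


def set_reboot_required(config_uuid: str) -> str:
    """Return the same uuid with the flag bit set."""
    return str(uuid.UUID(int=uuid.UUID(config_uuid).int | REBOOT_REQUIRED_BIT))


def clear_reboot_required(config_uuid: str) -> str:
    """Return the same uuid with the flag bit cleared."""
    return str(uuid.UUID(int=uuid.UUID(config_uuid).int & ~REBOOT_REQUIRED_BIT))


def count_flagged(samples: int = 1000) -> int:
    """Count how many fresh uuid4 values happen to carry the flag bit."""
    return sum(is_reboot_required(str(uuid.uuid4())) for _ in range(samples))


if __name__ == "__main__":
    target = set_reboot_required(str(uuid.uuid4()))
    print("target flagged:", is_reboot_required(target))
    # The top bit of a uuid4 is random, so a fresh uuid carries the flag
    # some of the time, independent of what the target uuid says.
    print("flagged among 1000 random uuids:", count_flagged())
```

Under this assumption, replacing the target uuid with str(uuid.uuid4()) sets or drops the flag by chance, which matches both symptoms listed above.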

Bin Qian (bqian20)
description: updated
Changed in starlingx:
assignee: nobody → Ovidiu Poncea (ovidiu.poncea)
Ghada Khalil (gkhalil)
tags: added: stx.config
removed: stx-config
Ghada Khalil (gkhalil)
summary: - Update runtime config incorrectly with new uuid
+ ceph runtime config is updated incorrectly with new uuid
tags: added: stx.storage
Ghada Khalil (gkhalil) wrote :

Marking as release gating; appears related to the Ceph upversion.
The impact is that the system configuration is not applied properly (silent failure).

Changed in starlingx:
status: New → Triaged
importance: Undecided → High
tags: added: stx.2.0
Ovidiu Poncea (ovidiuponcea) wrote :

Besides the couple of places related to Ceph configuration, there is one more place with the same issue, when updating the security feature config:

    def update_security_feature_config(self, context):
        """Update the kernel options configuration"""
        personalities = constants.PERSONALITIES
        config_uuid = self._config_update_hosts(context, personalities, reboot=True)

        config_dict = {
            'personalities': personalities,
            'classes': ['platform::grub::runtime']
        }

        self._config_apply_runtime_manifest(context, config_uuid, config_dict, force=True)

Once the runtime manifests are executed, the reboot required flag gets cleared. To keep it, the solution is to apply the manifests with the same config_uuid but without the reboot flag:

    def update_security_feature_config(self, context):
        """Update the kernel options configuration"""
        personalities = constants.PERSONALITIES
        config_uuid = self._config_update_hosts(context, personalities, reboot=True)

        config_dict = {
            'personalities': personalities,
            'classes': ['platform::grub::runtime']
        }

        # Apply runtime config, keep reboot required flag.
        config_uuid = self._config_clear_reboot_required(config_uuid)
        self._config_apply_runtime_manifest(context, config_uuid, config_dict, force=True)
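
The effect of that change can be sketched with a toy model. This is not sysinv code: the ToyHost class, its status strings other than "config changed reboot required", and the position of the flag bit are all assumptions made for illustration. The model assumes a host looks up to date once the uuid it reports as applied equals its target uuid.

```python
import uuid
from dataclasses import dataclass

# Assumption: the reboot-required flag is the top bit of the config uuid.
REBOOT_REQUIRED_BIT = 1 << 127


def clear_reboot_required(config_uuid: str) -> str:
    """Return the same uuid with the (assumed) flag bit cleared."""
    return str(uuid.UUID(int=uuid.UUID(config_uuid).int & ~REBOOT_REQUIRED_BIT))


@dataclass
class ToyHost:
    """Hypothetical host record tracking target and applied config uuids."""
    config_target: str
    config_applied: str = ""

    def report_applied(self, config_uuid: str) -> None:
        # Models the host reporting back the uuid the manifest was applied with.
        self.config_applied = config_uuid

    def status(self) -> str:
        if self.config_applied == self.config_target:
            return "up to date"
        if uuid.UUID(self.config_target).int & REBOOT_REQUIRED_BIT:
            return "config changed reboot required"
        return "config out of date"


if __name__ == "__main__":
    target = "80000000-0000-0000-0000-00000000000a"  # flag bit set

    # Before the fix: the manifest is applied with the flagged target uuid,
    # so applied == target and the pending reboot is no longer visible.
    host = ToyHost(config_target=target)
    host.report_applied(target)
    print(host.status())

    # After the fix: the manifest is applied with the flag cleared, so the
    # applied uuid still differs from the target and the reboot stays pending.
    host = ToyHost(config_target=target)
    host.report_applied(clear_reboot_required(target))
    print(host.status())
```

In this model the target uuid is left untouched, so the reboot requirement remains attached to the host until something other than the runtime manifest clears it.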

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/660114

Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/660114
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=8fe1e43c34f27ede6629e100b9ce71ef07d37dd1
Submitter: Zuul
Branch: master

commit 8fe1e43c34f27ede6629e100b9ce71ef07d37dd1
Author: Ovidiu Poncea <email address hidden>
Date: Mon May 20 16:22:19 2019 +0300

    Fix reboot required getting reset or incorrectly set

    Two cases:
    1. Node is reboot required yet flag gets cleared when runtime
    manifests are executed.
    2. Node is not reboot required yet sometimes flag gets set as
    new, randomly generated, uuid comes with reboot require bit,
    then node becomes "config changed reboot required", until a
    reboot occurs.

    Change-Id: I1e69f9698b42a961db055b89da12b2d84421490a
    Closes-Bug: 1829090
    Signed-off-by: Ovidiu Poncea <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released