Duplicate configdrive while DHCP-less ramdisk clean/redeploy

Bug #2032377 reported by Fedor Tarasenko
This bug affects 1 person
Affects: Ironic
Status: In Progress
Importance: High
Assigned to: Julia Kreger
Milestone: (none)

Bug Description

Hi!

I am running the Ironic Zed release and trying to use DHCP-less ramdisk booting with network data from the config drive.
I use the simple-init element, and deployment works fine for me. It creates and boots the target image with 4 partitions (BSP, ESP, OS and config drive).

After that, no further actions are possible with the server: the next boot of the ironic-python-agent ISO sees 2 config drives and fails to pick the correct one.

[root@server devuser]# blkid -t LABEL="config-2"
/dev/sr0: UUID="2023-08-21-11-32-05-00" LABEL="config-2" TYPE="iso9660"
/dev/nvme0n1p4: UUID="2023-08-21-18-57-58-89" LABEL="config-2" TYPE="iso9660" PARTUUID="f293483a-328e-4d3b-87e2-695cbab1c95f"

This happens because simple-init (glean) looks up its config drive by the same LABEL that the OpenStack config drive of the target image uses, so both devices match.
I have not found a workaround.
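
To make the ambiguity concrete, here is a minimal Python sketch that parses blkid-style output and reports labels shared by more than one device. The sample lines are illustrative values modelled on the output above, and the helper names (parse_blkid_line, devices_by_label) are hypothetical; this is not code from Ironic or Glean.

```python
import shlex
from collections import defaultdict


def parse_blkid_line(line):
    """Split one line of blkid output into (device, {TAG: value})."""
    device, _, rest = line.partition(":")
    tags = {}
    # shlex handles the KEY="value" quoting used by blkid.
    for token in shlex.split(rest):
        key, sep, value = token.partition("=")
        if sep:
            tags[key] = value
    return device.strip(), tags


def devices_by_label(blkid_output):
    """Group device names by their LABEL tag."""
    groups = defaultdict(list)
    for line in blkid_output.splitlines():
        if not line.strip():
            continue
        device, tags = parse_blkid_line(line)
        if "LABEL" in tags:
            groups[tags["LABEL"]].append(device)
    return dict(groups)


# Illustrative sample only: a virtual media ISO and a leftover disk partition.
SAMPLE = (
    '/dev/sr0: UUID="2000-01-01-00-00-00-00" LABEL="config-2" TYPE="iso9660"\n'
    '/dev/sda4: UUID="2000-01-02-00-00-00-00" LABEL="config-2" TYPE="iso9660" '
    'PARTUUID="00000000-0000-0000-0000-000000000000"\n'
)

groups = devices_by_label(SAMPLE)
ambiguous = {label: devs for label, devs in groups.items() if len(devs) > 1}
print(ambiguous)
```

Any label that maps to more than one device cannot be resolved by a label-only lookup, which is the situation described in this report.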

1. https://github.com/openstack/ironic/blob/master/ironic/common/images.py#L172
2. https://opendev.org/opendev/glean/src/branch/master/glean/init/glean-early.sh#L37

Dmitry Tantsur (divius) wrote :

We discussed this issue during and after our weekly meeting, and this is the solution we're leaning towards:

1) Modify Glean [1] to optionally accept a hint for the IPA's configdrive in a new kernel parameter, e.g. glean-configdrive=LABEL=config-foo or glean-configdrive=UUID=bar (note: I'd rather avoid committing to using labels in case we figure out how to use UUIDs).

2) Modify Ironic's ISO building process [2] to generate a random label, e.g. config-<something> and pass it to Glean. This is going to be a breaking change, so we need to add it behind a flag with the intention to make it the default later.

[1] https://opendev.org/opendev/glean/src/branch/master/glean/init/glean-early.sh#L32
[2] https://opendev.org/openstack/ironic/src/commit/e4a5691d331c620f1e21d761c7de95e346f3effc/ironic/drivers/modules/image_utils.py#L425
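
The two proposed steps could look roughly like the following Python sketch. It only illustrates the idea: the helper names are hypothetical, glean-configdrive is the parameter name proposed in this comment and not an existing Glean option, and the 32-character limit reflects the size of the ISO 9660 volume identifier field.

```python
import secrets

# The ISO 9660 volume identifier field is 32 bytes long.
ISO9660_MAX_LABEL = 32


def random_configdrive_label(prefix="config-"):
    """Return a label such as 'config-1a2b3c4d' (hypothetical helper)."""
    label = f"{prefix}{secrets.token_hex(4)}"
    if len(label) > ISO9660_MAX_LABEL:
        raise ValueError(f"label {label!r} exceeds {ISO9660_MAX_LABEL} characters")
    return label


def glean_hint(label):
    """Build the kernel parameter proposed above (not an existing option)."""
    return f"glean-configdrive=LABEL={label}"


def append_kernel_param(kernel_params, param):
    """Append one parameter to a space-separated kernel command line."""
    return f"{kernel_params} {param}".strip()


label = random_configdrive_label()
print(append_kernel_param("nofb nomodeset", glean_hint(label)))
```

Because the label differs for every generated ISO, a config drive left on disk by an earlier deployment would no longer match the hint passed to the ramdisk.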

Changed in ironic:
status: New → Triaged
importance: Undecided → High
Changed in ironic:
status: Triaged → In Progress
Julia Kreger (juliaashleykreger) wrote :

Additionally, a diskimage-builder change is currently in review which should improve overall security when dealing with configuration drives. https://review.opendev.org/c/openstack/diskimage-builder/+/899886

Dmitry Tantsur (divius)
Changed in ironic:
assignee: nobody → Julia Kreger (juliaashleykreger)
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic-python-agent (master)

Reviewed: https://review.opendev.org/c/openstack/ironic-python-agent/+/895519
Committed: https://opendev.org/openstack/ironic-python-agent/commit/33f01fa3c2f32f447ed36f00fea68321c3991c2e
Submitter: "Zuul (22348)"
Branch: master

commit 33f01fa3c2f32f447ed36f00fea68321c3991c2e
Author: Julia Kreger <email address hidden>
Date: Fri Sep 15 15:03:30 2023 -0700

    Fix vmedia network config drive handling

    When performing DHCP-less deployments, the agent can start and
    discover more than one configuration drive present on a host.

    For example, a host was previously deployed using Ironic, and
    is now being re-deployed again.

    If Glean was present in the ramdisk, the glean-early.sh would end
    mounting the folder based upon label.

    If cloud-init, somehow is still in the ramdisk, the other folder
    could somehow get mounted.

    This patch, which is intended to be backportable, causes the agent
    to unmount any configuration drive folders, mount the most likely
    candidate based upon device type, partition, and overall state of
    the machine, and then utilize that configuration, if present,
    to re-configure and reload networking.

    Thus allowing dhcp-less re-deployments to be fixed without
    forcing any breaking changes.

    It should also be noted that this fix was generated in concert
    with an additional tempest test case, because this overall failure
    case needed to be reproduced to ensure we had a workable non-breaking
    path forward.

    Closes-Bug: 2032377
    Change-Id: I9a3b3dbb9ca98771ce2decf893eba7a4c1890eee
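
As a rough illustration of "mount the most likely candidate", the Python sketch below ranks config drive candidates so that a removable, whole-device medium (such as a virtual media CD) wins over a partition on a local disk. This is an assumed heuristic for illustration only, not the agent's actual logic, and the Candidate type and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A device carrying a config drive label (hypothetical structure)."""
    device: str
    is_partition: bool
    is_removable: bool


def rank(candidate):
    """Sort key: lower tuples are more likely to be the ramdisk's drive."""
    return (
        not candidate.is_removable,  # removable media first
        candidate.is_partition,      # whole devices before partitions
        candidate.device,            # stable tie-breaker
    )


def pick_configdrive(candidates):
    """Return the most likely candidate, or None if there are none."""
    if not candidates:
        return None
    return min(candidates, key=rank)


found = [
    Candidate("/dev/nvme0n1p4", is_partition=True, is_removable=False),
    Candidate("/dev/sr0", is_partition=False, is_removable=True),
]
print(pick_configdrive(found).device)
```

A heuristic like this needs no change to labels or kernel parameters, which matches the commit's stated goal of a backportable, non-breaking fix.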

Changed in ironic:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ironic-python-agent 9.9.0

This issue was fixed in the openstack/ironic-python-agent 9.9.0 release.

Changed in ironic:
status: Fix Released → In Progress
Julia Kreger (juliaashleykreger) wrote :

Re-opening since this got reverted.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to ironic (master)

Reviewed: https://review.opendev.org/c/openstack/ironic/+/915022
Committed: https://opendev.org/openstack/ironic/commit/fb850e7f005e0ef4b5c489b8c2b245791d0d33eb
Submitter: "Zuul (22348)"
Branch: master

commit fb850e7f005e0ef4b5c489b8c2b245791d0d33eb
Author: Julia Kreger <email address hidden>
Date: Wed Apr 3 12:56:57 2024 -0700

    Inject a randomized publisher id

    To serve as a mechanism to allow an interlocking device identification
    this patch injects a publisher id value into ISO images *and* the kernel
    command line for any software running from the ISO image to match
    the ISO in use to the location of data housed locally from within the
    image.

    Related-Bug: 2032377
    Change-Id: I9b74ec977fabc0a7f8ed6f113595a3f1624f6ee6
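
The interlock described in this commit message can be sketched in Python as follows. The sketch assumes a hypothetical kernel parameter name (example_pub_id) and assumes the publisher id of each candidate device has already been read by some other means; the real parameter name and lookup used by Ironic and the ramdisk are not shown in this report.

```python
import shlex
import uuid

# Hypothetical parameter name, chosen for illustration only.
PARAM = "example_pub_id"


def new_publisher_id():
    """Generate a random id to embed in the ISO and the command line."""
    return uuid.uuid4().hex


def with_publisher_id(kernel_params, publisher_id):
    """Append the id to a space-separated kernel command line."""
    return f"{kernel_params} {PARAM}={publisher_id}".strip()


def publisher_id_from_cmdline(cmdline):
    """Extract the id from a kernel command line, or None if absent."""
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        if sep and key == PARAM:
            return value
    return None


def matching_device(cmdline, publisher_ids_by_device):
    """Return the device whose publisher id matches the command line."""
    wanted = publisher_id_from_cmdline(cmdline)
    if wanted is None:
        return None
    for device, publisher_id in publisher_ids_by_device.items():
        if publisher_id == wanted:
            return device
    return None


pub_id = new_publisher_id()
cmdline = with_publisher_id("nofb nomodeset", pub_id)
seen = {"/dev/nvme0n1p4": "stale-id-from-old-deploy", "/dev/sr0": pub_id}
print(matching_device(cmdline, seen))
```

Since the id is generated per ISO, a config drive left behind by a previous deployment carries a different (or no) publisher id and is skipped even when its label is identical.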

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ironic (stable/2024.1)

Related fix proposed to branch: stable/2024.1
Review: https://review.opendev.org/c/openstack/ironic/+/917735

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ironic (stable/2023.2)

Related fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/ironic/+/917861

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ironic (stable/2023.1)

Related fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/ironic/+/917862
