[OSSA-2024-003] Unvalidated image data passed to qemu-img (CVE-2024-44082)

Bug #2071740 reported by Jay Faulkner
278
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ironic
Fix Released
Critical
Julia Kreger
OpenStack Security Advisory
Fix Released
Critical
Brian Rosmaita
ironic-lib
Triaged
Critical
Jay Faulkner
ironic-python-agent
Fix Released
Critical
Jay Faulkner

Bug Description

It recently came to light that processing untrusted image data with qemu-based tools, such as qemu-img, is not expected to be secure.

ironic-lib's qemu_img module should validate images are as expected, and don't contain potentially harmful things such as QCOW2 backing file.

Additionally, we should avoid calling qemu-img without specifying the image type specifically as this is an additional layer of validation.

https://bugs.launchpad.net/nova/+bug/2059809 and associated patches can be used as an example of the potential fix we should do.

CVE References

tags: added: bug
Revision history for this message
Jay Faulkner (jason-oldos) wrote (last edit ):

After a conversation with Dan, it's clear that simply calling `qemu-img *` on any untrusted image can lead to data being exfiltrated remotely from that machine (over the network) OR data (provided over the network) being written to the machine running qemu-img.

We must take the format_inspector class from nova/glance and validate incoming images, use the inspection module to determine what image it is, then run the safety check against it to make sure it contains no scary stuff for *any* possible format . This is a high severity vulnerability.

information type: Public → Private Security
Revision history for this message
Julia Kreger (juliaashleykreger) wrote :
Download full text (4.4 KiB)

I also had a discussion with Dan in order to validate perceptions and ensure my understanding is correct.

In order to avoid breakages, I also think it is important to detail out the overall entry path for this data and how it is then picked up and used.

qemu-img convert is invoked in the following locations:

repository - path - Description - Further context

And any further notes required pertinent to the aforementioned item.

------

ironic - ironic/drivers/modules/ansible/playbooks/roles/deploy/tasks/write.yaml - example and default ansible deploy interface playbook - Source field is Ironic node instance_info field parameter "image_source". Executes on the remote host being deployed via a temporary ramdisk. It appears to leverage ironic's stock image logic used for the agent to prepare an accessible image, and execute qemu-img when the identified disk format != 'raw'. The code shortens the name from image_url to image.url as a parameter, so rooted in whatever ironic saves as a known image_url. image_disk_format should also be consulted.

------

ironic-python-agent - ironic_python_agent/extensions/standby.py - not an actual invocation - The command is formatted for logging purposes for the user, but the actual call is via ironic_lib.qemu_img module.

In this specific case, it should be noted that in If8fae8210d85c61abb85c388b300e40a75d0531c this was previously ironic_lib.disk_utils which is where the code was invoked.

In I0caa65561948f4e0934943a7a0d3a209701b5a59, dated May 26, 2021, (meaning after the wallaby branch, sometime in the xena branch), the code was changed from an external script to in-python code in ironic-lib.

It should be noted the partition image code path for the agent *always* invoked through ironic_lib as well with upfront identified image_info context.

These operations utilize the image transferred/downloaded the ImageDownload class where the file is staged on disk (unless already raw), and then called to the write-out sequence. This uses the same format path as the image_source logic -> image_url logic in the conductor, and thus independently needs to have guard logic applied.

----- End of list of non-ironic lib invocations

For the purposes of level settings: The overall flow is image_source is set to a file path on conductor, glance image uuid, or a URL to the image to write.

Then the deploy_interface driver triggers setting the state to facilitate deploy. Two call paths generally exist: Validation flows and then flows triggered through deployment actions where interfaces are called to trigger deployment and which actually attempts to access the image and it's contents.

This uniformly happens through deploy_utils.build_instance_info_for_deploy for the AnacondaDeploy interface, Agent deploy interface, and Ansible deployment interface.

This means conductor side, the obvious place to perform any extra logic in ironic itself is deploy_utils._cache_and_convert_image along with deploy_utils.build_instance_info_for_deploy which rewrites the user_info and enumerates available URLs such as those from swift storage.

This may mean we also need to have a mandatory retrieval of the image in the event an agent would nor...

Read more...

Revision history for this message
Jeremy Stanley (fungi) wrote :

I recommend also subscribing Dan to this bug, since he's very familiar with the malicious image mitigations that have been put into place in other OpenStack projects in recent years. Ideally, Ironic will take the same precautions that are planned for inclusion in the upcoming image inspector planned for Oslo.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

Dan has been consulting with us on this already, I added him to the bug for completeness.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

As a note: for the integrated Ironic case, we only use public glance images -- we believe this normally requires admin privs to mark an image as public? This should limit the blast radius for the integrated ironic case.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

I chatted with JayF some more today, and I've been making good progress on a fix for ironic itself, but it has raised a couple questions we need to ask ourselves.

But first, the approach.

For ironic, I've wired in validation logic which calls the image format inspector, triggers it's safety (rejects if it is false) check, records the image format which we detect, unwires the qemu-img info call and routes it to the image format inspector, compares that with what we are told by a user or glance (depending on flow && rejects the request if they don't match), all controlled by a giant knob.

This unfortunately has meant some extra wiring through the image cache and deployment utilities logic which all drivers use, but in the grand scheme it is not *that* bad.

If at any point in this flow things don't look right, we will return with an InvalidImage error which will log a more verbose error in the logs. Ironic conductor will also inspect any image we tell a node to deploy *as well* which adds more conductor load, but it does provide a greater level of saftey and abort the overall process far before anything gets to the node or an instance of qemu-img.

For inspector, I suspect we'll take a very similar approach, check/compare with image_inspector, and fail the write request if things don't pass validation.

This leaves the outstanding major question of ironic-lib and the fact we've centralized all qemu-img call logic there. The basic issue is today, we've got to add another argument to inform the qemu-img execution of the expected file format. If we carry the qemu-img calls in the backports *instead*, that saves us the need to backport anything on ironic-lib, in other words saves us a third of backports, and saves us needing to have compatibility logic which could mask the fix from being in place in any changes we backport. Jay and I are both leaning towards lifting the ironic-lib qemu-img call stuffs directly out and placing in the same patch stream. This opens a window for us to determine the correct path for ironic-lib's future, but that is outside of this issue's scope. The plus side is ironic-lib's docs say "this is only for use by ironic, do not use".

As for where I'm at right now, I likely have another day or two of fixing and writing unit tests in order to get us into a preliminary patch position for Ironic, and then I'll start on ironic-python-agent. The open question of the actual qemu-img calls will guide things from there.

I have a business trip next week which will impact the time I can focus on this next week, but we're concurrently working to also get CI in a healthy shape for stable and open unmaintained branches.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

I have a patch almost done for ironic, and the ironic-python-agent seems fairly straight forward to implement, and I should have patches up for both on Monday. The outstanding question is largely the how to handle the ironic-lib portion which impacts both patches. I'm anticipating to have that discussion with another core (Dmitry Tantsur) on Monday morning to help solidify the plan there.

In regards to ironic itself, I am a little worried the check logic might be a little bit *too* aggressive and possibly cause problems with retrieval of admins supplied kernels and ramdisks because I've wired this all into the image caching in ironic.

Once we have patches posted, we can likely start testing efforts since we're going to need to do some extensive testing to ensure we didn't inadvertently break anything else and have appropriate guards in place.

Revision history for this message
Dan Smith (danms) wrote :

Nova had a regression handling ISO files and the VMDK restrictions in place in format_inspector right now preclude the use of some VMDK files with an amended header (like a multi-session ISO). Both of these are either resolved now or have patches. I think it's not unexpected that tightening things down a lot will have some unintended fallout here, which is arguably better than the alternative.

We also put a knob on the deep inspection so that someone who is willing to accept the risk (or mitigates it with another strategy, like airgap or disabling new image uploads) has a way out in the meantime while the fallout gets cleaned up. I expect the same would be appropriate for ironic.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Okay, I just realized I have a little more work to do on the ironic side because I also have to patch out the Ansible deployment interface as well. In the grand scheme of things, not that much more.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

I had a chat with Dmitry Tantsur where he basically guessed we were looking at qemu. Initially he was very much into the model of patching ironic-lib, but upon discussion and explanation of the variety of issues with the full understanding that this is an embargoed issue, I think he is onboard with just carrying the convert call. He definitely is not a fan of the approach, but understands. I think that is kind of the same situation we're all in, since trying to route around and enable ironic-lib backports also complicates shipping a complete fix overall.

As for patches, I doub't I'll have patches up today, but might. Basically my flight got cancelled because the crew would have timed out and fallout from the massive issues late last week, so I'm now booked to fly tonight.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

First pass patch on a fix. Have not tested in a running environment, but passes unit tests/pep8.

It is a centralized fix for ironic so attempts get caught before IPA is engaged. We still need to patch IPA, but that is relatively trivial to do. This is fairly invasive and this ultimately doesn't directly fix ironic-lib, but ceases ironic from using the ironic lib convert image code.

affects: ironic-lib → ironic
Changed in ironic-lib:
status: New → Triaged
importance: Undecided → Critical
Changed in ironic-python-agent:
status: New → Triaged
importance: Undecided → Critical
Revision history for this message
Jay Faulkner (jason-oldos) wrote :

I have a concern after reviewing the code, but I'm unsure if we can resolve it or just document it -- AFAICT, the ansible driver does not use the image cache code, and so even though we do the safer thing of passing the image type to qemu-img, we aren't doing a safety check on that image. Given the ansible deploy driver is not intended for most use cases, we could document a limitation to only use it with trusted images and it'd potentially be fine?

That's the only super concerning thing I've seen while reviewing it. I'll do some testing in devstack with the bad images collection over the weekend to see if I can find a hole.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

I chatted with Jay Faulkner and pointed out where the code path for the ansible deploy interface does indeed invoke the code which drives the caching action. It is slightly confusing because in deploy_utils it is the preparation of instance_info which also triggers image download/caching operations.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

++ Confirmed my concern is dealt with.

The patch I'm about to post contains minimal changes:
- Rebased against latest-master Ironic
- Created patch with git format-patch so it can contain commit message + other metadata
- Fixed various minor issues preventing all tox environments (codespell) from passing
- Updated the release note to inform operators of the massive change in I/O use for conductors in swift-temp-url usage cases.

I'll then dedicate some time this weekend to getting a bad image through the fix. I'll update here if that testing ends up breaking something.

1 comments hidden view all 141 comments
Revision history for this message
Jay Faulkner (jason-oldos) wrote (last edit ):
Download full text (5.3 KiB)

I performed some testing against the above patch, using the following methodology:

On a local vm configured with the "Ironic w/Nova" devstack config, as documented here: https://docs.openstack.org/ironic/latest/contributor/devstack-guide.html#ironic-with-nova . After applying the above patch and restarting Ironic services, I used the attached script to setup Ironic for deployment and allow me to manually observe the results. Note that this does not do the proper network setup for a deployment to entirely succeed, but it does get them far enough to get the image cached which is sufficient for testing purposes. Below, when a failure or something interesting occurs, I will explain and provide log snippets if relevant.

The images used for test were provided to me by Dan Smith and I'm setting my success/failure expectations based on the self-documenting names. I hosted them on a local http server and exercised Ironic in standalone mode as to avoid any interaction with mitigations already merged in Glance/Nova. Each image was tested both with image_disk_format set to their actual format (iso, qcow2, vmdk, etc) and set to raw. An "OK" indicates in all cases the image was detected as expected and failed, or if it's a known good image it worked if expected.

OK bad-iso-with-qcow.iso
OK bad-qcow-with-backing.qcow2
OK bad-qcow-with-datafile.qcow2
OK bad-qed.qed
----

FAILED bad-vmdk-flat-expose.qcow2

Did not detect this as a bad image, both with image_disk_format set to qcow2 and set to raw. I'll note that the unexpected thing below is the *lack* of any messages indicating the image is dangerous, which were printed in the other cases.

image_disk_format=raw:

Jul 28 20:13:44 ubuntu ironic-conductor[858234]: DEBUG ironic.drivers.modules.deploy_utils [None req-960a4ca2-08ab-49d2-af1b-e51a80efd653 None None] Computed sha512 checksum for image /var/lib/ironic/images/cd4e6b83-0148-42c9-947e-ba00b2e44fe1/disk in 0.00 seconds, checksum value: feb82444c3e39ef0105128eaa5e06546f6265cb02b6b5274474d9d597dc891c19685d9c2f9d21211ffbb70ad46ef6c02f6d44720dd47ad0f7aae83924bd0de19.
Jul 28 20:13:44 ubuntu ironic-conductor[858234]: DEBUG ironic.drivers.modules.deploy_utils [None req-960a4ca2-08ab-49d2-af1b-e51a80efd653 None None] Start computing sha512 checksum for image /var/lib/ironic/images/cd4e6b83-0148-42c9-947e-ba00b2e44fe1/disk.
Jul 28 20:13:44 ubuntu ironic-conductor[858234]: DEBUG ironic.drivers.modules.deploy_utils [None req-960a4ca2-08ab-49d2-af1b-e51a80efd653 None None] Recalculating checksum for image /var/lib/ironic/images/cd4e6b83-0148-42c9-947e-ba00b2e44fe1/disk for node cd4e6b83-0148-42c9-947e-ba00b2e44fe1 due to image conversion
Jul 28 20:13:44 ubuntu ironic-conductor[858234]: DEBUG ironic.drivers.modules.image_cache [None req-960a4ca2-08ab-49d2-af1b-e51a80efd653 None None] Master cache hit for image http://nope:8000/bad-vmdk-flat-expose.qcow2
Jul 28 20:13:44 ubuntu ironic-conductor[858234]: DEBUG ironic.drivers.modules.deploy_utils [None req-960a4ca2-08ab-49d2-af1b-e51a80efd653 None None] Fetching image http://nope:8000/bad-vmdk-flat-expose.qcow2 for node cd4e6b83-0148-42c9-947e-ba00b2e44fe1

image_disk_format=qcow2:

Jul 28 20:07:42 ubuntu ironic...

Read more...

Revision history for this message
Dmitry Tantsur (divius) wrote :

As I told Julia the other day, I have serious concerns with duplicating not just ironic_lib.qemu_img but also a huge new module in both Ironic and IPA. I'm still not sure why this code cannot go to ironic-lib where it is now. At this rate, I'll probably -1 the patch once it's public.

Revision history for this message
Dmitry Tantsur (divius) wrote :

A positive side effect: doing it in ironic-lib will enforce that no access to qemu bypasses the security checks (ignoring the ansible deploy, which I"m not even sure we should fix given its nature). Judging by comment 16, it's still happening now.

Revision history for this message
Dan Smith (danms) wrote :

Jay, the format_inspector you're using doesn't support the monolithicFlat type of VMDK (since it's just a text file, of arbitrary length, can have things like comments which can make it very long, and because the other projects don't support that subtype). I've got patches up to the oslo version that now catch it, but they're based on other yet-to-be-merged infrastructure you won't have.

It is safe to call `qemu-img info` on that type as long as you use a full path to the image file, and that's what the other projects are currently relying on. We all have specific checks for VMDK files that only allow monolithicSparse and streamOptimized subtypes, based on the qemu-img output, and that behavior pre-dates this most recent CVE. Here's an example in nova:

https://github.com/openstack/nova/blob/master/nova/virt/images.py#L117

My recommendation would be that, unless you think you need to support booting a machine from a VMDK, that you just check the qemu-img output after you determine it's safe to run, and if it's a vmdk, reject it outright.

The point of the flat-expose test file is that running qemu-img convert on it will copy any file on the host machine into the target image (or worse if you don't provide the full path to the image file when you call convert).

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

We likely need to be careful with longer comments due to the way launchpad handles them, it makes it a bit harder to work with it.

Anyway, Jay, regarding your testing:

> FAILED bad-vmdk-flat-expose.qcow2

So, I'm guessing that in both cases, it was just allowed to try and detect it. I guess I'm wondering if the image format inspector code has gotten updates in the last week or so since it seems like we should have detected that. In both cases the file appears to have been downloaded based upon the logs. I guess we need to take a look around the checksum calculation code to make sure we don't have a way for a check to slip through.

----

> FAILED bad-vmdk-flat-expose.vmdk
[trim]
> When specifying raw, this passed image sanity checking as a raw image.

The bottom line issue with this case, is it appears that the image format inspection code doesn't properly identify the vmdk as a vmdk, but instead identifies it as a raw image which *is* permitted.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Dmitry, it is very much a scope problem, which is why in my opinion it is cleaner just to fix Ironic in one patch, and separately do the lightweight needful in IPA. This is based upon:

* We cannot execute qemu-img without passing in a source format to the call. Changing this alone, even guessing it on qemu-img calls exclusively is problematic for enforcing consistency and security, besides, as-is, this breaks ironic's unit testing which does mock the underlying execute call and verifies the format. So we're back to matching paired patches and logic for upgrade compatibility, which increases the overall risk to a fix because it increases the complexity to verify "fixed".

* To handle a separate fix in ironic lib or a dependent fix, we're going to have to drop in new logic into ironic anyhow because ideally we're passing in an argument, and we're still needing to do upfront format checking *anyway*.

* We cannot use the image info call in ironic lib anymore, anyhow, because it loads the drivers which can cause issues in qemu-img, which means we functionally have to patch the call's usage out in the dependencies of which removes half the usage in ironic-lib's qemu call copy

I suspect the best path is to talk through this in detail.

Revision history for this message
Dan Smith (danms) wrote :

Julia,

> The bottom line issue with this case, is it appears that the image format inspection code doesn't properly identify the vmdk as a vmdk, but instead identifies it as a raw image which *is* permitted.

If you are absolutely sure that when format_inspector fails to detect the subformats it doesn't support, the resulting "raw" gets passed to `qemu-img convert -f $fmt` then that too is a valid solution. However, if the innocent user tries it, they won't know why it's not working the way they expect (i.e. laying down a vmdk file directly on the disk, byte-for-byte).

In nova, we lose track of the originally-detected format as the image goes through our image cache, which makes this not a solution for us, just FYI.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Dan, Thanks for the comment in #19, that would definitely do it.

Could you point us to the patches your referring to so we can take a look?

I guess I'm really not a fan of the idea of trying to avoid one call all other cases, but then in one special then try to make a call to `qemu-img info` since that incurs risk as well, even if just in this specific case it does not.

Revision history for this message
Dan Smith (danms) wrote :

Here's the patch in the oslo series that makes this possible

https://review.opendev.org/c/openstack/oslo.utils/+/924512

As you can see, it needs more infrastructure changes in the FI itself, sits atop a bunch of other refactors , and while I have done some testing locally, it's definitely not proven out to the same degree as the rest of the stuff.

So if you're definitely going to run qemu-img with `-f $fmt`, where $fmt is the result of the format inspector (which will detect these as raw) you should be safe, you just won't reject that format visibly.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Ack, that is good to know, and realistically since it is just the descriptor file, that might make it on disk, but the machine would never, ever boot. So I guess it is likely okay.

The risk, AIUI, is if you could then snapshot that contents, but even if we had that supported as a thing, it would be in raw format, not as vmdk. So nothing to tickle anything but the raw driver as long as all qemu-img calls have a format defined.

Revision history for this message
Dan Smith (danms) wrote :

Yes, as long as you actually coerce the snapshot to be raw. That's probably an assumption ironic could make, but other services can't (and haven't and thus, had issues here).

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Indeed, since the starting point is always a raw block device. We don't have snapshotting functionality, but we know some folks who have implemented it and not shared patches with upstream since it is not a terribly complicated operation overall. I suspect this is likely something we should explicitly note in the release note.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :
Download full text (3.6 KiB)

Julia, Dmitry, and I met today in a video meeting to rapidly share context.

A few notes: Dmitry's concerns around if this code should live in ironic-lib or if we're OK patching Ironic/IPA only were resovled: we will be fine patching only IPA and Ironic.

Additionally, here are some notes for the meeting around the patch and the approach:

How to handle poisoned images in cache?
  * Ironic devs are OK with the cache getting cleared as time passed
  * Ironic devs should ensure the OSSA includes instructions to manually clear cache in paranoid situations

What changes are needed for each package?
* ironic-lib
  * Patch to remove qemu-img module, not to be backported

* Ironic
  * Patch with embedded image inspector to be written and backported
  * Post-disclosure, once ready, master branch will remove embedded image inspector module in favor of the one in oslo-utils
  * The current patch may circumvent image streaming in some cases, which could cause a significant performance hit, almost to the point of a DoS. Instead, we should avoid significantly changing conductor performance by default, while giving users and option to inspect images earlier at the expense of performance.
  * We need to add a value, that regardless of the setting of image_download_type, would force all images to be inspected on-conductor. This would be disabled by default.
    * conductor_always_validates_images = false (default)
      * For cases where we expect image streaming or image would've never touched conductor, trust IPA validation to catch and error
    * conductor_always_validates_images = true
      * Conductor downloads and security-checks all images by default
      * Major performance hit! Conductor DDoS possible with extreme-size raw images.
  * Ensure the description of "supported image types" config contains a list of images that we expect MAY work even though we don't support them explicitly

* IPA
  * Needs to independently inspect images that are passed through qemu-img for writing/conversion.
  * Cannot assume conductor has validated image is safe, due to various image streaming options

* IPA + Ironic
  * Ensure there are comments documenting:
    * to check IPA/Ironic code, anywhere we may be slightly duplicating code formerly-from-ironic-lib, instructing a dev to check the other similar code.
    * this code migrating to oslo.utils in future versions (this comment can be removed from backports)

When we eventually draft the OSSA, we want to mention the following:
  * Operators can manually purge image cache if they are concerned about existing cached images (note: by default this cache cleans every hour)
  * Ironic-lib is documented to not be supported for non-Ironic use cases. If anyone is using ironic-lib directly against our will: STOP. We are not patching qemu_img module in Ironic-Lib as part of this fix.
  * Information on how to report regressions / self-serve fixes (e.g. adding a new image type to allowlist)
  * Documentation telling users who refuse to update their IPA ramdisks (a behavior seen in the wild by many Ironic devs) to set conductor_always_validate_images to true.
  * Mention the new config blocking all but qcow2/raw images ...

Read more...

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Attaching an updated patch for Ironic. Includes documentation updates *as well*.

We will need to get a CVE number reserved to make final updates to this patch and the release note which has grown in length.

It changes the behavior from the first revision to align with the discussion notes JayF posted yesterday, and while the details are slightly different the basic meaning and purpose aligns with two different image inspection knobs to cover the distinct cases.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

This is an untested, rough draft patch for IPA. I'm primarily uploading it so it exists a place other than my laptop since I can't back it up in gerrit :).

Next actions will be updating unit tests and spinning this up in a devstack. Feel free to give this a high level review if you'd like.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Basic smoke testing with my latest patch against ironic *appears* to be generally okay when running the standalone test suite, although it also looks like it might have broken the ansible deployment interface. The test is still running and I'll need to extract the logs once completed to get a better idea of why it failed.

Stdout: 'ansible-playbook [core 2.17.2]\n config file = /opt/stack/ironic/ironic/drivers/modules/ansible/playbooks/ansible.cfg\n configured module search path = [\'/home/stack/.ansible/plugins/modules\', \'/usr/share/ansible/plugins/modules\']\n ansible python module location = /opt/stack/data/venv/lib/python3.10/site-packages/ansible\n ansible collection location = /home/stack/.ansible/collections:/usr/share/ansible/collections\n executable location = /opt/stack/data/venv/bin/ansible-playbook\n python version = 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0] (/opt/stack/data/venv/bin/python3.10)\n jinja version = 3.1.4\n libyaml = True\nUsing /opt/stack/ironic/ironic/drivers/modules/ansible/playbooks/ansible.cfg as config file\nsetting up inventory plugins\nLoading collection ansible.builtin from \nhost_list declined parsing /opt/stack/ironic/ironic/drivers/modules/ansible/playbooks/inventory as it did not pass its verify_file() method\nscript declined parsing /opt/stack/ironic/ironic/drivers/modules/ansible/playbooks/inventory as it did not pass its verify_file() method\nauto declined parsing /opt/stack/ironic/ironic/drivers/modules/ansible/playbooks/inventory as it did not pass its verify_file() method

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Okay, My ansible deployment interface failures *appear?* to be more rooted in the environment I'm testing in than anything else. Given my local environment has been through some various different stacking/restacking, there is a distinct possibility I've got the state a little broken for the more fragile drivers, and it does appear I'm having issues authenticating through to the ramdisk which is entirely unrelated.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

Ironic devs on the inside for this bug met this morning:
- Did a review of the rough draft IPA change
- Talked about strategies for backporting and configuration
- Shared some context about format_inspector with Dmitry

We are syncing again Monday morning, with a goal of having patches for master Ironic and IPA tested working by then, even if they need further (minor) revision.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Confirmed, it was my test environment.

$ tempest run --regex BaremetalDriverAnsibleWholedisk
{0} ironic_tempest_plugin.tests.scenario.ironic_standalone.test_basic_ops.BaremetalDriverAnsibleWholedisk.test_ip_access_to_server [294.144806s] ... ok

======
Totals
======
Ran: 1 tests in 294.1448 sec.
 - Passed: 1
 - Skipped: 0
 - Expected Fail: 0
 - Unexpected Success: 0
 - Failed: 0
Sum of execute time for each test: 294.1448 sec.

The plan right now in terms of smoke testing the master branch patch state is to walk through the scenario configurations and running them. So far, standalone jobs passes which is a really good sign.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

I've been able to successfully execute the following upstream tests which mix non-swift usage (standalone), swiftless glance usage, and usage with swift in integrated configurations. All of these were tested *with* and without the conductor_always_verifies_image option set.

ironic-standalone-redfish
ironic-standalone-anaconda
ironic-tempest-bios-redfish-pxe
ironic-tempest-ipa-wholeisk-bios-ipmi-direct-dib

The major variation in these jobs are largely a variety of scenarios and interfaces, so overall I have good confidence that the ironic patch, as-is, is not going to have any major issues when dropped into the gate (when we get to that point).

The team can discuss further test jobs, but we should repeat tests of the "direct" deployment interface (ironic-standalone-redfish and the tempest jobs will exercise this) when we have a IPA patch in a ready state.

Furthermore, I took Jay's script, modified slightly and repeated tests, and did discover an issue as related to disqualification of images based upon user supplied input of the image_disk_format compared to what we have received url content wise. I'll upload a new revision shortly, but the tl;dr is all of the expected failures failed, and all of the good images succeeded. In the script I'll also upload, I denote what is expected. I also tested against, from a good measure standpoint of additional images just to ensure we have our bases covered ultimately ensure the fix is in the best shape possible.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Attached is a revised patch for ironic, passes unit tests, pep8, and py3 testing locally. Started with 0002 just to signify a revision more than anything.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Attached is my test script I used to test the direct/standalone usage of the path and behavior where I exercised *both* local file download and the option to have the conductor to explicitly inspect all of the files. The expected behavior is also noted inline, and everything seems to work as noted.

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

For ironic specifically, I took a look at the backporting order and complexity as compared to the current branch. It doesn't backport cleanly considering it also has documentation, and there have been some refactoring. All of these notes are from a direct cherry-pick of the patch to each branch, to create the patches for each branch, we'll need to go in order, which should resolve a lot of the differences I see in a direct cherry-pick as we go backwards in time.

Branch Name - Notes
-------------------
bugfix/25.0 - conflicts, but relatively clean

stable/2024.1 - conflicts, but relatively clean

bugfix/24.0 - conflicts, less clean, conflicts with the move of disk_utils->qemu_img in ironic-lib.

bugfix 23.1 - conflicts, but appears to be the same as 24.0

stable/2023.2 - conflicts, but appears to be the same as 24.0 and 23.1

stable/2023.1 - conflicts, possibly a little bit more than 2023.2, but realistically thee same.

unmaintained/zed - conflicts, looks very similar to 2023.1, if not identical.

unmaintained/yoga - As-is, looks like a direct cherry-pick conflicts with docs and some fixes around Anaconda and redirect fixes. If bugfix/24.0 requires an hour, expect a two hours on this branch in large part just to sort out and correctly ensure we have applied the right logic/flow.

unmaintained/xena - About the same as Yoga, likely will be relatively clean from yoga based patch to xena.

unmaintained/wallaby - A bit more complexity compared to Xena, esp. in deploy_utils.build_instance_info_for_deploy. Overall, the main issue at this branch is the iscsi deploy driver which will require some additional checking/testing. If yoga takes 2 hours, expect this to take about eight hours.

unmaintained/victoria - This looks very similar to Wallaby.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

v2 of the IPA patch.

Julia is a bit ahead of me. I hope to be where she's at with Ironic by EOD Monday.

Code that exists is passing all tests, including some new tests I've already added and some bugfixes I've applied as a result of those tests.

Remaining todos:
# TODO(JayF): Unit tests for _image_inspection
# TODO(JayF): Unit tests for disabled deep image inspection
# TODO(JayF): Migrate qemu_img tests, add some for deep image inspection conf
# TODO(JayF): Migrate format_inspector tests, ensure I have latest copy
# TODO(JayF): Addl devstack testing

Revision history for this message
Julia Kreger (juliaashleykreger) wrote :
Download full text (3.8 KiB)

Dmitry, Jay, and I met this morning to discuss status and review the ironic patch specifically:

Notes/items to fix/amend/check.

security.rst
* Disk images in the docs - line 416, "bene" should be "been"
* typo: functioanlity in the 3rd item
* Add note to 443 in to indicate IPA will independently perform these checks on files downloaded and require conversion.
* check to see if we independently document image_download_source and fix

troubleshooting.rst:
* line 1267: Awkward text, needs revision: as in if the contents appears safe, or not.
* line 1282: We should note that final, late state check are also performed during the write process.
* line 1273: uploading spelling needs to be fixed.

configure-glance-images.rst
* line 15: permittedimage (need to be two words, missing space)

creating-images.rst.
* line 50: "image image"
* line 51: change confusion to use of this format may result in unexpected behavior.
* line 55: note blocked by default
* line 58: explicitly note this not based upon file extension.

General note for context around this, many of these checks are early, before starting provisioning.

release note:
* first item - "with a new image id"
* Third item: Add a note indicating IPA will perform additional checking.
* forth item - image formats permitted - should add reference to the configuration parameter
* ansible playbook item - change to "now supply an input"
* line 44: functionality spelling fix.
* line 50: make bolded text for at your own risk.
* line 61: "and *has* expressed as known working"
* Upgrade section: we need to add a thing... about the supported image list parameters.
* Upgrade section should be split into multiple distinct items.

conf/conductor.py
* line 438 or there about: Be more verbose about how this is super bad and can open the conductor to qemu-attacks.
* line 450: might need a comma in the help text.

images.py:

* line 431: fallback to qcow2 default should be get_source_format
* safety_check_image method, add a docstring
* line: 840: Change warning to error
* line 855: also change to error in check_if_format_is_permitted
General item for images.py, add _cls to parameters which are classes.

deploy_utils.py:
* fix docstring on fetch_images
* fetch_images: Add a future todo around the returned list that it should be improved at some point.
* cache_instance_image also needs a docstring update.
* change the FIXME on line 1203 to be *we explicitly delete for reasons like the cache really won't work beyond the single node in this case.

* line 1148: incomplete docstring as _invalid_image_Format.
* line 1182: change logging from debug to info
* line 1208 - Add notes around image_info and disk_format field generation so that we properly preserve the instance_info\image_disk_format to disk_format field as it is required for ipa and ansible deploy interfaces.
* line 1337: change were to "we are"
* line 1369: validate_results is getting the wrong field name

build_instance_info_for_deploy in deploy_utils.py
* Add notes around the if/else statements on the paths so reviewers/future editors can grok it.

image_cache.py
* line 373: change warning to error

To Check:
* see how expensive how image_format_inspecto...

Read more...

Changed in ossa:
importance: Undecided → Critical
status: New → In Progress
assignee: nobody → Brian Rosmaita (brian-rosmaita)
Changed in ironic:
status: Triaged → In Progress
assignee: nobody → Julia Kreger (juliaashleykreger)
Changed in ironic-python-agent:
status: Triaged → In Progress
assignee: nobody → Jay Faulkner (jason-oldos)
Changed in ironic-lib:
assignee: nobody → Jay Faulkner (jason-oldos)
summary: Hardening: don't run qemu-img with unvalidated image data
+ (CVE-2024-44082)
summary: - Hardening: don't run qemu-img with unvalidated image data
- (CVE-2024-44082)
+ Unvalidated image data passed to qemu-img (CVE-2024-44082)
information type: Private Security → Public Security
summary: - Unvalidated image data passed to qemu-img (CVE-2024-44082)
+ [OSSA-2024-003] Unvalidated image data passed to qemu-img
+ (CVE-2024-44082)
61 comments hidden view all 141 comments
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (bugfix/26.0)

Fix proposed to branch: bugfix/26.0
Review: https://review.opendev.org/c/openstack/ironic/+/927966

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (bugfix/25.0)

Fix proposed to branch: bugfix/25.0
Review: https://review.opendev.org/c/openstack/ironic/+/927967

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (stable/2024.1)

Fix proposed to branch: stable/2024.1
Review: https://review.opendev.org/c/openstack/ironic/+/927968

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (bugfix/24.0)

Fix proposed to branch: bugfix/24.0
Review: https://review.opendev.org/c/openstack/ironic/+/927969

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/ironic/+/927970

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/ironic/+/927972

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (unmaintained/zed)

Fix proposed to branch: unmaintained/zed
Review: https://review.opendev.org/c/openstack/ironic/+/927973

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic-python-agent (master)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (unmaintained/yoga)

Fix proposed to branch: unmaintained/yoga
Review: https://review.opendev.org/c/openstack/ironic/+/927975

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic-python-agent (stable/2024.1)

Fix proposed to branch: stable/2024.1
Review: https://review.opendev.org/c/openstack/ironic-python-agent/+/927976

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (unmaintained/xena)

Fix proposed to branch: unmaintained/xena
Review: https://review.opendev.org/c/openstack/ironic/+/927977

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic-python-agent (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/ironic-python-agent/+/927978

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic-python-agent (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/ironic-python-agent/+/927979

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (unmaintained/wallaby)

Fix proposed to branch: unmaintained/wallaby
Review: https://review.opendev.org/c/openstack/ironic/+/927980

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic-python-agent (bugfix/9.13)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (unmaintained/victoria)

Fix proposed to branch: unmaintained/victoria
Review: https://review.opendev.org/c/openstack/ironic/+/927982

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic-python-agent (bugfix/9.12)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic-python-agent (bugfix/9.9)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ironic-python-agent (unmaintained/zed)

Related fix proposed to branch: unmaintained/zed
Review: https://review.opendev.org/c/openstack/ironic-python-agent/+/927985

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ironic-python-agent (unmaintained/yoga)

Related fix proposed to branch: unmaintained/yoga
Review: https://review.opendev.org/c/openstack/ironic-python-agent/+/927986

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ironic-python-agent (unmaintained/xena)

Related fix proposed to branch: unmaintained/xena
Review: https://review.opendev.org/c/openstack/ironic-python-agent/+/927987

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ironic-python-agent (unmaintained/wallaby)

Related fix proposed to branch: unmaintained/wallaby
Review: https://review.opendev.org/c/openstack/ironic-python-agent/+/927988

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ironic-python-agent (unmaintained/victoria)

Related fix proposed to branch: unmaintained/victoria
Review: https://review.opendev.org/c/openstack/ironic-python-agent/+/927990

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ossa (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/ossa/+/927995

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ossa (master)

Reviewed: https://review.opendev.org/c/openstack/ossa/+/927995
Committed: https://opendev.org/openstack/ossa/commit/8d3e8bb3ae64e44b863ede07682952a2ef64e9b2
Submitter: "Zuul (22348)"
Branch: master

commit 8d3e8bb3ae64e44b863ede07682952a2ef64e9b2
Author: Brian Rosmaita <email address hidden>
Date: Fri Aug 23 18:17:28 2024 -0400

    Add OSSA-2024-003 (CVE-2024-44082)

    Closes-bug: 2071740
    Change-Id: I4595429e69834739dce2380a8a602f521271cb03

Changed in ossa:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic-python-agent (master)

Reviewed: https://review.opendev.org/c/openstack/ironic-python-agent/+/927974
Committed: https://opendev.org/openstack/ironic-python-agent/commit/e303a369dce6c4c5dd0402701b020888396406f3
Submitter: "Zuul (22348)"
Branch: master

commit e303a369dce6c4c5dd0402701b020888396406f3
Author: Jay Faulkner <email address hidden>
Date: Tue Jul 30 11:18:14 2024 -0700

    Inspect non-raw images for safety

    When IPA gets a non-raw image, it performs an on-the-fly conversion
    using qemu-img convert, as well as running qemu-img frequently to get
    basic information about the image before validating it.

    Now, we ensure that before any qemu-img calls are made, that we have
    inspected the image for safety and pass through the detected format.

    If given a disk_format=raw image and image streaming is enabled
    (default), we retain the existing behavior of not inspecting it in
    any way and streaming it bit-perfect to the device. In this case, we
    never use qemu-based tools on the image at all.

    If given a disk_format=raw image and image streaming is disabled, this
    change fixes a bug where the image may have been converted if it was not
    actually raw in the first place. We now stream these bit-perfect to the
    device.

    Adds two config options:
    - [DEFAULT]/disable_deep_image_inspection, which can be set to "True" in
      order to disable all security features. Do not do this.
    - [DEFAULT]/permitted_image_formats, default raw,qcow2, for image types
      IPA should accept.

    Both of these configuration options are wired up to be set by the lookup
    data returned by Ironic at lookup time.

    This uses a image format inspection module imported from Nova; this
    inspector will eventually live in oslo.utils, at which point we'll
    migrate our usage of the inspector to it.

    Closes-Bug: #2071740
    Change-Id: I5254b80717cb5a7f9084e3eff32a00b968f987b7

Changed in ironic-python-agent:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic-python-agent (stable/2024.1)

Reviewed: https://review.opendev.org/c/openstack/ironic-python-agent/+/927976
Committed: https://opendev.org/openstack/ironic-python-agent/commit/06fe5ff1782551e6f94640d47ea942ab81f18909
Submitter: "Zuul (22348)"
Branch: stable/2024.1

commit 06fe5ff1782551e6f94640d47ea942ab81f18909
Author: Jay Faulkner <email address hidden>
Date: Mon Mar 11 17:29:58 2024 +0100

    Inspect non-raw images for safety

    This is a backport of two changes merged together to facilitate
    backporting:

    The first is a refactor of disk utilities:

    Import disk_{utils,partitioner} from ironic-lib

    With the iscsi deploy long gone, these modules are only used in IPA and
    in fact represent a large part of its critical logic. Having them
    separately sometimes makes fixing issues tricky if an interface of
    a function needs changing.

    This change imports the code mostly as it is, just removing run_as_root and
    a deprecated function, as well as moving configuration options to config.py.

    Also migrates one relevant function from ironic_lib.utils.

    The second is the fix for the security issue:

    Inspect non-raw images for safety

    When IPA gets a non-raw image, it performs an on-the-fly conversion
    using qemu-img convert, as well as running qemu-img frequently to get
    basic information about the image before validating it.

    Now, we ensure that before any qemu-img calls are made, that we have
    inspected the image for safety and pass through the detected format.

    If given a disk_format=raw image and image streaming is enabled
    (default), we retain the existing behavior of not inspecting it in
    any way and streaming it bit-perfect to the device. In this case, we
    never use qemu-based tools on the image at all.

    If given a disk_format=raw image and image streaming is disabled, this
    change fixes a bug where the image may have been converted if it was not
    actually raw in the first place. We now stream these bit-perfect to the
    device.

    Adds two config options:
    - [DEFAULT]/disable_deep_image_inspection, which can be set to "True" in
      order to disable all security features. Do not do this.
    - [DEFAULT]/permitted_image_formats, default raw,qcow2, for image types
      IPA should accept.

    Both of these configuration options are wired up to be set by the lookup
    data returned by Ironic at lookup time.

    This uses a image format inspection module imported from Nova; this
    inspector will eventually live in oslo.utils, at which point we'll
    migrate our usage of the inspector to it.

    Closes-Bug: #2071740
    Co-Authored-By: Dmitry Tantsur <email address hidden>
    Change-Id: I5254b80717cb5a7f9084e3eff32a00b968f987b7

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic-python-agent (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/ironic-python-agent/+/927978
Committed: https://opendev.org/openstack/ironic-python-agent/commit/b7fa84dcc1284beef87480af8fd32784dd3a80f6
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit b7fa84dcc1284beef87480af8fd32784dd3a80f6
Author: Jay Faulkner <email address hidden>
Date: Mon Mar 11 17:29:58 2024 +0100

    Inspect non-raw images for safety

    This is a backport of two changes merged together to facilitate
    backporting:

    The first is a refactor of disk utilities:

    Import disk_{utils,partitioner} from ironic-lib

    With the iscsi deploy long gone, these modules are only used in IPA and
    in fact represent a large part of its critical logic. Having them
    separately sometimes makes fixing issues tricky if an interface of
    a function needs changing.

    This change imports the code mostly as it is, just removing run_as_root and
    a deprecated function, as well as moving configuration options to config.py.

    Also migrates one relevant function from ironic_lib.utils.

    The second is the fix for the security issue:

    Inspect non-raw images for safety

    When IPA gets a non-raw image, it performs an on-the-fly conversion
    using qemu-img convert, as well as running qemu-img frequently to get
    basic information about the image before validating it.

    Now, we ensure that before any qemu-img calls are made, that we have
    inspected the image for safety and pass through the detected format.

    If given a disk_format=raw image and image streaming is enabled
    (default), we retain the existing behavior of not inspecting it in
    any way and streaming it bit-perfect to the device. In this case, we
    never use qemu-based tools on the image at all.

    If given a disk_format=raw image and image streaming is disabled, this
    change fixes a bug where the image may have been converted if it was not
    actually raw in the first place. We now stream these bit-perfect to the
    device.

    Adds two config options:
    - [DEFAULT]/disable_deep_image_inspection, which can be set to "True" in
      order to disable all security features. Do not do this.
    - [DEFAULT]/permitted_image_formats, default raw,qcow2, for image types
      IPA should accept.

    Both of these configuration options are wired up to be set by the lookup
    data returned by Ironic at lookup time.

    This uses a image format inspection module imported from Nova; this
    inspector will eventually live in oslo.utils, at which point we'll
    migrate our usage of the inspector to it.

    Closes-Bug: #2071740
    Co-Authored-By: Dmitry Tantsur <email address hidden>
    Change-Id: I5254b80717cb5a7f9084e3eff32a00b968f987b7

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic-python-agent (bugfix/9.13)

Reviewed: https://review.opendev.org/c/openstack/ironic-python-agent/+/927981
Committed: https://opendev.org/openstack/ironic-python-agent/commit/9be29ad1dd1ce7ddeec1c6c4498b99d17fd42625
Submitter: "Zuul (22348)"
Branch: bugfix/9.13

commit 9be29ad1dd1ce7ddeec1c6c4498b99d17fd42625
Author: Jay Faulkner <email address hidden>
Date: Tue Jul 30 11:18:14 2024 -0700

    Inspect non-raw images for safety

    When IPA gets a non-raw image, it performs an on-the-fly conversion
    using qemu-img convert, as well as running qemu-img frequently to get
    basic information about the image before validating it.

    Now, we ensure that before any qemu-img calls are made, that we have
    inspected the image for safety and pass through the detected format.

    If given a disk_format=raw image and image streaming is enabled
    (default), we retain the existing behavior of not inspecting it in
    any way and streaming it bit-perfect to the device. In this case, we
    never use qemu-based tools on the image at all.

    If given a disk_format=raw image and image streaming is disabled, this
    change fixes a bug where the image may have been converted if it was not
    actually raw in the first place. We now stream these bit-perfect to the
    device.

    Adds two config options:
    - [DEFAULT]/disable_deep_image_inspection, which can be set to "True" in
      order to disable all security features. Do not do this.
    - [DEFAULT]/permitted_image_formats, default raw,qcow2, for image types
      IPA should accept.

    Both of these configuration options are wired up to be set by the lookup
    data returned by Ironic at lookup time.

    This uses a image format inspection module imported from Nova; this
    inspector will eventually live in oslo.utils, at which point we'll
    migrate our usage of the inspector to it.

    Closes-Bug: #2071740
    Change-Id: I5254b80717cb5a7f9084e3eff32a00b968f987b7

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic-python-agent (bugfix/9.12)

Reviewed: https://review.opendev.org/c/openstack/ironic-python-agent/+/927983
Committed: https://opendev.org/openstack/ironic-python-agent/commit/be8ee50ea1b0fbccf91ea4e4180af1f0e8154cdb
Submitter: "Zuul (22348)"
Branch: bugfix/9.12

commit be8ee50ea1b0fbccf91ea4e4180af1f0e8154cdb
Author: Jay Faulkner <email address hidden>
Date: Tue Jul 30 11:18:14 2024 -0700

    Inspect non-raw images for safety

    When IPA gets a non-raw image, it performs an on-the-fly conversion
    using qemu-img convert, as well as running qemu-img frequently to get
    basic information about the image before validating it.

    Now, we ensure that before any qemu-img calls are made, that we have
    inspected the image for safety and pass through the detected format.

    If given a disk_format=raw image and image streaming is enabled
    (default), we retain the existing behavior of not inspecting it in
    any way and streaming it bit-perfect to the device. In this case, we
    never use qemu-based tools on the image at all.

    If given a disk_format=raw image and image streaming is disabled, this
    change fixes a bug where the image may have been converted if it was not
    actually raw in the first place. We now stream these bit-perfect to the
    device.

    Adds two config options:
    - [DEFAULT]/disable_deep_image_inspection, which can be set to "True" in
      order to disable all security features. Do not do this.
    - [DEFAULT]/permitted_image_formats, default raw,qcow2, for image types
      IPA should accept.

    Both of these configuration options are wired up to be set by the lookup
    data returned by Ironic at lookup time.

    This uses a image format inspection module imported from Nova; this
    inspector will eventually live in oslo.utils, at which point we'll
    migrate our usage of the inspector to it.

    Closes-Bug: #2071740
    Change-Id: I5254b80717cb5a7f9084e3eff32a00b968f987b7

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.opendev.org/c/openstack/ironic/+/927965
Committed: https://opendev.org/openstack/ironic/commit/c996aafa6d2fb7cb90da6f6126bf385635cdf32e
Submitter: "Zuul (22348)"
Branch: master

commit c996aafa6d2fb7cb90da6f6126bf385635cdf32e
Author: Julia Kreger <email address hidden>
Date: Thu Aug 8 12:42:20 2024 -0700

    CVE-2024-44982: Harden all image handling and conversion code

    It was recently learned by the OpenStack community that running qemu-img
    on untrusted images without a format pre-specified can present a
    security risk. Furthermore, some of these specific image formats have
    inherently unsafe features. This is rooted in how qemu-img operates
    where all image drivers are loaded and attempt to evaluate the input data.
    This can result in several different vectors which this patch works to
    close.

    This change imports the qemu-img handling code from Ironic-Lib into
    Ironic, and image format inspection code, which has been developed by
    the wider community to validate general safety of images before converting
    them for use in a deployment.

    This patch contains functional changes related to the hardening of these
    calls including how images are handled, and updates documentation to
    provide context and guidance to operators.

    Closes-Bug: 2071740
    Change-Id: I7fac5c64f89aec39e9755f0930ee47ff8f7aed47
    Signed-off-by: Julia Kreger <email address hidden>

Changed in ironic:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (bugfix/26.0)

Reviewed: https://review.opendev.org/c/openstack/ironic/+/927966
Committed: https://opendev.org/openstack/ironic/commit/56ffe4f8ca7a7defaa45067f6a937f11d1305614
Submitter: "Zuul (22348)"
Branch: bugfix/26.0

commit 56ffe4f8ca7a7defaa45067f6a937f11d1305614
Author: Julia Kreger <email address hidden>
Date: Thu Aug 8 12:42:20 2024 -0700

    CVE-2024-44982: Harden all image handling and conversion code

    It was recently learned by the OpenStack community that running qemu-img
    on untrusted images without a format pre-specified can present a
    security risk. Furthermore, some of these specific image formats have
    inherently unsafe features. This is rooted in how qemu-img operates
    where all image drivers are loaded and attempt to evaluate the input data.
    This can result in several different vectors which this patch works to
    close.

    This change imports the qemu-img handling code from Ironic-Lib into
    Ironic, and image format inspection code, which has been developed by
    the wider community to validate general safety of images before converting
    them for use in a deployment.

    This patch contains functional changes related to the hardening of these
    calls including how images are handled, and updates documentation to
    provide context and guidance to operators.

    Closes-Bug: 2071740
    Change-Id: I7fac5c64f89aec39e9755f0930ee47ff8f7aed47
    Signed-off-by: Julia Kreger <email address hidden>

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (bugfix/25.0)

Reviewed: https://review.opendev.org/c/openstack/ironic/+/927967
Committed: https://opendev.org/openstack/ironic/commit/019aff28f14a99961c364df6dfb3369160d811fb
Submitter: "Zuul (22348)"
Branch: bugfix/25.0

commit 019aff28f14a99961c364df6dfb3369160d811fb
Author: Julia Kreger <email address hidden>
Date: Thu Aug 8 12:42:20 2024 -0700

    CVE-2024-44982: Harden all image handling and conversion code

    It was recently learned by the OpenStack community that running qemu-img
    on un-trusted images without a format pre-specified can present a
    security risk. Furthermore, some of these specific image formats have
    inherently unsafe features. This is rooted in how qemu-img operates
    where all image drivers are loaded and attempt to evaluate the input data.
    This can result in several different vectors which this patch works to
    close.

    This change imports the qemu-img handling code from Ironic-Lib into
    Ironic, and image format inspection code, which has been developed by
    the wider community to validate general safety of images before converting
    them for use in a deployment.

    This patch contains functional changes related to the hardening of these
    calls including how images are handled, and updates documentation to
    provide context and guidance to operators.

    Closes-Bug: 2071740

    master branch - ironic-master-branch-bug-2071740-20240820.patch
    bugfix/26.0 - ironic-bugfix-26.0-bug-2071740-20240820.patch

    CVE-2024-44982: Harden all image handling and conversion code

    It was recently learned by the OpenStack community that running qemu-img
    on un-trusted images without a format pre-specified can present a
    security risk. Furthermore, some of these specific image formats have
    inherently unsafe features. This is rooted in how qemu-img operates
    where all image drivers are loaded and attempt to evaluate the input data.
    This can result in several different vectors which this patch works to
    close.

    This change imports the qemu-img handling code from Ironic-Lib into
    Ironic, and image format inspection code, which has been developed by
    the wider community to validate general safety of images before converting
    them for use in a deployment.

    This patch contains functional changes related to the hardening of these
    calls including how images are handled, and updates documentation to
    provide context and guidance to operators.

    Closes-Bug: 2071740
    Change-Id: I7fac5c64f89aec39e9755f0930ee47ff8f7aed47
    Signed-off-by: Julia Kreger <email address hidden>

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (bugfix/24.0)

Reviewed: https://review.opendev.org/c/openstack/ironic/+/927969
Committed: https://opendev.org/openstack/ironic/commit/07bb2caf3c75cda2b7a20e956836c72cdadca926
Submitter: "Zuul (22348)"
Branch: bugfix/24.0

commit 07bb2caf3c75cda2b7a20e956836c72cdadca926
Author: Julia Kreger <email address hidden>
Date: Thu Aug 8 12:42:20 2024 -0700

    CVE-2024-44982: Harden all image handling and conversion code

    It was recently learned by the OpenStack community that running qemu-img
    on un-trusted images without a format pre-specified can present a
    security risk. Furthermore, some of these specific image formats have
    inherently unsafe features. This is rooted in how qemu-img operates
    where all image drivers are loaded and attempt to evaluate the input data.
    This can result in several different vectors which this patch works to
    close.

    This change imports the qemu-img handling code from Ironic-Lib into
    Ironic, and image format inspection code, which has been developed by
    the wider community to validate general safety of images before converting
    them for use in a deployment.

    This patch contains functional changes related to the hardening of these
    calls including how images are handled, and updates documentation to
    provide context and guidance to operators.

    Closes-Bug: 2071740
    Change-Id: I7fac5c64f89aec39e9755f0930ee47ff8f7aed47
    Signed-off-by: Julia Kreger <email address hidden>

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (stable/2024.1)

Reviewed: https://review.opendev.org/c/openstack/ironic/+/927968
Committed: https://opendev.org/openstack/ironic/commit/f7c7ea935ac5acfc22decbb51da316837b77e69b
Submitter: "Zuul (22348)"
Branch: stable/2024.1

commit f7c7ea935ac5acfc22decbb51da316837b77e69b
Author: Julia Kreger <email address hidden>
Date: Thu Aug 8 12:42:20 2024 -0700

    CVE-2024-44982: Harden all image handling and conversion code

    It was recently learned by the OpenStack community that running qemu-img
    on un-trusted images without a format pre-specified can present a
    security risk. Furthermore, some of these specific image formats have
    inherently unsafe features. This is rooted in how qemu-img operates
    where all image drivers are loaded and attempt to evaluate the input data.
    This can result in several different vectors which this patch works to
    close.

    This change imports the qemu-img handling code from Ironic-Lib into
    Ironic, and image format inspection code, which has been developed by
    the wider community to validate general safety of images before converting
    them for use in a deployment.

    This patch contains functional changes related to the hardening of these
    calls including how images are handled, and updates documentation to
    provide context and guidance to operators.

    Closes-Bug: 2071740
    Change-Id: I7fac5c64f89aec39e9755f0930ee47ff8f7aed47
    Signed-off-by: Julia Kreger <email address hidden>

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/ironic/+/927970
Committed: https://opendev.org/openstack/ironic/commit/6a7c1ce16056276c2dd347b42ebb7f097d86b282
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit 6a7c1ce16056276c2dd347b42ebb7f097d86b282
Author: Julia Kreger <email address hidden>
Date: Thu Aug 8 12:42:20 2024 -0700

    CVE-2024-44982: Harden all image handling and conversion code

    It was recently learned by the OpenStack community that running qemu-img
    on un-trusted images without a format pre-specified can present a
    security risk. Furthermore, some of these specific image formats have
    inherently unsafe features. This is rooted in how qemu-img operates
    where all image drivers are loaded and attempt to evaluate the input data.
    This can result in several different vectors which this patch works to
    close.

    This change imports the qemu-img handling code from Ironic-Lib into
    Ironic, and image format inspection code, which has been developed by
    the wider community to validate general safety of images before converting
    them for use in a deployment.

    This patch contains functional changes related to the hardening of these
    calls including how images are handled, and updates documentation to
    provide context and guidance to operators.

    Closes-Bug: 2071740
    Change-Id: I7fac5c64f89aec39e9755f0930ee47ff8f7aed47
    Signed-off-by: Julia Kreger <email address hidden>

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/ironic/+/927972
Committed: https://opendev.org/openstack/ironic/commit/188d43616121f8c6652c902ce6011cc6d42a9433
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 188d43616121f8c6652c902ce6011cc6d42a9433
Author: Julia Kreger <email address hidden>
Date: Thu Aug 8 12:42:20 2024 -0700

    CVE-2024-44982: Harden all image handling and conversion code

    It was recently learned by the OpenStack community that running qemu-img
    on un-trusted images without a format pre-specified can present a
    security risk. Furthermore, some of these specific image formats have
    inherently unsafe features. This is rooted in how qemu-img operates
    where all image drivers are loaded and attempt to evaluate the input data.
    This can result in several different vectors which this patch works to
    close.

    This change imports the qemu-img handling code from Ironic-Lib into
    Ironic, and image format inspection code, which has been developed by
    the wider community to validate general safety of images before converting
    them for use in a deployment.

    This patch contains functional changes related to the hardening of these
    calls including how images are handled, and updates documentation to
    provide context and guidance to operators.

    Closes-Bug: 2071740
    Change-Id: I7fac5c64f89aec39e9755f0930ee47ff8f7aed47
    Signed-off-by: Julia Kreger <email address hidden>

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (unmaintained/zed)

Reviewed: https://review.opendev.org/c/openstack/ironic/+/927973
Committed: https://opendev.org/openstack/ironic/commit/b4530e85b7754018584e1b364e772105a3605b24
Submitter: "Zuul (22348)"
Branch: unmaintained/zed

commit b4530e85b7754018584e1b364e772105a3605b24
Author: Julia Kreger <email address hidden>
Date: Thu Aug 8 12:42:20 2024 -0700

    CVE-2024-44982: Harden all image handling and conversion code

    It was recently learned by the OpenStack community that running qemu-img
    on un-trusted images without a format pre-specified can present a
    security risk. Furthermore, some of these specific image formats have
    inherently unsafe features. This is rooted in how qemu-img operates
    where all image drivers are loaded and attempt to evaluate the input data.
    This can result in several different vectors which this patch works to
    close.

    This change imports the qemu-img handling code from Ironic-Lib into
    Ironic, and image format inspection code, which has been developed by
    the wider community to validate general safety of images before converting
    them for use in a deployment.

    This patch contains functional changes related to the hardening of these
    calls including how images are handled, and updates documentation to
    provide context and guidance to operators.

    Closes-Bug: 2071740
    Change-Id: I7fac5c64f89aec39e9755f0930ee47ff8f7aed47
    Signed-off-by: Julia Kreger <email address hidden>

tags: added: in-unmaintained-zed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic-python-agent (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/ironic-python-agent/+/927979
Committed: https://opendev.org/openstack/ironic-python-agent/commit/a1da3c9322305a5540a23ca19048ce7aa552a405
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit a1da3c9322305a5540a23ca19048ce7aa552a405
Author: Jay Faulkner <email address hidden>
Date: Mon Mar 11 17:29:58 2024 +0100

    Inspect non-raw images for safety

    This is a backport of two changes merged together to facilitate
    backporting:

    The first is a refactor of disk utilities:

    Import disk_{utils,partitioner} from ironic-lib

    With the iscsi deploy long gone, these modules are only used in IPA and
    in fact represent a large part of its critical logic. Having them
    separately sometimes makes fixing issues tricky if an interface of
    a function needs changing.

    This change imports the code mostly as it is, just removing run_as_root and
    a deprecated function, as well as moving configuration options to config.py.

    Also migrates one relevant function from ironic_lib.utils.

    The second is the fix for the security issue:

    Inspect non-raw images for safety

    When IPA gets a non-raw image, it performs an on-the-fly conversion
    using qemu-img convert, as well as running qemu-img frequently to get
    basic information about the image before validating it.

    Now, we ensure that before any qemu-img calls are made, that we have
    inspected the image for safety and pass through the detected format.

    If given a disk_format=raw image and image streaming is enabled
    (default), we retain the existing behavior of not inspecting it in
    any way and streaming it bit-perfect to the device. In this case, we
    never use qemu-based tools on the image at all.

    If given a disk_format=raw image and image streaming is disabled, this
    change fixes a bug where the image may have been converted if it was not
    actually raw in the first place. We now stream these bit-perfect to the
    device.

    Adds two config options:
    - [DEFAULT]/disable_deep_image_inspection, which can be set to "True" in
      order to disable all security features. Do not do this.
    - [DEFAULT]/permitted_image_formats, default raw,qcow2, for image types
      IPA should accept.

    Both of these configuration options are wired up to be set by the lookup
    data returned by Ironic at lookup time.

    This uses a image format inspection module imported from Nova; this
    inspector will eventually live in oslo.utils, at which point we'll
    migrate our usage of the inspector to it.

    Closes-Bug: #2071740
    Co-Authored-By: Dmitry Tantsur <email address hidden>
    Change-Id: I5254b80717cb5a7f9084e3eff32a00b968f987b7

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (unmaintained/yoga)

Reviewed: https://review.opendev.org/c/openstack/ironic/+/927975
Committed: https://opendev.org/openstack/ironic/commit/4b45309628d67f72eea3e7fef72ff72eb1997b12
Submitter: "Zuul (22348)"
Branch: unmaintained/yoga

commit 4b45309628d67f72eea3e7fef72ff72eb1997b12
Author: Julia Kreger <email address hidden>
Date: Thu Aug 8 12:42:20 2024 -0700

    CVE-2024-44982: Harden all image handling and conversion code

    It was recently learned by the OpenStack community that running qemu-img
    on un-trusted images without a format pre-specified can present a
    security risk. Furthermore, some of these specific image formats have
    inherently unsafe features. This is rooted in how qemu-img operates
    where all image drivers are loaded and attempt to evaluate the input data.
    This can result in several different vectors which this patch works to
    close.

    This change imports the qemu-img handling code from Ironic-Lib into
    Ironic, and image format inspection code, which has been developed by
    the wider community to validate general safety of images before converting
    them for use in a deployment.

    This patch contains functional changes related to the hardening of these
    calls including how images are handled, and updates documentation to
    provide context and guidance to operators.

    Closes-Bug: 2071740
    Change-Id: I7fac5c64f89aec39e9755f0930ee47ff8f7aed47
    Signed-off-by: Julia Kreger <email address hidden>

tags: added: in-unmaintained-yoga
Displaying first 40 and last 40 comments. View all 141 comments or add a comment.
This report contains Public Security information  
Everyone can see this security related information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.