[OSSA-2019-003] Nova Server Resource Faults Leak External Exception Details (CVE-2019-14433)

Bug #1837877 reported by Donny Davis on 2019-07-25
Affects (Importance, Assigned to):
  OpenStack Compute (nova): High, Matt Riedemann
    Ocata: High, Matt Riedemann
    Pike: High, Matt Riedemann
    Queens: High, Matt Riedemann
    Rocky: High, Matt Riedemann
    Stein: High, Matt Riedemann
  OpenStack Security Advisory: High, Jeremy Stanley

Bug Description

This issue is being treated as a potential security risk under embargo. Please do not make any public mention of embargoed (private) security vulnerabilities before their coordinated publication by the OpenStack Vulnerability Management Team in the form of an official OpenStack Security Advisory. This includes discussion of the bug or associated fixes in public forums such as mailing lists, code review systems and bug trackers. Please also avoid private disclosure to other individuals not already approved for access to this information, and provide this same reminder to those who are made aware of the issue prior to publication. All discussion should remain confined to this private bug report, and any proposed fixes should be added to the bug as attachments.

It would appear Nova is revealing information that may be sensitive in error messages:

http://lists.openstack.org/pipermail/openstack-infra/2019-July/006426.html

I attempted to hard-reboot it, and it went into an error state. The
initial error in the server status was

 {'message': 'Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreateWithFlags)', 'code': 500, 'created': '2019-07-25T07:25:25Z'}

After a short period, I tried again and got a different error state

 {'message': "internal error: process exited while connecting to monitor: lc=,keyid=masterKey0,iv=jHURYcYDkXqGBu4pC24bew==,format=base64 -drive 'file=rbd:volumes/volume-41553c15-6b12-4137-a318-7caf6a9eb44c:id=cinder:auth_supported=cephx\\;none:mon_host=172.24.0.56\\:6789", 'code': 500, 'created': '2019-07-25T07:27:21Z'}

I don't know if this is a setting or a bug. Better to report and close than not say anything I guess.

CVE References: CVE-2019-14433

Jeremy Stanley (fungi) wrote :

Since this report concerns a possible security risk, an incomplete security advisory task has been added while the core security reviewers for the affected project or projects confirm the bug and discuss the scope of any vulnerability along with potential solutions.

Changed in ossa:
status: New → Incomplete
description: updated
Jeremy Stanley (fungi) wrote :

Is this message being logged at DEBUG level? Or higher?

Jeremy Stanley (fungi) wrote :

Oh, wait, this was in a response from the API?

Donny Davis (donny-g) wrote :

I don't know, ianw posted it from an opendev outage yesterday. Can we add Mohammed Naser to this thread as well? I am pretty sure he would want to follow.

http://lists.openstack.org/pipermail/openstack-infra/2019-July/006426.html

Jeremy Stanley (fungi) wrote :

Yes, I've subscribed him just now and will also give him a heads-up via IRC privmsg.

Mohammed Naser (mnaser) wrote :

This is indeed an API response. I have researched this and I think we're safe (from a provider perspective):

{'message': "internal error: process exited while connecting to monitor: lc=,keyid=masterKey0,iv=jHURYcYDkXqGBu4pC24bew==,format=base64 -drive 'file=rbd:volumes/volume-41553c15-6b12-4137-a318-7caf6a9eb44c:id=cinder:auth_supported=cephx\\;none:mon_host=172.24.0.56\\:6789", 'code': 500, 'created': '2019-07-25T07:27:21Z'}

There are two concerns here:

"lc=,keyid=masterKey0,iv=jHURYcYDkXqGBu4pC24bew==,format=base64"
There is a generated key for every single VM booted inside QEMU; this points to that specific key ID and includes the initialization vector (iv). It does not include the key that decrypts the cephx secret.

"file=rbd:volumes/volume-41553c15-6b12-4137-a318-7caf6a9eb44c:id=cinder:auth_supported=cephx\\;none:mon_host=172.24.0.56\\:6789"
This actually is a bit more concerning because it leaks out the IP address of the Ceph monitor. In our case, it lives in an entirely different network that cannot be accessed but that's not the case for everyone...

"Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreateWithFlags)"
This is a libvirt exception that is getting bubbled up.

IMHO: We should *never* leak exceptions that bubble up in this way to the API layer, especially to a non-admin user.
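The leak described above can be illustrated with a minimal, hypothetical sketch (none of these names are nova code): when a fault is built directly from str(exc), the user-visible message carries whatever the lower layer embedded in the exception text.

```python
# Hypothetical illustration (not nova code): building a server fault
# from a raw hypervisor exception leaks whatever str(exc) contains.

class LibvirtError(Exception):
    """Stand-in for libvirt.libvirtError."""

def build_fault(exc):
    # The leaky pattern: the exception text goes straight into the
    # user-visible 'message' field of the fault.
    return {'message': str(exc), 'code': 500}

try:
    raise LibvirtError(
        "internal error: process exited while connecting to monitor: "
        "... mon_host=172.24.0.56:6789")
except LibvirtError as exc:
    fault = build_fault(exc)

# The Ceph monitor IP is now visible to the server's owner.
print('172.24.0.56' in fault['message'])  # True
```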

Mohammed Naser (mnaser) wrote :

I'm going to dig through the code where this happened and what caused this result. I'll discuss privately with some of the Nova cores and ask them to be added here if they're interested.

Mohammed Naser (mnaser) wrote :

So:

nova/virt/libvirt/guest.py
===========
    def launch(self, pause=False):
        """Starts a created guest.

        :param pause: Indicates whether to start and pause the guest
        """
        flags = pause and libvirt.VIR_DOMAIN_START_PAUSED or 0
        try:
            return self._domain.createWithFlags(flags)
        except Exception:
            with excutils.save_and_reraise_exception():
                LOG.error('Error launching a defined domain '
                          'with XML: %s',
                          self._encoded_xml, errors='ignore')
===========

This bubbles up any exceptions from libvirt. The function that calls it:

nova/virt/libvirt/guest.py
===========
    def _create_domain(self, xml=None, domain=None,
                       power_on=True, pause=False, post_xml_callback=None):
        """Create a domain.

        Either domain or xml must be passed in. If both are passed, then
        the domain definition is overwritten from the xml.

        :returns guest.Guest: Guest just created
        """
        if xml:
            guest = libvirt_guest.Guest.create(xml, self._host)
            if post_xml_callback is not None:
                post_xml_callback()
        else:
            guest = libvirt_guest.Guest(domain)

        if power_on or pause:
            guest.launch(pause=pause)

        if not utils.is_neutron():
            guest.enable_hairpin()

        return guest
===========

This handles no exceptions either, so the exception propagates up to _create_domain_and_network(), which has a clause that catches all generic exceptions:

===========
        except Exception:
            # Any other error, be sure to clean up
            LOG.error('Failed to start libvirt guest', instance=instance)
            with excutils.save_and_reraise_exception():
                self._cleanup_failed_start(context, instance, network_info,
                                           block_device_info, guest,
                                           destroy_disks_on_failure)
===========

but that still bubbles up. In this case it propagates to _hard_reboot(), which has no exception handling, and is called by reboot(), which also has none. At this point we've sent a *libvirt* exception up to the compute manager, more specifically to reboot_instance(), which is decorated by the following:

===========
    @wrap_exception()
    @reverts_task_state
    @wrap_instance_event(prefix='compute')
    @wrap_instance_fault
===========

- wrap_exception: this seems to emit notifications on exceptions
- reverts_task_state: pretty self-explanatory
- wrap_instance_event: this seems like it adds an instance action to the log
- wrap_instance_fault:

Jeremy Stanley (fungi) wrote :

I don't object to subscribing more Nova developers, but note that I've already subscribed the Nova core security reviewers, so some of the folks you're thinking of could already be covered by that.

Mohammed Naser (mnaser) wrote :

I can't edit that and hit enter too quickly so I'll keep going:

- wrap_exception: this seems to emit notifications on exceptions
- reverts_task_state: pretty self-explanatory
- wrap_instance_event: this seems like it adds an instance action to the log
- wrap_instance_fault: this is the culprit; it seems to catch exceptions and set a fault on the VM (which is where this was exposed)

We'll have to decide where the issue was here, but IMHO, taking an unhandled exception and exposing it to the user directly can be pretty dangerous.
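The pattern being described can be sketched with a simplified decorator and a dict-based instance (assumed names for illustration; the real wrap_instance_fault in nova/compute/manager.py is more involved):

```python
import functools

# Simplified sketch (assumed names, not the real nova decorator) of how
# a wrap_instance_fault-style decorator catches any exception, records
# a fault on the instance, and re-raises.

def wrap_instance_fault(func):
    @functools.wraps(func)
    def wrapper(self, context, instance, *args, **kwargs):
        try:
            return func(self, context, instance, *args, **kwargs)
        except Exception as exc:
            # The stringified exception becomes the user-visible fault
            # message before the exception is re-raised.
            instance.setdefault('faults', []).append(
                {'message': str(exc), 'code': 500})
            raise
    return wrapper

class Manager:
    @wrap_instance_fault
    def reboot_instance(self, context, instance):
        # Simulate a libvirt error bubbling up from the driver.
        raise RuntimeError("internal error: mon_host=10.0.0.1")

instance = {}
try:
    Manager().reboot_instance(None, instance)
except RuntimeError:
    pass

# The raw driver error text is now stored on the instance fault.
print(instance['faults'][0]['message'])
```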

Mohammed Naser (mnaser) wrote :

I'm asking a few nova-cores in private if they have some time to go over this, and at this point to decide on the best way to fix this without playing whack-a-mole on all the possible exceptions a driver can raise.

Matt Riedemann (mriedem) wrote :

So the issue is the traceback with the sensitive details in the instance fault, right? That's not exposed to end users, only admins (not even really configurable in the policy specific to faults, it's just the is_admin rule):

https://developer.openstack.org/api-guide/compute/faults.html

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/api/openstack/compute/views/servers.py#L562

Now the logic in ^ is a bit suspect; if the fault code is not 500 then we could have issues. What is the fault code in this case? Are non-admin tenant users able to see this?

Note that the related instance action event details would be in a similar situation, since the code records a traceback on the instance action event when the reboot fails:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/manager.py#L3447

https://developer.openstack.org/api-ref/compute/?expanded=show-server-action-details-detail#show-server-action-details

Again that traceback in the event should only be viewable by admins by default policy (os_compute_api:os-instance-actions:events rule).

Matt Riedemann (mriedem) wrote :

From:

{'message': "internal error: process exited while connecting to monitor: lc=,keyid=masterKey0,iv=jHURYcYDkXqGBu4pC24bew==,format=base64 -drive 'file=rbd:volumes/volume-41553c15-6b12-4137-a318-7caf6a9eb44c:id=cinder:auth_supported=cephx\\;none:mon_host=172.24.0.56\\:6789", 'code': 500, 'created': '2019-07-25T07:27:21Z'}

It looks like the code is 500 so https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/api/openstack/compute/views/servers.py#L562 should be OK and non-admin users shouldn't see those fault details.

Matt Riedemann (mriedem) wrote :

Having said all that, the fault interface, like instance actions, are not a very solid interface, meaning they don't really get audited, so I wouldn't be surprised if non-500 faults could be exposed to non-admin users depending on the exception that gets raised and logged. In this case it's a libvirtError that results in some generic NovaException so that's why it's a 500 code. For example, any nova exception class that extends nova.exception.Invalid would have a code of 400 and those could potentially leak out I suppose.

Matt Riedemann (mriedem) wrote :

Hmm, looking at that code again, details aren't in the response body of the fault, but the message is (always), and that's what has the sensitive content in it:

{'message': "internal error: process exited while connecting to monitor: lc=,keyid=masterKey0,iv=jHURYcYDkXqGBu4pC24bew==,format=base64 -drive 'file=rbd:volumes/volume-41553c15-6b12-4137-a318-7caf6a9eb44c:id=cinder:auth_supported=cephx\\;none:mon_host=172.24.0.56\\:6789", 'code': 500, 'created': '2019-07-25T07:27:21Z'}

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/api/openstack/compute/views/servers.py#L553

Yeah....that's less good...
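The gating described above can be sketched as follows (a simplified rendering of the linked view logic, not a verbatim copy): the message is always returned, while details are only included for admins or for non-500 fault codes.

```python
# Simplified sketch of the fault-view logic discussed above (loosely
# based on nova/api/openstack/compute/views/servers.py): 'message' is
# always user-visible; 'details' is shown only to admins or when the
# fault code is not 500.

def get_fault_view(fault, is_admin):
    out = {'code': fault['code'],
           'created': fault['created'],
           'message': fault['message']}  # always returned
    if fault.get('details') and (is_admin or fault['code'] != 500):
        out['details'] = fault['details']
    return out

fault = {'code': 500, 'created': '2019-07-25T07:27:21Z',
         'message': 'internal error: ...', 'details': 'Traceback ...'}

print('details' in get_fault_view(fault, is_admin=False))  # False
print('details' in get_fault_view(fault, is_admin=True))   # True
```

Note the asymmetry this thread is about: the details field is gated, but the (potentially sensitive) message field is not.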

Matt Riedemann (mriedem) wrote :

(10:26:59 AM) mriedem: were you reading that fault as admin or non-admin
(10:27:13 AM) mriedem: if ianw reported it for infra, i'm guessing non-admin
(10:27:16 AM) mriedem: infra tenant on vexxhost
(10:27:22 AM) mnaser: yep infra teannt on vexxhost is not admin
(10:27:28 AM) mnaser: so as non-admin

Matt Riedemann (mriedem) wrote :

Ugh, OK. So in this particular case, the instance goes to ERROR state from the reboot failure here:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/manager.py#L3537

The exception is re-raised and the fault (libvirtError) gets recorded by the wrap_instance_fault decorator:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/manager.py#L3448

This is what sets the message field on the InstanceFault:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/utils.py#L109

This is likely where the libvirtError gets turned into the message:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/utils.py#L78

Trying to play whack-a-mole with all of the places that could raise unexpected exceptions from lower layers, like the hypervisor in this case, is going to be a losing game. We should probably just restrict the fault message to admin-only since we don't audit them and this is python so explicitly handling typed exceptions isn't a thing.

Deciding *where* to obfuscate the message is the question I think. We could say that for any non-NovaException types (where format_message() isn't implemented), we could only include the message if context.is_admin here:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/utils.py#L78

But that won't fix old compute nodes until you patch them. It also doesn't mean that NovaExceptions handled here:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/utils.py#L73

Wouldn't also expose something from a lower level because we have several exceptions like this:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/exception.py#L267

Where whatever lower level error happens gets turned into the "reason" value.

Given that, it probably makes most sense to just add a new policy rule that is checked in the API to determine if the fault message can be shown here:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/api/openstack/compute/views/servers.py#L553

And the policy rule would default to admin-only. A policy rule is better since deployments could tailor it so non-admin support personnel could still see those details but not the tenant user that owns the server.

If we do that, I'd also fix the logic here so we rely on policy rather than this janky code logic:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/api/openstack/compute/views/servers.py#L562

Matt Riedemann (mriedem) wrote :

Per my suggested solution in comment 17, regarding backports, I think the API behavior change is OK on stable in this case since we'd be:

1. secure by default
2. justify the change b/c of security
3. have a release note
4. and the policy rule allows a deployer to re-expose the leaks if they want, e.g. to their private cloud CI/CD dev team that is all internal traffic

Dan Smith (danms) wrote :

Personally, I don't like the policy rule on the fault message field. I mean, it's okay if we want that in general, but that's not a reasonable solution to the problem I think, because it means nobody gets any information about why things failed anymore (like even NoValidHost).

It sucks to have to whack-a-mole the exceptions, but I think addressing it in compute/utils where we stringify unknown exceptions is a good plan. However, I don't think we can conditionally do that based on context.is_admin because some admin doing an operation would record this same information in the fault which another user could then see.

So I think what we should do is change the behavior which stringifies any unknown (i.e. non-NovaException) exception to always just grab the exception.__name__ (which we already do if the exception doesn't stringify to something non-Falsey) instead of the full message and record that. The details will still be there for admin viewers, but we treat any non-NovaException as could-be-sensitive and only record the name.

We'll need to backport it, it won't be fixed until all computes are upgraded, and people may have sensitive things in their databases now that need scrubbing. However, this is the right solution, IMHO. If we want a message policy toggle as well (in general or to mitigate exposure while scrubbing and upgrading) then that's fine I guess, although it does seem like an unfortunate thing for admins to turn off such that failed instance boots just go to ERROR with no explanation.
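The proposal above can be sketched as a simplified version of nova.compute.utils.exception_to_dict (names and structure simplified for illustration; the real function handles more cases):

```python
# Simplified sketch of the fix described above: for anything that is
# not a NovaException, record only the exception type name as the
# fault message, rather than the stringified exception.

class NovaException(Exception):
    """Stand-in for nova.exception.NovaException."""
    def format_message(self):
        return str(self)

def exception_to_dict(fault):
    code = getattr(fault, 'code', 500)
    if isinstance(fault, NovaException):
        # Nova exceptions have controlled, translatable messages.
        message = fault.format_message()
    else:
        # Non-nova exceptions may carry sensitive lower-layer detail,
        # so expose only the class name (e.g. 'libvirtError').
        message = fault.__class__.__name__
    return {'fault': {'message': message, 'code': code}}

leaky = RuntimeError("internal error: mon_host=172.24.0.56:6789")
safe = NovaException("No valid host was found.")

print(exception_to_dict(leaky)['fault']['message'])  # 'RuntimeError'
print(exception_to_dict(safe)['fault']['message'])   # 'No valid host was found.'
```

This preserves useful messages like NoValidHost for end users while treating any unknown exception as potentially sensitive.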

melanie witt (melwitt) wrote :

Just wanted to point out that the fault message is the only communication we have to the user about why the instance is in ERROR state. If we make that admin-only, then there will be no indication to non-admin end users about why their instance is in ERROR. It would be a major change and I'm not sure whether we should go that far. I thought the end user (even non-admin) being able to have some info about why their instance is in ERROR is important for user experience, even if the info is something like "NoValidHost".

In general, I think we usually control what goes into the fault message (above the virt driver layer), but inside the drivers, looks like it's the wild west. I'm wondering if we could do something more like expose only the first sentence of a lower-level error in our handling decorators, by truncating it. And leave the full traceback for the fault "detail" which is admin-only by default policy.

I do empathize that we don't want to be playing whack-a-mole here. I'm just a bit hesitant about the idea of removing all ERROR state info for non-admin users.

Mohammed Naser (mnaser) wrote :

Could it be easier for the API layer to expose exception+message only when admin, and only the exception name when not admin (i.e. the user would only see "libvirtError", for example)?

melanie witt (melwitt) wrote :

Note: I spent a long time typing my reply and Dan's reply landed before mine, so just wanted to note here that I'm on the same page and agree.

Matt Riedemann (mriedem) wrote :

@mnaser: Dan and I talked through this before he posted his comment. As he noted, admins will still see the details in the fault, which is the traceback (and that's only for non-nova exceptions), so you'd still have the info you need as admin. So I don't think what you're proposing in comment 21 is worthwhile, and it would be complicated: as Dan's examples show, where the fault is recorded (compute) and where it's exposed (api) are totally different, and knowing the context / policy stuff at the compute isn't really possible with the current fields on the fault object. I.e., we'd need message like today, details like today, and then some new field like "type" or something, and the API would only show the type as the message to non-admins or whatever. That's just a bigger change.

I plan on working on a patch with a functional test for Dan's suggestion in comment 19. When I wrote my ideas before that, I failed to remember NoValidHost, which is important to not break for UX.

Changed in nova:
status: New → Triaged
assignee: nobody → Matt Riedemann (mriedem)
importance: Undecided → High
Matt Riedemann (mriedem) wrote :

Here is a patch with the functional test that recreates the issue. Next I'll build on this to add the proposed fix. Note the 2nd FIXME in the test about the error message not showing up in the traceback details of the fault which could be a problem since the admin won't see the error, only the type and traceback.

Matt Riedemann (mriedem) wrote :

Here is a patch with the proposed code fix and changes to the functional test. Note that in this case the non-admin tenant user:

- only sees the exception type class name for the fault message
- does not see details in the fault response

The admin user:

- only sees the exception type class name for the fault message
- sees the exception value (previously what they'd see in the fault message) and traceback in the fault details

WIP since I need to fix test fallout and add a release note once we have a CVE/OSSA.

Matt Riedemann (mriedem) wrote :

This version contains a diff with -U25 for easier review and also resolves some failing unit tests as a result of how the fault message and details are stored. I've tried to document the impacted unit tests to make that clear.

This version also includes a release note with a TODO to include the CVE and OSSA information.

For all intents and purposes, besides the reno, this is my proposed version for final review.

Dan Smith (danms) wrote :

The patch in #26 looks okay to me. I sent Matt a couple nits via IRC, but I think it's probably okay in current form if need be.

Matt Riedemann (mriedem) wrote :

I found a typo in my release note so the bug link doesn't work, but that is an easy fix.

I also probably need to check the compute API reference and API guide since they might need updating where they talk about faults:

https://docs.openstack.org/api-guide/compute/faults.html

Matt Riedemann (mriedem) wrote :

Looking over the fault API reference and API guide they seem sufficiently generic that I don't think I really need to add anything in this patch since the examples used are things that wouldn't be changed by this patch.

Matt Riedemann (mriedem) wrote :

Here is a cherry-pick patch for stein.

Changed in nova:
status: Triaged → In Progress
Matt Riedemann (mriedem) wrote :

Here is a cherry-pick for rocky.

Matt Riedemann (mriedem) wrote :

Here is a cherry-pick patch for queens.

Matt Riedemann (mriedem) wrote :

The pike backport requires https://review.opendev.org/#/c/509935/ for the functional test to work.

Matt Riedemann (mriedem) wrote :

Here is the backport for the pike fake driver power off patch.

Matt Riedemann (mriedem) wrote :

Here is the pike backport of the fault fix.

melanie witt (melwitt) wrote :

I reviewed the patch in comment #26 and it LGTM too.

Jeremy Stanley (fungi) wrote :

Please review the proposed impact description below. Note I did not include a stable/pike point release as it's under extended maintenance. Donny, if you'd like an employer or organization included in the reporter credit along with your name, please let me know. If this looks correct, I'll use it to request a private CVE and then we can set the coordinated disclosure schedule for it...

Title: Nova Server Resource Faults Leak External Exception Details
Reporter: Donny Davis
Products: Nova
Affects: <17.0.12,>=18.0.0<18.2.2,>=19.0.0<19.0.2

Description:
Donny Davis reported a vulnerability in Nova Compute resource fault
handling. If an API request from an authenticated user ends in a
fault condition due to an external exception, details of the
underlying environment may be leaked in the response and could
include sensitive configuration or other data.

Matthew Thode (prometheanfire) wrote :

proposed impact description looks good if a bit dry.

Donny Davis (donny-g) wrote :

Jeremy, I think it looks great. Can you add Intel as my employer?

Thanks

Matt Riedemann (mriedem) wrote :

The impact description looks OK to me as well. I'm working on the ocata patch this morning.

Matt Riedemann (mriedem) wrote :

The ocata backports will also require https://review.opendev.org/#/c/483986/ for the functional test to work. It's a clean backport from pike so not a big deal.

Matt Riedemann (mriedem) wrote :

Here are the three ocata patches, applied in this order:

1. Implement-power_off-power_on-for-the-FakeDriver-ocata.patch
2. fix-unshelve-notification-test-instability-ocata.patch
3. WIP-Obfuscate-non-nova-server-fault-message-ocata.patch

Jeremy Stanley (fungi) on 2019-07-29
Changed in ossa:
status: Incomplete → Triaged
importance: Undecided → High
assignee: nobody → Jeremy Stanley (fungi)
Jeremy Stanley (fungi) wrote :

A CVE assignment has been requested with MITRE using the following impact description:

Title: Nova Server Resource Faults Leak External Exception Details
Reporter: Donny Davis (Intel)
Products: Nova
Affects: <17.0.12,>=18.0.0<18.2.2,>=19.0.0<19.0.2

Description:
Donny Davis with Intel reported a vulnerability in Nova Compute
resource fault handling. If an API request from an authenticated
user ends in a fault condition due to an external exception, details
of the underlying environment may be leaked in the response and
could include sensitive configuration or other data.

Changed in ossa:
status: Triaged → In Progress
Jeremy Stanley (fungi) on 2019-07-29
summary: - Error message reveals ceph information
+ Nova Server Resource Faults Leak External Exception Details
+ (CVE-2019-14433)

Jeremy Stanley (fungi) wrote :

According to our Embargoed Disclosure[*] timeline, assuming the patches attached to this bug are considered ready and we send the pre-OSSA notification to downstream stakeholders on Wednesday (to give those subscribed to this bug a reasonable chance to object to the schedule), the earliest we can open this bug, push changes to public code review, and publish the OSSA will be Tuesday, August 6. Does this work for everyone?

Tony Breeds (o-tony) wrote :

For what it's worth, this looks good from a stable/EM PoV.

Matt Riedemann (mriedem) wrote :

> publish the OSSA will be Tuesday, August 6. Does this work for everyone?

Works for me. Dan is on vacation this week but back next week.

melanie witt (melwitt) wrote :

>> publish the OSSA will be Tuesday, August 6. Does this work for everyone?

> Works for me. Dan is on vacation this week but back next week.

Works for me too, if my assistance is needed in any way.

Jeremy Stanley (fungi) wrote :

Downstream stakeholders have been privately provided with the patches for master and branches currently under stable maintenance, and notified of the impending disclosure date/time. If they have any feedback, I'll relay it here or subscribe them to this bug report so they can do so themselves.

Changed in ossa:
status: In Progress → Fix Committed
Corey Bryant (corey.bryant) wrote :

Thanks very much for your work on this. Just one nit comment. I think "obfuscate" might not be the right word to use in the commit message. Maybe just use "replace"? I think obfuscate is not a desired security approach in the security community. See "security through obscurity".

Matt Riedemann (mriedem) wrote :

Sure, will take that into account when pushing the patches to Gerrit for public review if it's all the same. I need to update the release note on the patches with the CVE as well when it goes up for public review.

Jeremy Stanley (fungi) wrote :

To be clear, we'll need to know the review links for each patch so we can incorporate them into the public advisory. The sequence on Tuesday will be:

1. Switch bug from Private Security to Public Security

2. Push updated commits for openstack/nova into Gerrit for final review/approval

3. Confirm that preliminary CI jobs are succeeding and approve changes

4. Push commit to openstack/ossa with advisory including links to fixes in review

5. Distribute advisory to appropriate public mailing lists

6. Confirm fixes have merged and submit stable point release requests

Jeremy Stanley (fungi) on 2019-08-06
information type: Private Security → Public Security
Jeremy Stanley (fungi) on 2019-08-06
summary: - Nova Server Resource Faults Leak External Exception Details
- (CVE-2019-14433)
+ [OSSA-2019-003] Nova Server Resource Faults Leak External Exception
+ Details (CVE-2019-14433)

Reviewed: https://review.opendev.org/674909
Committed: https://git.openstack.org/cgit/openstack/ossa/commit/?id=6b0b3a50e69e0a8f6f3207ed15a2aaaa30391580
Submitter: Zuul
Branch: master

commit 6b0b3a50e69e0a8f6f3207ed15a2aaaa30391580
Author: Jeremy Stanley <email address hidden>
Date: Tue Aug 6 14:43:44 2019 +0000

    Add OSSA-2019-003 (CVE-2019-14433)

    Change-Id: I22c4b17a0ad1b6197a97c6b2670fe5d1a6a7406f
    Related-Bug: #1837877

Reviewed: https://review.opendev.org/674821
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=298b337a16c0d10916b4431c436d19b3d6f5360e
Submitter: Zuul
Branch: master

commit 298b337a16c0d10916b4431c436d19b3d6f5360e
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 26 10:53:02 2019 -0400

    Replace non-nova server fault message

    The server fault "message" is always shown in the API
    server response, regardless of policy or user role.

    The fault "details" are only shown to users with the
    admin role when the fault code is 500.

    The problem with this is for non-nova exceptions, the
    fault message is a string-ified version of the exception
    (see nova.compute.utils.exception_to_dict) which can
    contain sensitive information which the non-admin owner
    of the server can see.

    This change adds a functional test to recreate the issue
    and a change to exception_to_dict which for the non-nova
    case changes the fault message by simply storing the
    exception type class name. Admins can still see the fault
    traceback in the "details" key of the fault dict in the
    server API response. Note that _get_fault_details is
    changed so that the details also includes the exception
    value which is what used to be in the fault message for
    non-nova exceptions. This is necessary so admins can still
    get the exception message with the traceback details.

    Note that nova exceptions with a %(reason)s replacement
    variable could potentially be leaking sensitive details as
    well but those would need to be cleaned up on a case-by-case
    basis since we don't want to change the behavior of all
    fault messages otherwise users might not see information
    like NoValidHost when their server goes to ERROR status
    during scheduling.

    SecurityImpact: This change contains a fix for CVE-2019-14433.

    Change-Id: I5e0a43ec59341c9ac62f89105ddf82c4a014df81
    Closes-Bug: #1837877

Changed in nova:
status: In Progress → Fix Released

Reviewed: https://review.opendev.org/674828
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=67651881163b75eb1983eaf753471a91ecec35eb
Submitter: Zuul
Branch: stable/stein

commit 67651881163b75eb1983eaf753471a91ecec35eb
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 26 10:53:02 2019 -0400

    Replace non-nova server fault message

    The server fault "message" is always shown in the API
    server response, regardless of policy or user role.

    The fault "details" are only shown to users with the
    admin role when the fault code is 500.

    The problem with this is for non-nova exceptions, the
    fault message is a string-ified version of the exception
    (see nova.compute.utils.exception_to_dict) which can
    contain sensitive information which the non-admin owner
    of the server can see.

    This change adds a functional test to recreate the issue
    and a change to exception_to_dict which for the non-nova
    case changes the fault message by simply storing the
    exception type class name. Admins can still see the fault
    traceback in the "details" key of the fault dict in the
    server API response. Note that _get_fault_details is
    changed so that the details also includes the exception
    value which is what used to be in the fault message for
    non-nova exceptions. This is necessary so admins can still
    get the exception message with the traceback details.

    Note that nova exceptions with a %(reason)s replacement
    variable could potentially be leaking sensitive details as
    well but those would need to be cleaned up on a case-by-case
    basis since we don't want to change the behavior of all
    fault messages otherwise users might not see information
    like NoValidHost when their server goes to ERROR status
    during scheduling.

    SecurityImpact: This change contains a fix for CVE-2019-14433.

    Change-Id: I5e0a43ec59341c9ac62f89105ddf82c4a014df81
    Closes-Bug: #1837877
    (cherry picked from commit 298b337a16c0d10916b4431c436d19b3d6f5360e)

Reviewed: https://review.opendev.org/674848
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e0b91a5b1e89bd0506dc6da86bc61f1708f0215a
Submitter: Zuul
Branch: stable/rocky

commit e0b91a5b1e89bd0506dc6da86bc61f1708f0215a
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 26 10:53:02 2019 -0400

    Replace non-nova server fault message

    [Commit message body identical to the first commit above.]

    NOTE(mriedem): The functional test imports change here
    because Idaed39629095f86d24a54334c699a26c218c6593 is not
    in Rocky so the PlacementFixture comes from nova_fixtures.

    Change-Id: I5e0a43ec59341c9ac62f89105ddf82c4a014df81
    Closes-Bug: #1837877
    (cherry picked from commit 298b337a16c0d10916b4431c436d19b3d6f5360e)
    (cherry picked from commit 67651881163b75eb1983eaf753471a91ecec35eb)

Reviewed: https://review.opendev.org/674859
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3dcefba60a4f4553888a9dfda9fe3bee094d617a
Submitter: Zuul
Branch: stable/queens

commit 3dcefba60a4f4553888a9dfda9fe3bee094d617a
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 26 10:53:02 2019 -0400

    Replace non-nova server fault message

    [Commit message body identical to the first commit above.]

    Change-Id: I5e0a43ec59341c9ac62f89105ddf82c4a014df81
    Closes-Bug: #1837877
    (cherry picked from commit 298b337a16c0d10916b4431c436d19b3d6f5360e)
    (cherry picked from commit 67651881163b75eb1983eaf753471a91ecec35eb)
    (cherry picked from commit e0b91a5b1e89bd0506dc6da86bc61f1708f0215a)

Jeremy Stanley (fungi) wrote :

The extended maintenance branch patches are still up for review, but since all the stable maintenance branch patches have now merged I'm closing the security advisory task.

Changed in ossa:
status: Fix Committed → Fix Released

Reviewed: https://review.opendev.org/674877
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6da28b0aa9b6a0ba67460f88dd2c397605b0679b
Submitter: Zuul
Branch: stable/pike

commit 6da28b0aa9b6a0ba67460f88dd2c397605b0679b
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 26 10:53:02 2019 -0400

    Replace non-nova server fault message

    [Commit message body identical to the first commit above.]

    Change-Id: I5e0a43ec59341c9ac62f89105ddf82c4a014df81
    Closes-Bug: #1837877
    (cherry picked from commit 298b337a16c0d10916b4431c436d19b3d6f5360e)
    (cherry picked from commit 67651881163b75eb1983eaf753471a91ecec35eb)
    (cherry picked from commit e0b91a5b1e89bd0506dc6da86bc61f1708f0215a)
    (cherry picked from commit 3dcefba60a4f4553888a9dfda9fe3bee094d617a)

This issue was fixed in the openstack/nova 19.0.2 release.

This issue was fixed in the openstack/nova 18.2.2 release.

Reviewed: https://review.opendev.org/674908
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=02ea2c25eddebdc220d66e4a01d8deded7c77a57
Submitter: Zuul
Branch: stable/ocata

commit 02ea2c25eddebdc220d66e4a01d8deded7c77a57
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 26 10:53:02 2019 -0400

    Replace non-nova server fault message

    [Commit message body identical to the first commit above.]

    NOTE(mriedem): In this backport the functional test is
    modified slightly to remove the DiskFilter since we are
    using placement for scheduler filtering on DISK_GB.

    Change-Id: I5e0a43ec59341c9ac62f89105ddf82c4a014df81
    Closes-Bug: #1837877
    (cherry picked from commit 298b337a16c0d10916b4431c436d19b3d6f5360e)
    (cherry picked from commit 67651881163b75eb1983eaf753471a91ecec35eb)
    (cherry picked from commit e0b91a5b1e89bd0506dc6da86bc61f1708f0215a)
    (cherry picked from commit 3dcefba60a4f4553888a9dfda9fe3bee094d617a)
    (cherry picked from commit 6da28b0aa9b6a0ba67460f88dd2c397605b0679b)

This issue was fixed in the openstack/nova 17.0.12 release.

This report contains Public Security information. Everyone can see this security related information.