[OSSA-2019-003] Nova Server Resource Faults Leak External Exception Details (CVE-2019-14433)

Bug #1837877 reported by Donny Davis on 2019-07-25
Affects (Importance, Assigned to):
  OpenStack Compute (nova): High, Matt Riedemann
    Ocata: High, Matt Riedemann
    Pike: High, Matt Riedemann
    Queens: High, Matt Riedemann
    Rocky: High, Matt Riedemann
    Stein: High, Matt Riedemann
  OpenStack Security Advisory: High, Jeremy Stanley

Bug Description

This issue is being treated as a potential security risk under embargo. Please do not make any public mention of embargoed (private) security vulnerabilities before their coordinated publication by the OpenStack Vulnerability Management Team in the form of an official OpenStack Security Advisory. This includes discussion of the bug or associated fixes in public forums such as mailing lists, code review systems and bug trackers. Please also avoid private disclosure to other individuals not already approved for access to this information, and provide this same reminder to those who are made aware of the issue prior to publication. All discussion should remain confined to this private bug report, and any proposed fixes should be added to the bug as attachments.

It would appear Nova is revealing information that may be sensitive in error messages:

http://lists.openstack.org/pipermail/openstack-infra/2019-July/006426.html

I attempted to hard-reboot it, and it went into an error state. The
initial error in the server status was

 {'message': 'Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreateWithFlags)', 'code': 500, 'created': '2019-07-25T07:25:25Z'}

After a short period, I tried again and got a different error state

 {'message': "internal error: process exited while connecting to monitor: lc=,keyid=masterKey0,iv=jHURYcYDkXqGBu4pC24bew==,format=base64 -drive 'file=rbd:volumes/volume-41553c15-6b12-4137-a318-7caf6a9eb44c:id=cinder:auth_supported=cephx\\;none:mon_host=172.24.0.56\\:6789", 'code': 500, 'created': '2019-07-25T07:27:21Z'}

I don't know if this is a setting or a bug. Better to report and close than not say anything I guess.

CVE References: CVE-2019-14433

Jeremy Stanley (fungi) wrote :

Since this report concerns a possible security risk, an incomplete security advisory task has been added while the core security reviewers for the affected project or projects confirm the bug and discuss the scope of any vulnerability along with potential solutions.

Changed in ossa:
status: New → Incomplete
description: updated
Jeremy Stanley (fungi) wrote :

Is this message being logged at DEBUG level? Or higher?

Jeremy Stanley (fungi) wrote :

Oh, wait, this was in a response from the API?

Donny Davis (donny-g) wrote :

I don't know, ianw posted it from an opendev outage yesterday. Can we add Mohammed Naser to this thread as well? I am pretty sure he would want to follow.

http://lists.openstack.org/pipermail/openstack-infra/2019-July/006426.html

Jeremy Stanley (fungi) wrote :

Yes, I've subscribed him just now and will also give him a heads-up via IRC privmsg.

Mohammed Naser (mnaser) wrote :

This is indeed an API response. I have researched this and I think we're safe (from a provider perspective):

{'message': "internal error: process exited while connecting to monitor: lc=,keyid=masterKey0,iv=jHURYcYDkXqGBu4pC24bew==,format=base64 -drive 'file=rbd:volumes/volume-41553c15-6b12-4137-a318-7caf6a9eb44c:id=cinder:auth_supported=cephx\\;none:mon_host=172.24.0.56\\:6789", 'code': 500, 'created': '2019-07-25T07:27:21Z'}

There are two concerns here:

"lc=,keyid=masterKey0,iv=jHURYcYDkXqGBu4pC24bew==,format=base64"
There is a generated key for every single VM booted inside QEMU; this points to that specific key ID and includes the initialization vector (iv). It does not include the key that decrypts the cephx secret.

"file=rbd:volumes/volume-41553c15-6b12-4137-a318-7caf6a9eb44c:id=cinder:auth_supported=cephx\\;none:mon_host=172.24.0.56\\:6789"
This actually is a bit more concerning because it leaks out the IP address of the Ceph monitor. In our case, it lives in an entirely different network that cannot be accessed but that's not the case for everyone...

"Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreateWithFlags)"
This is a libvirt exception that is getting bubbled up.

IMHO: We should *never* leak exceptions that bubble up in this way to the API layer, especially to a non-admin user.
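The leak described above can be illustrated with a minimal, hypothetical sketch (none of these names are nova code): when a fault is built directly from str(exc), the user-visible message carries whatever the lower layer embedded in the exception text.

```python
# Hypothetical illustration (not nova code): building a server fault
# from a raw hypervisor exception leaks whatever str(exc) contains.

class LibvirtError(Exception):
    """Stand-in for libvirt.libvirtError."""

def build_fault(exc):
    # The leaky pattern: the exception text goes straight into the
    # user-visible 'message' field of the fault.
    return {'message': str(exc), 'code': 500}

try:
    raise LibvirtError(
        "internal error: process exited while connecting to monitor: "
        "... mon_host=172.24.0.56:6789")
except LibvirtError as exc:
    fault = build_fault(exc)

# The Ceph monitor IP is now visible to the server's owner.
print('172.24.0.56' in fault['message'])  # True
```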

Mohammed Naser (mnaser) wrote :

I'm going to dig through the code where this happened and what caused this result. I'll discuss privately with some of the Nova cores and ask them to be added here if they're interested.

Mohammed Naser (mnaser) wrote :

So:

nova/virt/libvirt/guest.py
===========
    def launch(self, pause=False):
        """Starts a created guest.

        :param pause: Indicates whether to start and pause the guest
        """
        flags = pause and libvirt.VIR_DOMAIN_START_PAUSED or 0
        try:
            return self._domain.createWithFlags(flags)
        except Exception:
            with excutils.save_and_reraise_exception():
                LOG.error('Error launching a defined domain '
                          'with XML: %s',
                          self._encoded_xml, errors='ignore')
===========

This bubbles up any exceptions from libvirt. The function that calls it:

nova/virt/libvirt/guest.py
===========
    def _create_domain(self, xml=None, domain=None,
                       power_on=True, pause=False, post_xml_callback=None):
        """Create a domain.

        Either domain or xml must be passed in. If both are passed, then
        the domain definition is overwritten from the xml.

        :returns guest.Guest: Guest just created
        """
        if xml:
            guest = libvirt_guest.Guest.create(xml, self._host)
            if post_xml_callback is not None:
                post_xml_callback()
        else:
            guest = libvirt_guest.Guest(domain)

        if power_on or pause:
            guest.launch(pause=pause)

        if not utils.is_neutron():
            guest.enable_hairpin()

        return guest
===========

This handles no exceptions either, so the exception propagates up to _create_domain_and_network(), which has a clause that catches all generic exceptions:

===========
        except Exception:
            # Any other error, be sure to clean up
            LOG.error('Failed to start libvirt guest', instance=instance)
            with excutils.save_and_reraise_exception():
                self._cleanup_failed_start(context, instance, network_info,
                                           block_device_info, guest,
                                           destroy_disks_on_failure)
===========

but that still bubbles up. In this case it propagates to _hard_reboot(), which has no exception handling, and is called by reboot(), which also has none. At this point we've sent a *libvirt* exception up to the compute manager, more specifically to reboot_instance(), which is decorated by the following:

===========
    @wrap_exception()
    @reverts_task_state
    @wrap_instance_event(prefix='compute')
    @wrap_instance_fault
===========

- wrap_exception: this seems to emit notifications on exceptions
- reverts_task_state: pretty self-explanatory
- wrap_instance_event: this seems like it adds an instance action to the log
- wrap_instance_fault:

Jeremy Stanley (fungi) wrote :

I don't object to subscribing more Nova developers, but note that I've already subscribed the Nova core security reviewers, so some of the folks you're thinking of could already be covered by that.

Mohammed Naser (mnaser) wrote :

I can't edit that and hit enter too quickly so I'll keep going:

- wrap_exception: this seems to emit notifications on exceptions
- reverts_task_state: pretty self-explanatory
- wrap_instance_event: this seems like it adds an instance action to the log
- wrap_instance_fault: this is the culprit; it seems to catch exceptions and set a fault on the VM (which is where this was exposed)

We'll have to decide where the issue was here, but IMHO, taking an unhandled exception and exposing it to the user directly can be pretty dangerous.
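The pattern being described can be sketched with a simplified decorator and a dict-based instance (assumed names for illustration; the real wrap_instance_fault in nova/compute/manager.py is more involved):

```python
import functools

# Simplified sketch (assumed names, not the real nova decorator) of how
# a wrap_instance_fault-style decorator catches any exception, records
# a fault on the instance, and re-raises.

def wrap_instance_fault(func):
    @functools.wraps(func)
    def wrapper(self, context, instance, *args, **kwargs):
        try:
            return func(self, context, instance, *args, **kwargs)
        except Exception as exc:
            # The stringified exception becomes the user-visible fault
            # message before the exception is re-raised.
            instance.setdefault('faults', []).append(
                {'message': str(exc), 'code': 500})
            raise
    return wrapper

class Manager:
    @wrap_instance_fault
    def reboot_instance(self, context, instance):
        # Simulate a libvirt error bubbling up from the driver.
        raise RuntimeError("internal error: mon_host=10.0.0.1")

instance = {}
try:
    Manager().reboot_instance(None, instance)
except RuntimeError:
    pass

# The raw driver error text is now stored on the instance fault.
print(instance['faults'][0]['message'])
```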

Mohammed Naser (mnaser) wrote :

I'm asking a few nova-cores in private if they have some time to go over this, and at this point to decide on the best way to fix this without playing whack-a-mole on all the possible exceptions a driver can raise.

Matt Riedemann (mriedem) wrote :

So the issue is the traceback with the sensitive details in the instance fault, right? That's not exposed to end users, only admins (not even really configurable in the policy specific to faults, it's just the is_admin rule):

https://developer.openstack.org/api-guide/compute/faults.html

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/api/openstack/compute/views/servers.py#L562

Now the logic in ^ is a bit suspect; if the fault code is not 500 then we could have issues. What is the fault code in this case? Are non-admin tenant users able to see this?

Note that the related instance action event details would be in a similar situation, since the code records a traceback on the instance action event when the reboot fails:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/manager.py#L3447

https://developer.openstack.org/api-ref/compute/?expanded=show-server-action-details-detail#show-server-action-details

Again that traceback in the event should only be viewable by admins by default policy (os_compute_api:os-instance-actions:events rule).

Matt Riedemann (mriedem) wrote :

From:

{'message': "internal error: process exited while connecting to monitor: lc=,keyid=masterKey0,iv=jHURYcYDkXqGBu4pC24bew==,format=base64 -drive 'file=rbd:volumes/volume-41553c15-6b12-4137-a318-7caf6a9eb44c:id=cinder:auth_supported=cephx\\;none:mon_host=172.24.0.56\\:6789", 'code': 500, 'created': '2019-07-25T07:27:21Z'}

It looks like the code is 500 so https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/api/openstack/compute/views/servers.py#L562 should be OK and non-admin users shouldn't see those fault details.

Matt Riedemann (mriedem) wrote :

Having said all that, the fault interface, like instance actions, are not a very solid interface, meaning they don't really get audited, so I wouldn't be surprised if non-500 faults could be exposed to non-admin users depending on the exception that gets raised and logged. In this case it's a libvirtError that results in some generic NovaException so that's why it's a 500 code. For example, any nova exception class that extends nova.exception.Invalid would have a code of 400 and those could potentially leak out I suppose.

Matt Riedemann (mriedem) wrote :

Hmm, looking at that code again, details aren't in the response body of the fault, but the message is (always), and that's what has the sensitive content in it:

{'message': "internal error: process exited while connecting to monitor: lc=,keyid=masterKey0,iv=jHURYcYDkXqGBu4pC24bew==,format=base64 -drive 'file=rbd:volumes/volume-41553c15-6b12-4137-a318-7caf6a9eb44c:id=cinder:auth_supported=cephx\\;none:mon_host=172.24.0.56\\:6789", 'code': 500, 'created': '2019-07-25T07:27:21Z'}

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/api/openstack/compute/views/servers.py#L553

Yeah....that's less good...
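The gating described above can be sketched as follows (a simplified rendering of the linked view logic, not a verbatim copy): the message is always returned, while details are only included for admins or for non-500 fault codes.

```python
# Simplified sketch of the fault-view logic discussed above (loosely
# based on nova/api/openstack/compute/views/servers.py): 'message' is
# always user-visible; 'details' is shown only to admins or when the
# fault code is not 500.

def get_fault_view(fault, is_admin):
    out = {'code': fault['code'],
           'created': fault['created'],
           'message': fault['message']}  # always returned
    if fault.get('details') and (is_admin or fault['code'] != 500):
        out['details'] = fault['details']
    return out

fault = {'code': 500, 'created': '2019-07-25T07:27:21Z',
         'message': 'internal error: ...', 'details': 'Traceback ...'}

print('details' in get_fault_view(fault, is_admin=False))  # False
print('details' in get_fault_view(fault, is_admin=True))   # True
```

Note the asymmetry this thread is about: the details field is gated, but the (potentially sensitive) message field is not.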

Matt Riedemann (mriedem) wrote :

(10:26:59 AM) mriedem: were you reading that fault as admin or non-admin
(10:27:13 AM) mriedem: if ianw reported it for infra, i'm guessing non-admin
(10:27:16 AM) mriedem: infra tenant on vexxhost
(10:27:22 AM) mnaser: yep infra teannt on vexxhost is not admin
(10:27:28 AM) mnaser: so as non-admin

Matt Riedemann (mriedem) wrote :

Ugh, OK. So in this particular case, the instance goes to ERROR state from the reboot failure here:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/manager.py#L3537

The exception is re-raised and the fault (libvirtError) gets recorded by the wrap_instance_fault decorator:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/manager.py#L3448

This is what sets the message field on the InstanceFault:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/utils.py#L109

This is likely where the libvirtError gets turned into the message:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/utils.py#L78

Trying to play whack-a-mole with all of the places that could raise unexpected exceptions from lower layers, like the hypervisor in this case, is going to be a losing game. We should probably just restrict the fault message to admin-only since we don't audit them and this is python so explicitly handling typed exceptions isn't a thing.

Deciding *where* to obfuscate the message is the question I think. We could say that for any non-NovaException types (where format_message() isn't implemented), we could only include the message if context.is_admin here:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/utils.py#L78

But that won't fix old compute nodes until you patch them. It also doesn't mean that NovaExceptions handled here:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/compute/utils.py#L73

Wouldn't also expose something from a lower level because we have several exceptions like this:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/exception.py#L267

Where whatever lower level error happens gets turned into the "reason" value.

Given that, it probably makes most sense to just add a new policy rule that is checked in the API to determine if the fault message can be shown here:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/api/openstack/compute/views/servers.py#L553

And the policy rule would default to admin-only. A policy rule is better since deployments could tailor it so non-admin support personnel could still see those details but not the tenant user that owns the server.

If we do that, I'd also fix the logic here so we rely on policy rather than this janky code logic:

https://github.com/openstack/nova/blob/2c0cb71fb0ac0d502dc9fed24211e1ef15407b8f/nova/api/openstack/compute/views/servers.py#L562

Matt Riedemann (mriedem) wrote :

Per my suggested solution in comment 17, regarding backports, I think the API behavior change is OK on stable in this case since we'd be:

1. secure by default
2. justify the change b/c of security
3. have a release note
4. and the policy rule allows a deployer to re-expose the leaks if they want, e.g. to their private cloud CI/CD dev team that is all internal traffic

Dan Smith (danms) wrote :

Personally, I don't like the policy rule on the fault message field. I mean, it's okay if we want that in general, but that's not a reasonable solution to the problem I think, because it means nobody gets any information about why things failed anymore (like even NoValidHost).

It sucks to have to whack-a-mole the exceptions, but I think addressing it in compute/utils where we stringify unknown exceptions is a good plan. However, I don't think we can conditionally do that based on context.is_admin because some admin doing an operation would record this same information in the fault which another user could then see.

So I think what we should do is change the behavior which stringifies any unknown (i.e. non-NovaException) exception to always just grab the exception.__name__ (which we already do if the exception doesn't stringify to something non-Falsey) instead of the full message and record that. The details will still be there for admin viewers, but we treat any non-NovaException as could-be-sensitive and only record the name.

We'll need to backport it, it won't be fixed until all computes are upgraded, and people may have sensitive things in their databases now that need scrubbing. However, this is the right solution, IMHO. If we want a message policy toggle as well (in general or to mitigate exposure while scrubbing and upgrading) then that's fine I guess, although it does seem like an unfortunate thing for admins to turn off such that failed instance boots just go to ERROR with no explanation.
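The proposal above can be sketched as a simplified version of nova.compute.utils.exception_to_dict (names and structure simplified for illustration; the real function handles more cases):

```python
# Simplified sketch of the fix described above: for anything that is
# not a NovaException, record only the exception type name as the
# fault message, rather than the stringified exception.

class NovaException(Exception):
    """Stand-in for nova.exception.NovaException."""
    def format_message(self):
        return str(self)

def exception_to_dict(fault):
    code = getattr(fault, 'code', 500)
    if isinstance(fault, NovaException):
        # Nova exceptions have controlled, translatable messages.
        message = fault.format_message()
    else:
        # Non-nova exceptions may carry sensitive lower-layer detail,
        # so expose only the class name (e.g. 'libvirtError').
        message = fault.__class__.__name__
    return {'fault': {'message': message, 'code': code}}

leaky = RuntimeError("internal error: mon_host=172.24.0.56:6789")
safe = NovaException("No valid host was found.")

print(exception_to_dict(leaky)['fault']['message'])  # 'RuntimeError'
print(exception_to_dict(safe)['fault']['message'])   # 'No valid host was found.'
```

This preserves useful messages like NoValidHost for end users while treating any unknown exception as potentially sensitive.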

melanie witt (melwitt) wrote :

Just wanted to point out that the fault message is the only communication we have to the user about why the instance is in ERROR state. If we make that admin-only, then there will be no indication to non-admin end users about why their instance is in ERROR. It would be a major change and I'm not sure whether we should go that far. I thought the end user (even non-admin) being able to have some info about why their instance is in ERROR is important for user experience, even if the info is something like "NoValidHost".

In general, I think we usually control what goes into the fault message (above the virt driver layer), but inside the drivers, looks like it's the wild west. I'm wondering if we could do something more like expose only the first sentence of a lower-level error in our handling decorators, by truncating it. And leave the full traceback for the fault "detail" which is admin-only by default policy.

I do empathize that we don't want to be playing whack-a-mole here. I'm just a bit hesitant about the idea of removing all ERROR state info for non-admin users.

Mohammed Naser (mnaser) wrote :

Could it be easier for the API layer to expose exception+message only when admin, and only the exception name when not admin (i.e. the user would only see "libvirtError", for example)?

melanie witt (melwitt) wrote :

Note: I spent a long time typing my reply and Dan's reply landed before mine, so just wanted to note here that I'm on the same page and agree.

Matt Riedemann (mriedem) wrote :

@mnaser: Dan and I talked through this before he posted his comment. As he noted, admins will still see the details in the fault, which is the traceback (and that's only for non-nova exceptions), so you'd still have the info you need as admin. So I don't think what you're proposing in comment 21 is worthwhile, and it would be complicated: as Dan's examples show, where the fault is recorded (compute) and where it's exposed (api) are totally different, and knowing the context / policy stuff at the compute isn't really possible with the current fields on the fault object. I.e., we'd need message like today, details like today, and then some new field like "type" or something, and the API would only show the type as the message to non-admins or whatever. That's just a bigger change.

I plan on working on a patch with a functional test for Dan's suggestion in comment 19. When I wrote my ideas before that, I failed to remember NoValidHost, which is important to not break for UX.

Changed in nova:
status: New → Triaged
assignee: nobody → Matt Riedemann (mriedem)
importance: Undecided → High
Matt Riedemann (mriedem) wrote :

Here is a patch with the functional test that recreates the issue. Next I'll build on this to add the proposed fix. Note the 2nd FIXME in the test about the error message not showing up in the traceback details of the fault which could be a problem since the admin won't see the error, only the type and traceback.

Matt Riedemann (mriedem) wrote :

Here is a patch with the proposed code fix and changes to the functional test. Note that in this case the non-admin tenant user:

- only sees the exception type class name for the fault message
- does not see details in the fault response

The admin user:

- only sees the exception type class name for the fault message
- sees the exception value (previously what they'd see in the fault message) and traceback in the fault details

WIP since I need to fix test fallout and add a release note once we have a CVE/OSSA.

Matt Riedemann (mriedem) wrote :

This version contains a diff with -U25 for easier review and also resolves some failing unit tests as a result of how the fault message and details are stored. I've tried to document the impacted unit tests to make that clear.

This version also includes a release note with a TODO to include the CVE and OSSA information.

For all intents and purposes, besides the reno, this is my proposed version for final review.

Dan Smith (danms) wrote :

The patch in #26 looks okay to me. I sent Matt a couple nits via IRC, but I think it's probably okay in current form if need be.

Matt Riedemann (mriedem) wrote :

I found a typo in my release note so the bug link doesn't work, but that is an easy fix.

I also probably need to check the compute API reference and API guide since they might need updating where they talk about faults:

https://docs.openstack.org/api-guide/compute/faults.html

Matt Riedemann (mriedem) wrote :

Looking over the fault API reference and API guide they seem sufficiently generic that I don't think I really need to add anything in this patch since the examples used are things that wouldn't be changed by this patch.

Matt Riedemann (mriedem) wrote :

Here is a cherry-pick patch for stein.

Changed in nova:
status: Triaged → In Progress
Matt Riedemann (mriedem) wrote :

Here is a cherry-pick for rocky.

Matt Riedemann (mriedem) wrote :

Here is a cherry-pick patch for queens.

Matt Riedemann (mriedem) wrote :

The pike backport requires https://review.opendev.org/#/c/509935/ for the functional test to work.

Matt Riedemann (mriedem) wrote :

Here is the backport for the pike fake driver power off patch.

Matt Riedemann (mriedem) wrote :

Here is the pike backport of the fault fix.

melanie witt (melwitt) wrote :

I reviewed the patch in comment #26 and it LGTM too.

Jeremy Stanley (fungi) wrote :

Please review the proposed impact description below. Note I did not include a stable/pike point release as it's under extended maintenance. Donny, if you'd like an employer or organization included in the reporter credit along with your name, please let me know. If this looks correct, I'll use it to request a private CVE and then we can set the coordinated disclosure schedule for it...

Title: Nova Server Resource Faults Leak External Exception Details
Reporter: Donny Davis
Products: Nova
Affects: <17.0.12,>=18.0.0<18.2.2,>=19.0.0<19.0.2

Description:
Donny Davis reported a vulnerability in Nova Compute resource fault
handling. If an API request from an authenticated user ends in a
fault condition due to an external exception, details of the
underlying environment may be leaked in the response and could
include sensitive configuration or other data.

Matthew Thode (prometheanfire) wrote :

proposed impact description looks good if a bit dry.

Donny Davis (donny-g) wrote :

Jeremy, I think it looks great. Can you add Intel as my employer?

Thanks

Matt Riedemann (mriedem) wrote :

The impact description looks OK to me as well. I'm working on the ocata patch this morning.

Matt Riedemann (mriedem) wrote :

The ocata backports will also require https://review.opendev.org/#/c/483986/ for the functional test to work. It's a clean backport from pike so not a big deal.

Matt Riedemann (mriedem) wrote :

Here are the three ocata patches, applied in this order:

1. Implement-power_off-power_on-for-the-FakeDriver-ocata.patch
2. fix-unshelve-notification-test-instability-ocata.patch
3. WIP-Obfuscate-non-nova-server-fault-message-ocata.patch

Jeremy Stanley (fungi) on 2019-07-29
Changed in ossa:
status: Incomplete → Triaged
importance: Undecided → High
assignee: nobody → Jeremy Stanley (fungi)
Jeremy Stanley (fungi) wrote :

A CVE assignment has been requested with MITRE using the following impact description:

Title: Nova Server Resource Faults Leak External Exception Details
Reporter: Donny Davis (Intel)
Products: Nova
Affects: <17.0.12,>=18.0.0<18.2.2,>=19.0.0<19.0.2

Description:
Donny Davis with Intel reported a vulnerability in Nova Compute
resource fault handling. If an API request from an authenticated
user ends in a fault condition due to an external exception, details
of the underlying environment may be leaked in the response and
could include sensitive configuration or other data.

Changed in ossa:
status: Triaged → In Progress
Jeremy Stanley (fungi) on 2019-07-29
summary: - Error message reveals ceph information
+ Nova Server Resource Faults Leak External Exception Details
+ (CVE-2019-14433)

Jeremy Stanley (fungi) wrote :

According to our Embargoed Disclosure[*] timeline, assuming the patches attached to this bug are considered ready and we send the pre-OSSA notification to downstream stakeholders on Wednesday (to give those subscribed to this bug a reasonable chance to object to the schedule), the earliest we can open this bug, push changes to public code review, and publish the OSSA will be Tuesday, August 6. Does this work for everyone?

Tony Breeds (o-tony) wrote :

For what it's worth, this looks good from a stable/EM PoV.

Matt Riedemann (mriedem) wrote :

> publish the OSSA will be Tuesday, August 6. Does this work for everyone?

Works for me. Dan is on vacation this week but back next week.

melanie witt (melwitt) wrote :

>> publish the OSSA will be Tuesday, August 6. Does this work for everyone?

> Works for me. Dan is on vacation this week but back next week.

Works for me too, if my assistance is needed in any way.

Jeremy Stanley (fungi) wrote :

Downstream stakeholders have been privately provided with the patches for master and branches currently under stable maintenance, and notified of the impending disclosure date/time. If they have any feedback, I'll relay it here or subscribe them to this bug report so they can do so themselves.

Changed in ossa:
status: In Progress → Fix Committed
Corey Bryant (corey.bryant) wrote :

Thanks very much for your work on this. Just one nit comment. I think "obfuscate" might not be the right word to use in the commit message. Maybe just use "replace"? I think obfuscate is not a desired security approach in the security community. See "security through obscurity".

Matt Riedemann (mriedem) wrote :

Sure, will take that into account when pushing the patches to Gerrit for public review if it's all the same. I need to update the release note on the patches with the CVE as well when it goes up for public review.

Jeremy Stanley (fungi) wrote :

To be clear, we'll need to know the review links for each patch so we can incorporate them into the public advisory. The sequence on Tuesday will be:

1. Switch bug from Private Security to Public Security

2. Push updated commits for openstack/nova into Gerrit for final review/approval

3. Confirm that preliminary CI jobs are succeeding and approve changes

4. Push commit to openstack/ossa with advisory including links to fixes in review

5. Distribute advisory to appropriate public mailing lists

6. Confirm fixes have merged and submit stable point release requests

Jeremy Stanley (fungi) on 2019-08-06
information type: Private Security → Public Security
Jeremy Stanley (fungi) on 2019-08-06
summary: - Nova Server Resource Faults Leak External Exception Details
- (CVE-2019-14433)
+ [OSSA-2019-003] Nova Server Resource Faults Leak External Exception
+ Details (CVE-2019-14433)

Reviewed: https://review.opendev.org/674909
Committed: https://git.openstack.org/cgit/openstack/ossa/commit/?id=6b0b3a50e69e0a8f6f3207ed15a2aaaa30391580
Submitter: Zuul
Branch: master

commit 6b0b3a50e69e0a8f6f3207ed15a2aaaa30391580
Author: Jeremy Stanley <email address hidden>
Date: Tue Aug 6 14:43:44 2019 +0000

    Add OSSA-2019-003 (CVE-2019-14433)

    Change-Id: I22c4b17a0ad1b6197a97c6b2670fe5d1a6a7406f
    Related-Bug: #1837877

Reviewed: https://review.opendev.org/674821
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=298b337a16c0d10916b4431c436d19b3d6f5360e
Submitter: Zuul
Branch: master

commit 298b337a16c0d10916b4431c436d19b3d6f5360e
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 26 10:53:02 2019 -0400

    Replace non-nova server fault message

    The server fault "message" is always shown in the API
    server response, regardless of policy or user role.

    The fault "details" are only shown to users with the
    admin role when the fault code is 500.

    The problem with this is for non-nova exceptions, the
    fault message is a string-ified version of the exception
    (see nova.compute.utils.exception_to_dict) which can
    contain sensitive information which the non-admin owner
    of the server can see.

    This change adds a functional test to recreate the issue
    and a change to exception_to_dict which for the non-nova
    case changes the fault message by simply storing the
    exception type class name. Admins can still see the fault
    traceback in the "details" key of the fault dict in the
    server API response. Note that _get_fault_details is
    changed so that the details also includes the exception
    value which is what used to be in the fault message for
    non-nova exceptions. This is necessary so admins can still
    get the exception message with the traceback details.

    Note that nova exceptions with a %(reason)s replacement
    variable could potentially be leaking sensitive details as
    well but those would need to be cleaned up on a case-by-case
    basis since we don't want to change the behavior of all
    fault messages otherwise users might not see information
    like NoValidHost when their server goes to ERROR status
    during scheduling.

    SecurityImpact: This change contains a fix for CVE-2019-14433.

    Change-Id: I5e0a43ec59341c9ac62f89105ddf82c4a014df81
    Closes-Bug: #1837877

Changed in nova:
status: In Progress → Fix Released

Reviewed: https://review.opendev.org/674828
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=67651881163b75eb1983eaf753471a91ecec35eb
Submitter: Zuul
Branch: stable/stein

commit 67651881163b75eb1983eaf753471a91ecec35eb
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 26 10:53:02 2019 -0400

    Replace non-nova server fault message

    The server fault "message" is always shown in the API
    server response, regardless of policy or user role.

    The fault "details" are only shown to users with the
    admin role when the fault code is 500.

    The problem with this is for non-nova exceptions, the
    fault message is a string-ified version of the exception
    (see nova.compute.utils.exception_to_dict) which can
    contain sensitive information which the non-admin owner
    of the server can see.

    This change adds a functional test to recreate the issue
    and a change to exception_to_dict which for the non-nova
    case changes the fault message by simply storing the
    exception type class name. Admins can still see the fault
    traceback in the "details" key of the fault dict in the
    server API response. Note that _get_fault_details is
    changed so that the details also includes the exception
    value which is what used to be in the fault message for
    non-nova exceptions. This is necessary so admins can still
    get the exception message with the traceback details.

    Note that nova exceptions with a %(reason)s replacement
    variable could potentially be leaking sensitive details as
    well but those would need to be cleaned up on a case-by-case
    basis since we don't want to change the behavior of all
    fault messages otherwise users might not see information
    like NoValidHost when their server goes to ERROR status
    during scheduling.

    SecurityImpact: This change contains a fix for CVE-2019-14433.

    Change-Id: I5e0a43ec59341c9ac62f89105ddf82c4a014df81
    Closes-Bug: #1837877
    (cherry picked from commit 298b337a16c0d10916b4431c436d19b3d6f5360e)

Reviewed: https://review.opendev.org/674848
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e0b91a5b1e89bd0506dc6da86bc61f1708f0215a
Submitter: Zuul
Branch: stable/rocky

commit e0b91a5b1e89bd0506dc6da86bc61f1708f0215a
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 26 10:53:02 2019 -0400

    Replace non-nova server fault message

    [Commit message body identical to the first commit above.]

    NOTE(mriedem): The functional test imports change here
    because Idaed39629095f86d24a54334c699a26c218c6593 is not
    in Rocky so the PlacementFixture comes from nova_fixtures.

    Change-Id: I5e0a43ec59341c9ac62f89105ddf82c4a014df81
    Closes-Bug: #1837877
    (cherry picked from commit 298b337a16c0d10916b4431c436d19b3d6f5360e)
    (cherry picked from commit 67651881163b75eb1983eaf753471a91ecec35eb)

Reviewed: https://review.opendev.org/674859
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3dcefba60a4f4553888a9dfda9fe3bee094d617a
Submitter: Zuul
Branch: stable/queens

commit 3dcefba60a4f4553888a9dfda9fe3bee094d617a
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 26 10:53:02 2019 -0400

    Replace non-nova server fault message

    [Commit message body identical to the first commit above.]

    Change-Id: I5e0a43ec59341c9ac62f89105ddf82c4a014df81
    Closes-Bug: #1837877
    (cherry picked from commit 298b337a16c0d10916b4431c436d19b3d6f5360e)
    (cherry picked from commit 67651881163b75eb1983eaf753471a91ecec35eb)
    (cherry picked from commit e0b91a5b1e89bd0506dc6da86bc61f1708f0215a)

Jeremy Stanley (fungi) wrote :

The extended maintenance branch patches are still up for review, but since all the stable maintenance branch patches have now merged I'm closing the security advisory task.

Changed in ossa:
status: Fix Committed → Fix Released

Reviewed: https://review.opendev.org/674877
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6da28b0aa9b6a0ba67460f88dd2c397605b0679b
Submitter: Zuul
Branch: stable/pike

commit 6da28b0aa9b6a0ba67460f88dd2c397605b0679b
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 26 10:53:02 2019 -0400

    Replace non-nova server fault message

    [Commit message body identical to the first commit above.]

    Change-Id: I5e0a43ec59341c9ac62f89105ddf82c4a014df81
    Closes-Bug: #1837877
    (cherry picked from commit 298b337a16c0d10916b4431c436d19b3d6f5360e)
    (cherry picked from commit 67651881163b75eb1983eaf753471a91ecec35eb)
    (cherry picked from commit e0b91a5b1e89bd0506dc6da86bc61f1708f0215a)
    (cherry picked from commit 3dcefba60a4f4553888a9dfda9fe3bee094d617a)

This issue was fixed in the openstack/nova 19.0.2 release.

This issue was fixed in the openstack/nova 18.2.2 release.

Reviewed: https://review.opendev.org/674908
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=02ea2c25eddebdc220d66e4a01d8deded7c77a57
Submitter: Zuul
Branch: stable/ocata

commit 02ea2c25eddebdc220d66e4a01d8deded7c77a57
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 26 10:53:02 2019 -0400

    Replace non-nova server fault message

    [Commit message body identical to the first commit above.]

    NOTE(mriedem): In this backport the functional test is
    modified slightly to remove the DiskFilter since we are
    using placement for scheduler filtering on DISK_GB.

    Change-Id: I5e0a43ec59341c9ac62f89105ddf82c4a014df81
    Closes-Bug: #1837877
    (cherry picked from commit 298b337a16c0d10916b4431c436d19b3d6f5360e)
    (cherry picked from commit 67651881163b75eb1983eaf753471a91ecec35eb)
    (cherry picked from commit e0b91a5b1e89bd0506dc6da86bc61f1708f0215a)
    (cherry picked from commit 3dcefba60a4f4553888a9dfda9fe3bee094d617a)
    (cherry picked from commit 6da28b0aa9b6a0ba67460f88dd2c397605b0679b)

This issue was fixed in the openstack/nova 17.0.12 release.

This report contains Public Security information. Everyone can see this security related information.