Comment 7 for bug 1851587

Matt Riedemann (mriedem) wrote :

From the fix for bug 1837877 https://review.opendev.org/#/c/674821/:

"Note that nova exceptions with a %(reason)s replacement
variable could potentially be leaking sensitive details as
well but those would need to be cleaned up on a case-by-case
basis since we don't want to change the behavior of all
fault messages otherwise users might not see information
like NoValidHost when their server goes to ERROR status
during scheduling."

In this case HypervisorUnavailable is a NovaException so it's treated differently:

https://github.com/openstack/nova/blob/a90fe1951200ebd27fe74788c0a96c01104ac2cf/nova/exception.py#L508

As I said above, this could likely show up in fault messages in a lot of places: anywhere the ComputeManager uses the wrap_instance_fault decorator to inject a fault when an exception is raised, and anywhere that changes the instance status to ERROR, e.g. a failed rebuild:

https://github.com/openstack/nova/blob/a90fe1951200ebd27fe74788c0a96c01104ac2cf/nova/compute/manager.py#L3061

https://github.com/openstack/nova/blob/a90fe1951200ebd27fe74788c0a96c01104ac2cf/nova/compute/manager.py#L3145

So one question is, do we need to start whitelisting certain exceptions?

And if we do, how? Because the API will always show the message:

https://github.com/openstack/nova/blob/a90fe1951200ebd27fe74788c0a96c01104ac2cf/nova/api/openstack/compute/views/servers.py#L331

but only show the details (traceback) for admins or for non-500 error cases (I guess, that's weird):

https://github.com/openstack/nova/blob/a90fe1951200ebd27fe74788c0a96c01104ac2cf/nova/api/openstack/compute/views/servers.py#L341
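To make the rule concrete, here is a simplified sketch of the visibility logic as described above. It is not the actual nova view code; the function name and the dict layout are illustrative assumptions.

```python
def build_fault_view(fault, is_admin):
    """Return the fault fields exposed to the caller (illustrative only)."""
    view = {
        "code": fault["code"],
        # The message is always included, regardless of who is asking.
        "message": fault["message"],
    }
    # Details (the traceback) are included for admins, or for any
    # fault whose code is not 500.
    if fault.get("details") and (is_admin or fault["code"] != 500):
        view["details"] = fault["details"]
    return view
```

With this rule a non-admin never sees the traceback of a 500 fault, but still sees its message, which is where the host details leak.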

When I was working on the CVE fix above, I found it's complicated to know, at the point where we inject the fault, what should be shown based on context.is_admin, because an admin could be rebuilding some non-admin's server, so we can't really base things on that.

If we only showed the fault message in the API for admins in 500 code cases, then non-admin users would no longer see NoValidHost.

Do we need to get so granular that we need to set an attribute on each class of nova exception indicating if its fault message can be exposed to non-admins? That would be hard to maintain I imagine, but maybe it would just start with HypervisorUnavailable and we build on that for other known types of nova exceptions that leak host details?
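
A rough sketch of what that per-class attribute could look like. The classes below are simplified stand-ins rather than the real nova.exception classes, and the attribute name expose_fault_message, the helper fault_message_for, and the message formats are all hypothetical, made up for illustration. Falling back to the class name when the message is hidden is also just an assumption of this sketch.

```python
class NovaException(Exception):
    """Simplified stand-in for nova.exception.NovaException."""
    msg_fmt = "An unknown exception occurred."
    # Hypothetical flag: may the formatted message be shown to non-admins?
    expose_fault_message = True

    def __init__(self, **kwargs):
        super().__init__(self.msg_fmt % kwargs)


class NoValidHost(NovaException):
    msg_fmt = "No valid host was found. %(reason)s"


class HypervisorUnavailable(NovaException):
    # Illustrative format; the point is that it embeds the host name.
    msg_fmt = "Connection to the hypervisor is broken on host: %(host)s"
    expose_fault_message = False


def fault_message_for(exc, is_admin):
    """Pick the message to expose for exc (hypothetical helper)."""
    # Exceptions without the flag (non-nova exceptions) are treated as unsafe.
    if is_admin or getattr(exc, "expose_fault_message", False):
        return str(exc)
    # Otherwise fall back to the class name only.
    return exc.__class__.__name__
```

Here a non-admin would still get the full NoValidHost message but only "HypervisorUnavailable" for the hypervisor case. Note the sketch decides at read time with the exception object in hand; the real fault record only stores a message, so something like the flag's value would have to be persisted with the fault.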