Nova-API uses Keystone's public endpoint for project id verification

Bug #1716344 reported by Christoph Fiehe
This bug affects 3 people
| Affects                  | Status        | Importance | Assigned to | Milestone |
| OpenStack Compute (nova) | Fix Released  | Medium     | jichenjc    |           |
| Pike                     | Fix Committed | Medium     | jichenjc    |           |

Bug Description

I have set up a fresh HA deployment of OpenStack Pike on Ubuntu 16.04. I noticed in the logs that Nova fails during VM creation with the following exception:

2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity [req-6efab9e1-78f5-4e85-8247-686ff4f3568c dddfba8e02f746799a6408a523e6cd25 ed2d2efd86dd40e7a45491d8502318d3 - default default] Unable to contact keystone to verify project_id: SSLError: SSL exception connecting to https://os-cloud.mycompany.com:5000/v3/projects/ed2d2efd86dd40e7a45491d8502318d3: ("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",)
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity Traceback (most recent call last):
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity File "/usr/lib/python2.7/dist-packages/nova/api/openstack/identity.py", line 42, in verify_project_id
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity raise_exc=False)
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity File "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 845, in get
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity return self.request(url, 'GET', **kwargs)
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity File "/usr/lib/python2.7/dist-packages/positional/__init__.py", line 101, in inner
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity return wrapped(*args, **kwargs)
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity File "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 703, in request
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity resp = send(**kwargs)
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity File "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 765, in _send_request
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity raise exceptions.SSLError(msg)
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity SSLError: SSL exception connecting to https://os-cloud.mycompany.com:5000/v3/projects/ed2d2efd86dd40e7a45491d8502318d3: ("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",)
2017-09-11 09:31:28.909 5604 ERROR nova.api.openstack.identity

Keystone's public endpoint should only be visible to external clients. All internal OpenStack services should use the internalURL for authentication purposes. I think my configuration is correct: "auth_url" points to Keystone's internal URL, whereas "auth_uri" points to Keystone's public endpoint. I want to avoid HTTPS-based communication for my internal cloud services.

$ openstack endpoint list | grep keystone
| 00a22bfee72141ddadacd0e357161075 | RegionOne | keystone | identity | True | internal | http://os-identity.mycompany.com:5000/v3 |
| 7178e534cb4e4c5e9a67066ff3e9c433 | RegionOne | keystone | identity | True | public | https://os-cloud.mycompany.com:5000/v3 |
| f5ed3bba70274d7fa03f2ceaab96c3c9 | RegionOne | keystone | identity | True | admin | http://os-identity.mycompany.com:35357/v3 |

################
nova.conf
################
...
[keystone_authtoken]
auth_type = password
auth_uri = http://os-cloud.mycompany.com:5000
auth_url = http://os-identity:35357
memcached_servers = os-memcache:11211
password = novapass
project_domain_name = default
project_name = service
user_domain_name = default
username = nova
...

Can someone please have a look?

Christoph Fiehe (fiehe)
description: updated
Christoph Fiehe (fiehe)
summary: - Nova-API sometimes uses Keystone's public endpoint
+ Nova-API uses Keystone's public endpoint for project id verification
description: updated
description: updated
Revision history for this message
Christoph Fiehe (fiehe) wrote :

It took some time to narrow down the problem. The issue was introduced with the Pike release, when project id verification for flavor access and quota modification was added.

The problem is caused by "nova/api/openstack/identity.py" (lines 37-42):
...
resp = sess.get('/projects/%s' % project_id,
                endpoint_filter={
                    'service_type': 'identity',
                    'version': (3, 0)
                },
                raise_exc=False)
...
Keystone's endpoint is retrieved from the service catalog without any configuration option for which interface to use. The session calls the "get_endpoint(...)" method of the "_ContextAuthPlugin" authentication plugin provided by "nova/context.py", which forwards the call to the "url_for" method of "keystoneauth/keystoneauth1/access/service_catalog.py", where the default value "public" for the "interface" parameter gets applied.

To solve this, we would have to add a configuration option that tells nova which interface to use when looking up the "identity" service type in the service catalog.
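
For illustration, a minimal sketch of what such an option could look like, using oslo.config. The option name "endpoint_interface" and the [keystone] group are hypothetical (nova has no such option today), and "sess" and "project_id" are assumed to come from the surrounding verify_project_id code:

from oslo_config import cfg

CONF = cfg.CONF
# Hypothetical option: which catalog interface to use for identity lookups.
CONF.register_opts(
    [cfg.StrOpt('endpoint_interface',
                default='public',
                choices=['public', 'internal', 'admin'],
                help='Interface used when looking up the identity '
                     'service in the Keystone service catalog.')],
    group='keystone')

# Pass the configured interface through to the service catalog lookup so
# keystoneauth no longer falls back to its default of "public".
resp = sess.get('/projects/%s' % project_id,
                endpoint_filter={
                    'service_type': 'identity',
                    'interface': CONF.keystone.endpoint_interface,
                    'version': (3, 0)
                },
                raise_exc=False)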

Is there really no other way to retrieve the endpoint of the identity service?

Revision history for this message
Christoph Fiehe (fiehe) wrote :

With an additional dependency on "keystoneclient", we are able to get rid of the problem. This requires a small code modification in "nova/api/openstack/identity.py":

from keystoneauth1 import session
from keystoneclient.v3 import client
from keystoneclient import exceptions as ks_exc
from oslo_log import log as logging
import webob

from nova.i18n import _

LOG = logging.getLogger(__name__)

def verify_project_id(context, project_id):
    """verify that a project_id exists.

    This attempts to verify that a project id exists. If it does not,
    an HTTPBadRequest is emitted.

    """
    auth = context.get_auth_plugin()
    sess = session.Session(auth=auth)
    keystone = client.Client(session=sess)
    try:
        project = keystone.projects.get(project_id)
    except ks_exc.ClientException as e:
        if e.http_status == 404:
            raise webob.exc.HTTPBadRequest(
                explanation=_("Project ID %s is not a valid project.") %
                project_id)
        elif e.http_status == 403:
            # we don't have enough permission to verify this, so default
            # to "it's ok".
            LOG.info(
                "Insufficient permissions for user %(user)s to verify "
                "existence of project_id %(pid)s",
                {"user": context.user_id, "pid": project_id})
        else:
            # Note: there is no "resp" object in this version; report the
            # keystoneclient exception itself instead.
            LOG.warning(
                "Unexpected response from keystone trying to "
                "verify project_id %(pid)s - error: %(err)s",
                {"pid": project_id,
                 "err": e})
            # realize we did something wrong, but move on with a warning

Any comments?

Revision history for this message
jichenjc (jichenjc) wrote :
Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

@jichenjc, I think the fix for the bug you mention *could* solve the certificate verification failure, but it doesn't address the fact that we only get the endpoint URL from the service catalog by looking up the public one, even for internal communications.

That said, I'm not particularly an expert on that, so I defer to others to help triage this bug.

Revision history for this message
Matt Riedemann (mriedem) wrote :

The original description of this bug says it makes instance create fail, which would be incorrect. As stated, keystone is only used for project_id verification when updating quota or changing flavor access. There was a conscious decision to hard-code the public endpoint as the interface when this change was made, but we changed that hard-coding for nova talking to the placement endpoint so I don't see why we wouldn't also allow different endpoints for talking to keystone. And actually, the KSA adapter stuff that Eric Fried is working on in Queens all defaults to using the internal interface I think.

Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
Matt Riedemann (mriedem) wrote :

This is the similar change we made for nova making requests to placement:

https://review.openstack.org/#/q/Ic996e596f8473c0b8626e8d0e92e1bf58044b4f8

Revision history for this message
Matt Riedemann (mriedem) wrote :

To clarify, are you unable to update quota and flavor access? Or are you just seeing errors in the logs now but the APIs still work?

Revision history for this message
Sean Dague (sdague) wrote :

What actual API calls are returning 400s? Because VM Create shouldn't be touching this.

We are dumping errors when we can't talk to keystone; we could make those less verbose. It would be good to know whether these were just stack traces while the APIs in question were still working, or whether this was more pernicious than that.

Revision history for this message
Matt Riedemann (mriedem) wrote :

Looks like the APIs will still function, but you get a stacktrace:

https://github.com/openstack/nova/blob/16.0.0/nova/api/openstack/identity.py#L52
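
For reference, a rough sketch of the "less verbose" handling discussed above, assuming a try/except around the sess.get() call quoted earlier in this bug. The "kse" alias for keystoneauth1.exceptions and the return value are assumptions for illustration, not the actual nova code:

from keystoneauth1 import exceptions as kse

try:
    resp = sess.get('/projects/%s' % project_id,
                    endpoint_filter={'service_type': 'identity',
                                     'version': (3, 0)},
                    raise_exc=False)
except kse.ClientException:
    # Log at info level instead of dumping a full traceback, and treat the
    # project as valid since it could not be verified either way.
    LOG.info("Unable to contact keystone to verify project_id %s; "
             "assuming it is valid.", project_id)
    return True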

Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
jichenjc (jichenjc)
Changed in nova:
assignee: nobody → jichenjc (jichenjc)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/513243

Changed in nova:
status: Confirmed → In Progress
Changed in nova:
assignee: jichenjc (jichenjc) → Matt Riedemann (mriedem)
Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → jichenjc (jichenjc)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/513243
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1606467b29969eb45efbb56c1b148a4a6f53c5cf
Submitter: Zuul
Branch: master

commit 1606467b29969eb45efbb56c1b148a4a6f53c5cf
Author: jichenjc <email address hidden>
Date: Wed Oct 18 11:20:51 2017 +0800

    Downgrade log for keystone verify client fail

    Under some circumstances the keystone verify process might fail
    but we are able to proceed because it's client setting error,
    so we don't need to report an exception log in the log file to
    confuse admin, instead, use an info log.

    In the reported bug, the issue is that nova is configured for
    the 'internal' identity endpoint but the nova code does not
    pass an interface, so KSA defaults to 'public' which fails.
    This is fixed with I2204c8bed8936d5bed0f410284d2a563f84e7100
    but not something we can backport, so this is a simple change
    to make the logging less annoying.

    Closes-Bug: 1716344

    Change-Id: I67c9f648f85de364de443e2a0535ddd361c14661

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/525475

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.0.0b2

This issue was fixed in the openstack/nova 17.0.0.0b2 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/525475
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=02af3d5901968927222c153cd768295f6b113fb0
Submitter: Zuul
Branch: stable/pike

commit 02af3d5901968927222c153cd768295f6b113fb0
Author: jichenjc <email address hidden>
Date: Wed Oct 18 11:20:51 2017 +0800

    Downgrade log for keystone verify client fail

    Under some circumstances the keystone verify process might fail
    but we are able to proceed because it's client setting error,
    so we don't need to report an exception log in the log file to
    confuse admin, instead, use an info log.

    In the reported bug, the issue is that nova is configured for
    the 'internal' identity endpoint but the nova code does not
    pass an interface, so KSA defaults to 'public' which fails.
    This is fixed with I2204c8bed8936d5bed0f410284d2a563f84e7100
    but not something we can backport, so this is a simple change
    to make the logging less annoying.

    Closes-Bug: 1716344

    Change-Id: I67c9f648f85de364de443e2a0535ddd361c14661
    (cherry picked from commit 1606467b29969eb45efbb56c1b148a4a6f53c5cf)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.4

This issue was fixed in the openstack/nova 16.0.4 release.

Revision history for this message
Alexandru Avadanii (alexandru-avadanii) wrote :

For the record, this bug *does* break things (quotas) if the admin/internal networks can't access the public network (complete isolation).
Christoph Fiehe's solution works in that scenario too. Thank you for that snippet!

Revision history for this message
Logan V (loganv) wrote :

I am also seeing nova-api attempting to use the keystone public endpoint when /v2.1/os-quota-sets is called on my Pike deployment. This is not valid in my environment; the API must use the internal endpoint to reach keystone. When the public endpoint is used, the connection sits in SYN_SENT state in netstat until it times out after a minute or two.

Hacking the endpoint_filter at https://github.com/openstack/nova/blob/d536bec9fc098c9db8d46f39aab30feb0783e428/nova/api/openstack/identity.py#L43-L46 to include interface=internal fixes the issue.
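
For anyone else working around this locally, a sketch of the modification described above; it mirrors the snippet quoted earlier in this bug from nova/api/openstack/identity.py, with only the interface key added:

resp = sess.get('/projects/%s' % project_id,
                endpoint_filter={
                    'service_type': 'identity',
                    # Force the lookup to the internal identity endpoint
                    # instead of keystoneauth's default of "public".
                    'interface': 'internal',
                    'version': (3, 0)
                },
                raise_exc=False)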

Unless I am mistaken this issue still exists in master:
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/api/openstack/identity.py#L33-L35

Revision history for this message
Logan V (loganv) wrote :

Disregard #18. It is not really related to this bug. (This bug is about session parameters; my bug is about auth parameters.)
https://bugs.launchpad.net/nova/+bug/1751349
