Attach Volume Fails with secure call to cinder

Bug #1752152 reported by Divya K Konoor
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Medium
Eric Fried
Queens
Fix Committed
Medium
Eric Fried
python-cinderclient
Invalid
Undecided
Unassigned

Bug Description

It is found that when cinder endpoint is configured to use https, attach volume flow fails with the stack trace seen below (seen in nova api log) because it fails to make a secure call from nova to cinder. Secure calls perform certificate validation and in this particular flow, certificate validation is completely skipped

File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 3971, in attach_volume
2018-02-27 08:16:51.338 1324 ERROR cinder.is_microversion_supported(context, '3.44')
2018-02-27 08:16:51.338 1324 ERROR File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 138, in is_microversion_supported
2018-02-27 08:16:51.338 1324 ERROR _check_microversion(url, microversion)
2018-02-27 08:16:51.338 1324 ERROR File "/usr/lib/python2.7/site-packages/nova/volume/cinder.py", line 86, in _check_microversion
2018-02-27 08:16:51.338 1324 ERROR max_api_version = cinder_client.get_highest_client_server_version(url)
2018-02-27 08:16:51.338 1324 ERROR File "/usr/lib/python2.7/site-packages/cinderclient/client.py", line 126, in get_highest_client_server_version
2018-02-27 08:16:51.338 1324 ERROR min_server, max_server = get_server_version(url)
2018-02-27 08:16:51.338 1324 ERROR File "/usr/lib/python2.7/site-packages/cinderclient/client.py", line 109, in get_server_version
2018-02-27 08:16:51.338 1324 ERROR response = requests.get(version_url)
2018-02-27 08:16:51.338 1324 ERROR File "/usr/lib/python2.7/site-packages/requests/api.py", line 72, in get
2018-02-27 08:16:51.338 1324 ERROR return request('get', url, params=params, **kwargs)
2018-02-27 08:16:51.338 1324 ERROR File "/usr/lib/python2.7/site-packages/requests/api.py", line 58, in request
2018-02-27 08:16:51.338 1324 ERROR return session.request(method=method, url=url, **kwargs)
2018-02-27 08:16:51.338 1324 ERROR File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 502, in request
2018-02-27 08:16:51.338 1324 ERROR resp = self.send(prep, **send_kwargs)
2018-02-27 08:16:51.338 1324 ERROR File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 612, in send
2018-02-27 08:16:51.338 1324 ERROR r = adapter.send(request, **kwargs)
2018-02-27 08:16:51.338 1324 ERROR File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 504, in send
2018-02-27 08:16:51.338 1324 ERROR raise ConnectionError(e, request=request)
2018-02-27 08:16:51.338 1324 ERROR ConnectionError: HTTPSConnectionPool(host='ipx-x-x-x.xxx.xxx.xxx.com', port=9000): Max retries exceeded with url: / (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'SSL3_GET_SERVER_CERTIFICATE', 'certificate verify failed')],)",),))

This is a regression bug introduced as part of changeset https://review.openstack.org/#/c/469579/, which was merged way back in June 2017. As part of this changeset, a new function namely _check_microversion was introduced, which then makes a cinderclient call , which finally makes a cinder https REST api call without passing the certificate. This leads to the problem listed above.

https://github.com/openstack/nova/blob/stable/queens/nova/volume/cinder.py#L75
https://github.com/openstack/nova/blob/stable/queens/nova/volume/cinder.py#L86

https://github.com/openstack/python-cinderclient/blob/stable/queens/cinderclient/client.py#L126
https://github.com/openstack/python-cinderclient/blob/stable/queens/cinderclient/client.py#L109

Tags: cinder volumes
affects: cinder → nova
Changed in nova:
assignee: nobody → Divya K Konoor (dikonoor)
tags: added: cinder volumes
Matt Riedemann (mriedem)
Changed in nova:
status: New → Triaged
Revision history for this message
Matt Riedemann (mriedem) wrote :

Ouch, good find. https://review.openstack.org/#/c/508345/ could replace the microversion checks via cinderclient if we use the keystoneauth1 Adapter object, but that's not something we can backport to queens and pike.

How terrible would it be if we simply did a find/replace of "https" with "http" in the version URL we're checking here:

https://github.com/openstack/python-cinderclient/blob/3.5.0/cinderclient/client.py#L109

The version document in cinder clearly doesn't require a token, so SSL shouldn't be a requirement either it seems.

Revision history for this message
Matt Riedemann (mriedem) wrote :

The alternative is on the nova side, we just construct a cinderclient Client object and use it's internal client (session) to make a request, or use a keystoneauth1 adapter to make the request. I'm not sure if all of the version negotiation checking that's happening in cinderclient is something we need to worry about, i.e. what happens in here:

https://github.com/openstack/python-cinderclient/blob/3.5.0/cinderclient/client.py#L124

That's just making sure that both the cinderclient and server support the microversion that nova cares about. Nova can control the minimum required version of cinderclient via requirements.txt, and could get the server version via keystoneauth1.

Revision history for this message
Matt Riedemann (mriedem) wrote :

I think we can mark this as invalid for cinderclient as there isn't anything we can backport and require with a new minimum version on stable branches for cinderclient, so nova likely has to solve this alone.

Changed in python-cinderclient:
status: New → Invalid
Revision history for this message
Matt Riedemann (mriedem) wrote :

I'm not sure why this isn't a problem in devstack in our CI environment because tls-proxy is enabled which makes the endpoints get set using https, so I don't see how this is a problem for you but not in devstack?

Revision history for this message
Eric Fried (efried) wrote :

Getting the version document over http isn't the answer. Let's just duplicate the cinderclient.client.get_highest_client_server_version method in nova - at least for the stable version of the fix - but using the _SESSION instead of a raw requests.get to get the version document.

Revision history for this message
Matt Riedemann (mriedem) wrote :

This is hitting the version document too:

http://logs.openstack.org/45/508345/13/check/nova-next/aa61d86/logs/screen-c-api.txt.gz#_Mar_15_20_00_06_201395

Mar 15 20:00:06.201395 ubuntu-xenial-inap-mtl01-0002993032 <email address hidden>[32240]: INFO cinder.api.openstack.wsgi [req-2e3a55fd-3ed7-4b7f-b902-c4671afa83c4 req-d6e11ad6-95b0-42ed-bcb2-af3996a69096 admin admin] GET https://198.72.124.205/volume/
Mar 15 20:00:06.201628 ubuntu-xenial-inap-mtl01-0002993032 <email address hidden>[32240]: DEBUG cinder.api.openstack.wsgi [req-2e3a55fd-3ed7-4b7f-b902-c4671afa83c4 req-d6e11ad6-95b0-42ed-bcb2-af3996a69096 admin admin] Empty body provided in request {{(pid=32243) get_body /opt/stack/new/cinder/cinder/api/openstack/wsgi.py:718}}
Mar 15 20:00:06.202003 ubuntu-xenial-inap-mtl01-0002993032 <email address hidden>[32240]: DEBUG cinder.api.openstack.wsgi [req-2e3a55fd-3ed7-4b7f-b902-c4671afa83c4 req-d6e11ad6-95b0-42ed-bcb2-af3996a69096 admin admin] Calling method '<bound method VersionsController.all of <cinder.api.versions.VersionsController object at 0x7f81c53a0550>>' {{(pid=32243) _process_stack /opt/stack/new/cinder/cinder/api/openstack/wsgi.py:872}}
Mar 15 20:00:06.202926 ubuntu-xenial-inap-mtl01-0002993032 <email address hidden>[32240]: INFO cinder.api.openstack.wsgi [req-2e3a55fd-3ed7-4b7f-b902-c4671afa83c4 req-d6e11ad6-95b0-42ed-bcb2-af3996a69096 admin admin] https://198.72.124.205/volume/ returned with HTTP 300

Revision history for this message
Matt Riedemann (mriedem) wrote :

I'm also beginning to wonder how it is that if this was regressed since Pike, how is it we haven't heard more about this issue? I assume most productions clouds are using SSL.

Changed in nova:
status: Triaged → Incomplete
no longer affects: nova/pike
no longer affects: nova/queens
Revision history for this message
Divya K Konoor (dikonoor) wrote :

mriedem, I am using Queens and the url used is https. The request library by default does secure communication unless specified otherwise.

https://github.com/openstack/python-cinderclient/blob/3.5.0/cinderclient/client.py#L109

response = requests.get(version_url)

So in any case where a https url used and the certificate is not specified (in cases where the root certificate is not present at the client making the call), this is going to complain saying certificate verify failed.

So the above line of code in cinderclient should be fixed simply because it does not provide an option for the sys admin to either specify the certificate path or say the call has to be insecure.

The nova.conf [cinder] section has the following options, which can be utilized while connecting to the URL specified in the same section:

#catalog_info = volumev3:cinderv3:publicURL

# PEM encoded Certificate Authority to use when verifying HTTPs connections.
# (string value)
#cafile = <None>

# PEM encoded client certificate cert file (string value)
#certfile = <None>

# PEM encoded client certificate key file (string value)
#keyfile = <None>

# Verify HTTPS connections. (boolean value)
#insecure = false

description: updated
Revision history for this message
Divya K Konoor (dikonoor) wrote :

In this specific case of getting the server version url, the url is https://<cinder ip>:9000/. As Matt mentioned above, this returns different version information and doesn't require authentication. This doesn't really have to be a secure call in my opinion (verify can be set to False).

Changed in nova:
assignee: Divya K Konoor (dikonoor) → Eric Fried (efried)
status: Incomplete → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/557508

Revision history for this message
Eric Fried (efried) wrote :

@Divya, would you please try https://review.openstack.org/#/c/557508/ in your environment and let us know if it fixes the issue?

Revision history for this message
Divya K Konoor (dikonoor) wrote :

@efried, patch seems to be working fine with vm deploy/delete, volume attach/delete flows

Changed in nova:
assignee: Eric Fried (efried) → Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → Eric Fried (efried)
Revision history for this message
Matt Riedemann (mriedem) wrote :

I know this is at least needed in Queens but I'm not sure if we need this in Pike. Need to see if anything is using this code in Pike.

Revision history for this message
Gerald McBrearty (gfm-r) wrote :

Eric, FYI we live tested this in one of our environments this morning. Success!

Revision history for this message
Eric Fried (efried) wrote :

Thanks all. We're working to push it through.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/557837

melanie witt (melwitt)
Changed in nova:
importance: Undecided → Medium
Revision history for this message
Matt Riedemann (mriedem) wrote :

I audited the stable/pike code. While https://review.openstack.org/#/c/469579/ went into pike, and is used if we actually create/update/delete volume attachments, we don't actually call any of those flows. We didn't start creating new style attachments until Queens. Some of the plumbing was added in Pike, but not actually 'turned on' in the API at that point, so we shouldn't need to backport any fixes to stable/pike.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/557508
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=20eaaee2334957eb8739ecca524a1c4aa9f246e9
Submitter: Zuul
Branch: master

commit 20eaaee2334957eb8739ecca524a1c4aa9f246e9
Author: Eric Fried <email address hidden>
Date: Wed Mar 28 15:45:26 2018 -0500

    Use ksa session for cinder microversion check

    [1] added a method to validate availability of a desired version of the
    cinder API. This method called into
    cinderclient.client.get_highest_client_server_version to
    (unsurprisingly) discover the highest available version to compare
    against. However, that routine uses a raw requests.get to access the
    version document from the server. This breaks when the endpoint URL is
    using HTTPS, because nothing sets up the cert info for that call.

    With this change, we work around the issue by duplicating the logic from
    get_highest_client_server_version, but doing the version discovery via
    the same keystoneauth1 session that's configured for use with the client
    itself, thus inheriting any SSL configuration as appropriate.

    [1] https://review.openstack.org/#/c/469579/

    Change-Id: I4de355195281009a5979710d7f14ae8ea11d10e0
    Closes-Bug: #1752152

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.openstack.org/557837
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=307382f58d38778b480d2d030e427759a44c204b
Submitter: Zuul
Branch: stable/queens

commit 307382f58d38778b480d2d030e427759a44c204b
Author: Eric Fried <email address hidden>
Date: Wed Mar 28 15:45:26 2018 -0500

    Use ksa session for cinder microversion check

    [1] added a method to validate availability of a desired version of the
    cinder API. This method called into
    cinderclient.client.get_highest_client_server_version to
    (unsurprisingly) discover the highest available version to compare
    against. However, that routine uses a raw requests.get to access the
    version document from the server. This breaks when the endpoint URL is
    using HTTPS, because nothing sets up the cert info for that call.

    With this change, we work around the issue by duplicating the logic from
    get_highest_client_server_version, but doing the version discovery via
    the same keystoneauth1 session that's configured for use with the client
    itself, thus inheriting any SSL configuration as appropriate.

    [1] https://review.openstack.org/#/c/469579/

    Change-Id: I4de355195281009a5979710d7f14ae8ea11d10e0
    Closes-Bug: #1752152
    (cherry picked from commit 20eaaee2334957eb8739ecca524a1c4aa9f246e9)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.2

This issue was fixed in the openstack/nova 17.0.2 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.0.0.0b1

This issue was fixed in the openstack/nova 18.0.0.0b1 development milestone.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.