Neutron Metadata Agent broken on Keystone v3 in Kilo (and probably Liberty)

Bug #1590957 reported by Ian Cordasco
This bug affects 1 person
Affects            Status        Importance  Assigned to   Milestone
OpenStack-Ansible  Invalid       Undecided   Ian Cordasco
Kilo               Fix Released  Medium      Ian Cordasco
Liberty            Fix Released  Medium      Ian Cordasco

Bug Description

We have a customer that launches roughly 700 (usually more) instances at once for burst workloads on OpenStack Ansible Kilo. When they do, Neutron Metadata Agent starts returning 500 errors.

We spent many hours tracking the source of the 500 errors and we have found *one part* of the root cause. Here's what happens:

In Kilo, the Neutron Metadata Agent (henceforth NMA) attempts to talk to the Neutron API over RPC (in OSA's case, RabbitMQ). If that fails, it then attempts to use Neutronclient to talk to the API over HTTP.
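To make the fallback order concrete, here is a minimal sketch of the "RPC first, HTTP second" pattern described above. It is not the agent's actual code: `RpcUnavailable`, `get_ports`, `broken_rpc`, and `working_http` are hypothetical names used only for illustration.

```python
class RpcUnavailable(Exception):
    """Hypothetical error standing in for a failed RPC call."""


def get_ports(rpc_call, http_call):
    """Try the RPC path first; fall back to HTTP only if RPC fails."""
    try:
        return rpc_call()
    except RpcUnavailable:
        # The fallback needs working Keystone credentials. If the HTTP
        # call fails as well, its exception propagates to the caller.
        return http_call()


def broken_rpc():
    # Simulates the message queue not answering.
    raise RpcUnavailable("no reply from message queue")


def working_http():
    # Simulates a successful HTTP request to the API.
    return ["port-1"]


result = get_ports(broken_rpc, working_http)
print(result)
```

The point of the sketch is that a failure in the second path is not caught by anything, which matches the unhandled exception we saw once the HTTP fallback could not authenticate.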

In /etc/neutron/metadata-agent.ini we have auth_url = https://<internalvip>:35357/v3 for Keystone, but if you follow the code path through Neutronclient, it uses internal code that *only* works with Keystone v2. This means that the HTTP fallback ends up receiving a 404 from Keystone, which causes an unhandled exception in Neutron.

The mitigation for this is to fix the auth_url in metadata-agent.ini to use /v2.0 instead of /v3.
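As a rough illustration of the mitigation, the helper below rewrites a Keystone URL ending in /v3 so that it ends in /v2.0 instead. The function name `to_keystone_v2_url` and the hostname `internal.example` are assumptions made for this example; they stand in for whatever value is configured as auth_url.

```python
from urllib.parse import urlsplit, urlunsplit


def to_keystone_v2_url(auth_url):
    """Return auth_url with a trailing /v3 path replaced by /v2.0."""
    parts = urlsplit(auth_url)
    path = parts.path.rstrip("/")
    if path.endswith("/v3"):
        path = path[: -len("/v3")] + "/v2.0"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


# Hypothetical hostname; substitute the real internal VIP.
fixed = to_keystone_v2_url("https://internal.example:35357/v3")
print(fixed)
```

URLs that do not end in /v3 are returned with their version path unchanged, so running the helper on an already corrected value is harmless.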

I looked at Neutron and Neutronclient on stable/liberty and they both have the *same* problem and roughly the same code path.

Naturally, this doesn't solve whatever is causing the Rabbit connection problems from NMA, *but* it does fix the 500 errors because NMA can successfully fall back to HTTP using neutronclient.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (kilo)

Fix proposed to branch: kilo
Review: https://review.openstack.org/327955

OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstack-ansible (liberty)

Fix proposed to branch: liberty
Review: https://review.openstack.org/327960

Jesse Pretorius (jesse-pretorius) wrote :

@Ian is this issue present in Mitaka onwards too?

Ian Cordasco (icordasc) wrote :

I haven't looked that far into the present yet Jesse. ;) That's on my list of things to do today.

Ian Cordasco (icordasc) wrote :

So a quick look at stable/mitaka's neutron/agent/metadata/agent.py file shows that it *only* uses RPC for the calls that start to fail in this case. In other words, there is no HTTP fallback, which potentially makes this configuration unnecessary in Mitaka and later.

Bjoern (bjoern-t) wrote :

Ian Cordasco (icordasc)
Changed in openstack-ansible:
status: Triaged → Invalid
importance: Medium → Undecided

OpenStack Infra (hudson-openstack) wrote : Change abandoned on openstack-ansible (kilo)

Change abandoned by Ian Cordasco (<email address hidden>) on branch: kilo
Review: https://review.openstack.org/327955
Reason: I've merged this into the other metadata_agent.ini fixes for Keystone auth via Neutronclient v2_0

OpenStack Infra (hudson-openstack) wrote : Change abandoned on openstack-ansible (liberty)

Change abandoned by Ian Cordasco (<email address hidden>) on branch: liberty
Review: https://review.openstack.org/327960
Reason: I've merged this fix into a separate review for simplicity.

OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (liberty)

Reviewed: https://review.openstack.org/328430
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=b271778e16a83aefb0e704a9aca0ef09f405084c
Submitter: Jenkins
Branch: liberty

commit b271778e16a83aefb0e704a9aca0ef09f405084c
Author: Ian Cordasco <email address hidden>
Date: Fri Jun 10 12:35:41 2016 -0500

    Use correct keystone auth parameters

    Neutron Metadata Agent uses the authentication parameters as a fallback
    method in the event that communication with Neutron API over RPC fails.
    In the fallback case, it looks for Keystone v2.0 authentication
    credentials but looks for them under the names:

    - admin_user
    - admin_password
    - admin_tenant_name

    Which can be the service user information. We were previously
    configuring Keystone v3 authentication parameters in this space and this
    causes 400 Bad Request responses from Keystone (since the Agent has null
    values for those config options).

    Further, the metadata agent does not need to use the admin URL to
    authenticate. Instead, it can use the internal URL to retrieve its token
    and then authenticate to Neutron API over HTTP.

    Change-Id: Ib413d3f3f3351bef29b0e68a2cfb96b7f3dff3c3
    Closes-bug: 1591282
    Closes-bug: 1590957
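
For anyone checking an existing deployment, the sketch below reports which of the three option names from the commit message are missing or empty in a metadata agent config. It assumes the options live in the [DEFAULT] section; the function name `missing_v2_options` and every value in the sample text are hypothetical.

```python
import configparser

# Option names taken from the commit message above.
REQUIRED = ("admin_user", "admin_password", "admin_tenant_name")


def missing_v2_options(ini_text):
    """Return the required option names that are absent or empty."""
    # Disable interpolation so a '%' in a password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    # Assumption: the agent reads these options from [DEFAULT].
    defaults = parser.defaults()
    return [name for name in REQUIRED if not defaults.get(name)]


# Hypothetical values for illustration only.
SAMPLE = """
[DEFAULT]
auth_url = https://internal.example:5000/v2.0
admin_user = neutron
admin_password = example-password
"""

missing = missing_v2_options(SAMPLE)
print(missing)
```

An empty list means all three options are set; any name in the list is one the agent would see as a null value in the fallback case.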

OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (kilo)

Reviewed: https://review.openstack.org/328432
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=75deb166c8922f7ceb120d3892ae1ac4fa4702bb
Submitter: Jenkins
Branch: kilo

commit 75deb166c8922f7ceb120d3892ae1ac4fa4702bb
Author: Ian Cordasco <email address hidden>
Date: Fri Jun 10 12:35:41 2016 -0500

    Use correct keystone auth parameters

    Neutron Metadata Agent uses the authentication parameters as a fallback
    method in the event that communication with Neutron API over RPC fails.
    In the fallback case, it looks for Keystone v2.0 authentication
    credentials but looks for them under the names:

    - admin_user
    - admin_password
    - admin_tenant_name

    Which can be the service user information. We were previously
    configuring Keystone v3 authentication parameters in this space and this
    causes 400 Bad Request responses from Keystone (since the Agent has null
    values for those config options).

    Further, the metadata agent does not need to use the admin URL to
    authenticate. Instead, it can use the internal URL to retrieve its token
    and then authenticate to Neutron API over HTTP.

    Change-Id: Ib413d3f3f3351bef29b0e68a2cfb96b7f3dff3c3
    Closes-bug: 1591282
    Closes-bug: 1590957
    (cherry picked from commit b271778e16a83aefb0e704a9aca0ef09f405084c)

Davanum Srinivas (DIMS) (dims-v) wrote : Fix included in openstack/openstack-ansible 12.0.15

This issue was fixed in the openstack/openstack-ansible 12.0.15 release.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/openstack-ansible 11.2.17

This issue was fixed in the openstack/openstack-ansible 11.2.17 release.
