Routed provider networks: placement API handling errors

Bug #1828543 reported by Lajos Katona on 2019-05-10
This bug affects 2 people
Affects: neutron
Importance: Medium
Assigned to: Lajos Katona

Bug Description

Routed provider networks is a feature that uses placement to store information about segments and the subnets in them, so that nova can use this information in scheduling.
On master the placement API calls are failing, first at the get_inventory call:

May 09 14:15:26 multicont neutron-server[31232]: DEBUG oslo_concurrency.lockutils [-] Lock "notifier-a76cce90-7366-495e-9784-9ddef689bc71" released by "neutron.notifiers.batch_notifier.BatchNotifier.queue_event.<locals>.synced_send" :: held 0.112s {{(pid=31252) inner /usr/local/lib/python3.6/dist-packages/oslo_concurrency/lockutils.py:339}}
May 09 14:15:26 multicont neutron-server[31232]: Traceback (most recent call last):
May 09 14:15:26 multicont neutron-server[31232]: File "/opt/stack/neutron-lib/neutron_lib/placement/client.py", line 433, in get_inventory
May 09 14:15:26 multicont neutron-server[31232]: return self._get(url).json()
May 09 14:15:26 multicont neutron-server[31232]: File "/opt/stack/neutron-lib/neutron_lib/placement/client.py", line 178, in _get
May 09 14:15:26 multicont neutron-server[31232]: **kwargs)
May 09 14:15:26 multicont neutron-server[31232]: File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py", line 1037, in get
May 09 14:15:26 multicont neutron-server[31232]: return self.request(url, 'GET', **kwargs)
May 09 14:15:26 multicont neutron-server[31232]: File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py", line 890, in request
May 09 14:15:26 multicont neutron-server[31232]: raise exceptions.from_response(resp, method, url)
May 09 14:15:26 multicont neutron-server[31232]: keystoneauth1.exceptions.http.NotFound: Not Found (HTTP 404) (Request-ID: req-4133f4c6-df6c-467f-9d15-e8532fc6504b)
May 09 14:15:26 multicont neutron-server[31232]: During handling of the above exception, another exception occurred:
...
May 09 14:15:26 multicont neutron-server[31232]: File "/opt/stack/neutron/neutron/services/segments/plugin.py", line 229, in _update_nova_inventory
May 09 14:15:26 multicont neutron-server[31232]: IPV4_RESOURCE_CLASS)
May 09 14:15:26 multicont neutron-server[31232]: File "/opt/stack/neutron-lib/neutron_lib/placement/client.py", line 53, in wrapper
May 09 14:15:26 multicont neutron-server[31232]: return f(self, *a, **k)
May 09 14:15:26 multicont neutron-server[31232]: File "/opt/stack/neutron-lib/neutron_lib/placement/client.py", line 444, in get_inventory
May 09 14:15:26 multicont neutron-server[31232]: if "No resource provider with uuid" in e.details:
May 09 14:15:26 multicont neutron-server[31232]: TypeError: argument of type 'NoneType' is not iterable
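
The final TypeError comes from the membership test on e.details when that attribute is None. A minimal reproduction, assuming a stand-in exception class (FakeNotFound and provider_missing are hypothetical names, not keystoneauth1 or neutron-lib code):

```python
class FakeNotFound(Exception):
    """Hypothetical stand-in for an HTTP 404 exception without parsed details."""

    def __init__(self, message, details=None):
        super().__init__(message)
        self.details = details


def provider_missing(exc):
    # Same kind of membership test as in the traceback above.
    return "No resource provider with uuid" in exc.details


try:
    provider_missing(FakeNotFound("Not Found (HTTP 404)"))
    outcome = "no error"
except TypeError as err:
    # `in` cannot be applied to None, so the check itself blows up.
    outcome = str(err)

print(outcome)
```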

With stable/pike (not just for neutron) the syncing is OK.
I suppose that when the placement client code was moved to neutron-lib and changed to work with placement 1.20, something changed that makes the routed networks placement calls fail.

Some details:
Used reproduction steps: https://docs.openstack.org/neutron/latest/admin/config-routed-networks.html (of course the pike one for stable/pike deployment)
neutron: d0e64c61835801ad8fdc707fc123cfd2a65ffdd9
neutron-lib: bcd898220ff53b3fed46cef8c460269dd6af3492
placement: 57026255615679122e6f305dfa3520c012f57ca7
nova: 56fef7c0e74d7512f062c4046def10401df16565
Ubuntu 18.04.2 LTS based multihost devstack

Lajos Katona (lajos-katona) wrote :

Queens is OK, on Monday I will check with Rocky.

Lajos Katona (lajos-katona) wrote :

Rocky is failing to report to placement with the same exception (TypeError: argument of type 'NoneType' is not iterable)

Lajos Katona (lajos-katona) wrote :

Short summary with my latest findings:

High-level sequence of what happens:
* creating a subnet with a segment triggers a placement inventory create/update
* _update_nova_inventory (https://opendev.org/openstack/neutron/src/branch/master/neutron/services/segments/plugin.py#L216)
tries to fetch the inventory for the RP whose uuid is the same as the segment uuid in the subnet properties.
* placement sends back HTTP 404 if the RP or the inventory is not there;
based on the HTTP 404 content keystoneauth translates that into an exception with details, and the segments plugin creates the RP with inventory.
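
The sequence above can be sketched as a get-or-create flow. This is only an illustration under stated assumptions: FakePlacement, NotFound and sync_segment_inventory are hypothetical stand-ins, not the neutron or neutron-lib implementation, and the resource class name IPV4_ADDRESS is assumed.

```python
class NotFound(Exception):
    """Hypothetical stand-in for the HTTP 404 exception."""


class FakePlacement:
    """Hypothetical in-memory placement: {rp_uuid: {resource_class: inventory}}."""

    def __init__(self):
        self.providers = {}

    def get_inventory(self, rp_uuid, resource_class):
        try:
            return self.providers[rp_uuid][resource_class]
        except KeyError:
            # Either the provider or its inventory is missing.
            raise NotFound(f"no {resource_class} inventory for {rp_uuid}")

    def create_resource_provider(self, rp_uuid):
        self.providers.setdefault(rp_uuid, {})

    def update_inventory(self, rp_uuid, resource_class, inventory):
        self.providers[rp_uuid][resource_class] = inventory


def sync_segment_inventory(client, segment_id, total):
    """Fetch the segment's inventory; on 404 create the provider first."""
    try:
        inventory = dict(client.get_inventory(segment_id, "IPV4_ADDRESS"))
        inventory["total"] = total
    except NotFound:
        client.create_resource_provider(segment_id)
        inventory = {"total": total}
    client.update_inventory(segment_id, "IPV4_ADDRESS", inventory)
    return inventory
```

The flow only works if the 404 is recognised reliably, which is where the exception details discussed in this report matter.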

Queens:
The HTTP GET accepts any content according to its header (Accept: */*), so by default the HTTP 404 sent back by placement contains HTML (Content-Type: text/html; charset=UTF-8),
and I suppose keystoneauth translates that to keystoneauth1.exceptions.NotFound with the details containing the HTTP body.

Rocky:
The HTTP GET accepts any content according to its header (Accept: */*), and by default the HTTP 404 sent back by placement contains JSON (Content-Type: application/json),
and I suppose keystoneauth can't translate that to details of the NotFound exception.

Current master (neutron: 1bc30c915ce8088daca471ca7865e4222ff47815,
                neutron-lib: 65264a936a6c9d0a8c4eae69acaee2c3ee54b5d6,
                placement: 1281806c99ceb80fed78237b39aa34b9097a39e6,
                keystoneauth1==3.14.0):
The HTTP GET contains accept header (accept: application/json), and by default the http404 sent back by placement contains json (Content-Type: application/json),
and I suppose keystoneauth can't translate that to details of the NotFound exception.

Lajos Katona (lajos-katona) wrote :

I think I got to the depths of this issue. From this point I will refer to code and behavior on current master.

keystoneauth1 translates the http errors to exceptions (https://opendev.org/openstack/keystoneauth/src/branch/master/keystoneauth1/exceptions/http.py#L386).
The method from_response expects that, in case of a JSON reply, the HTTP body is a JSON dict that has an error key, and under that a details key, like this (Python code example):
body= {'error': {'message': 'Foooooo', 'details': 'Baaaaar'}}

Placement on the contrary sends something like this:
body={'errors': [{'status': 404, 'title': 'Not Found', 'detail': 'The resource could not be found.\n\n No resource provider with uuid ...... ', 'request_id': ''}]}

So keystoneauth expects a key named error, but placement sends errors. Under error, keystoneauth expects a dict with keys like details and message, but placement sends a list of dicts whose keys are detail and title (among others).
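
To make the mismatch concrete, here is a small sketch of a parser that accepts both body shapes quoted above. The function name extract_error_detail is hypothetical; this is not what keystoneauth1 does, only an illustration of the two layouts.

```python
import json


def extract_error_detail(body_text):
    """Return the detail string from either error body shape, or None."""
    try:
        body = json.loads(body_text)
    except ValueError:
        return None
    if not isinstance(body, dict):
        return None
    # Shape keystoneauth1 expects: {'error': {'message': ..., 'details': ...}}
    error = body.get("error")
    if isinstance(error, dict):
        return error.get("details")
    # Shape placement sends: {'errors': [{'title': ..., 'detail': ...}]}
    errors = body.get("errors")
    if isinstance(errors, list) and errors and isinstance(errors[0], dict):
        return errors[0].get("detail")
    return None


keystone_style = json.dumps({"error": {"message": "Foooooo", "details": "Baaaaar"}})
placement_style = json.dumps(
    {"errors": [{"status": 404, "title": "Not Found",
                 "detail": "No resource provider with uuid ...", "request_id": ""}]}
)
print(extract_error_detail(keystone_style))
print(extract_error_detail(placement_style))
```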

1) The quick and dirty solution could be for get_inventory (and the other placement client methods in neutron/neutron-lib that expect something useful in the exception details) to use the exception's response instead, like this:

except ks_exc.NotFound as e:
    if 'Fooo' in e.response.text:
        do_something_with_it()

2) A more future-proof way would be to discuss the question with the keystone and placement folks to
a) make placement send http errors the way keystone expects them.
b) make keystoneauth handle the perhaps more general http exceptions placement started to use.
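
Option 1 above can be sketched a little more fully as follows. FakeResponse and FakeNotFound are hypothetical stand-ins so the snippet runs without keystoneauth1 installed; the assumption is only that the real exception carries details and response attributes, as used in the traceback and in the snippet above.

```python
class FakeResponse:
    """Hypothetical stand-in for the HTTP response attached to the exception."""

    def __init__(self, text):
        self.text = text


class FakeNotFound(Exception):
    """Hypothetical stand-in for a 404 exception with details and response."""

    def __init__(self, details=None, response=None):
        super().__init__("Not Found (HTTP 404)")
        self.details = details
        self.response = response


def resource_provider_missing(exc):
    """Check details first, then fall back to the raw response body."""
    marker = "No resource provider with uuid"
    details = exc.details or ""  # guards against details being None
    body = exc.response.text if exc.response is not None else ""
    return marker in details or marker in body


exc = FakeNotFound(
    details=None,
    response=FakeResponse('{"errors": [{"detail": "No resource provider with uuid abc"}]}'),
)
print(resource_provider_missing(exc))
```

Matching on the raw body is fragile, since it depends on placement's wording, which is why the second option is described as the more future-proof one.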

Lajos Katona (lajos-katona) wrote :

Just for fun:
https://opendev.org/openstack/placement/src/branch/master/placement/util.py#L94
placement.util.json_error_formatter is the function that formats the HTTP body into the shape mentioned above.

The why is summarized here: http://specs.openstack.org/openstack/api-wg/guidelines/errors.html as referenced in placement code as well.

Lajos Katona (lajos-katona) wrote :

The ultimate solution proposed to keystoneauth:
https://review.opendev.org/662281

Related fix proposed to branch: master
Review: https://review.opendev.org/663978

Related fix proposed to branch: master
Review: https://review.opendev.org/663980

Change abandoned by Lajos Katona (<email address hidden>) on branch: master
Review: https://review.opendev.org/662204
Reason: https://review.opendev.org/662281 fixed the problem in keystoneauth

Change abandoned by Lajos Katona (<email address hidden>) on branch: master
Review: https://review.opendev.org/662205
Reason: https://review.opendev.org/662281 fixed the problem in keystoneauth

Reviewed: https://review.opendev.org/663978
Committed: https://git.openstack.org/cgit/openstack/neutron-lib/commit/?id=28e71cbd74bf10da3332265659adb27ee9614582
Submitter: Zuul
Branch: master

commit 28e71cbd74bf10da3332265659adb27ee9614582
Author: elajkat <email address hidden>
Date: Fri May 31 11:50:46 2019 +0200

    placement client: fix routed prov networks working

    Routed provider networks works with placement microversion 1.1, that
    version returned no body for resource provider creation, but from 1.20
    body is returned and the client expected that as bandwidth feature were
    designed after that.

    Related-Bug: #1828543
    Change-Id: Id6e6d633b00237d8909160e7ed6f5e495399a252

Reviewed: https://review.opendev.org/663979
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=62f55a12b053b753616747445bc9e3aa0e5e72d9
Submitter: Zuul
Branch: master

commit 62f55a12b053b753616747445bc9e3aa0e5e72d9
Author: elajkat <email address hidden>
Date: Thu Jun 6 13:09:54 2019 +0200

    Force segments to use placement 1.1

    By documentation segments plugin was designed to use placement
    microversion 1.1, force to use that.

    Change-Id: Ibb8d6bcce7f0fe1070b9eeb2ad632dfb58a3a015
    Depends-On: https://review.opendev.org/663978
    Related-Bug: #1828543

Reviewed: https://review.opendev.org/663980
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=95023227b77b72382fdbb06bddadf832b1f20f8f
Submitter: Zuul
Branch: master

commit 95023227b77b72382fdbb06bddadf832b1f20f8f
Author: elajkat <email address hidden>
Date: Fri May 31 11:38:47 2019 +0200

    segments: Fix resource provider inventories update

    Related-Bug: #1828543
    Depends-On: https://review.opendev.org/663978
    Change-Id: I4275779bd8d353fbaa80c646515819b0a34edebb

Reviewed: https://review.opendev.org/670105
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=12ab7c4cb149147e3e7886233f4ba99c3484f923
Submitter: Zuul
Branch: master

commit 12ab7c4cb149147e3e7886233f4ba99c3484f923
Author: elajkat <email address hidden>
Date: Wed Jul 10 17:20:22 2019 +0200

    segments: fix rp inventory update

    The patch https://review.opendev.org/663980 made resource provider
    inventory update failing with the assumption that inventory update
    expects a dict with a key of the resource class, like resource provider
    inventories update.
    See the placement API-ref:
    https://developer.openstack.org/api-ref/placement/#update-resource-provider-inventory
    https://developer.openstack.org/api-ref/placement/#update-resource-provider-inventories

    Change-Id: I7de1a947b864eb5ac57ebaca895f827d2e667443
    Closes-Bug: #1836037
    Related-Bug: #1828543
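
The difference between the two calls named in this commit message can be sketched as request bodies. This is a sketch based on the placement API reference linked above; the helper names are hypothetical, and only the total field is shown among the inventory fields.

```python
def single_inventory_body(generation, total):
    """Body for updating ONE inventory; the resource class is in the URL."""
    return {"resource_provider_generation": generation, "total": total}


def all_inventories_body(generation, totals):
    """Body for replacing ALL inventories; keyed by resource class."""
    return {
        "resource_provider_generation": generation,
        "inventories": {rc: {"total": total} for rc, total in totals.items()},
    }


print(single_inventory_body(3, 10))
print(all_inventories_body(3, {"IPV4_ADDRESS": 10}))
```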

Akihiro Motoki (amotoki) wrote :

We received a separate bug report on stable/stein (bug 1862565). It is worth checking whether the fix from master can be backported. I added backport-potential-xx tags.

tags: added: rocky-backport-potential stein-backport-potential train-backport-potential
Akihiro Motoki (amotoki) wrote :

The fixes are part of train.

tags: removed: train-backport-potential
Lajos Katona (lajos-katona) wrote :

I just checked and a fix for keystoneauth1 is necessary:
https://review.opendev.org/662281

For some reason this patch was not marked with the bug. It is possible to work around this from neutron-lib, but that is not a clean backport.

Lajos Katona (lajos-katona) wrote :

Oh, no, now I see the related bug line in the commit message, just not in the pattern I expected, so it is perhaps possible to backport it.

Change abandoned by Lajos Katona (<email address hidden>) on branch: stable/stein
Review: https://review.opendev.org/706820
Reason: The keystoneauth patch is blocked (I just abandoned it). If there's a real need for this fix, we can try to work around this in neutron-lib (that's ugly, but better than nothing).
