Openstack / Nova CLI returning "Unknown Error HTTP 500"

Bug #2003813 reported by Lucas de Ataides Barreto
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Luan Nunes Utimura

Bug Description

Brief Description
-----------------
STX-Openstack sanity runs consistently fail test cases because the openstack CLI returns "Unknown Error HTTP 500". The system CLI also returns this error in some cases.
Affected test cases:
  lock/unlock: intermittent
  live-migrate: constant
  pause/unpause: constant
  suspend/resume: constant

Severity
--------
Major: System/Feature is usable but degraded

Steps to Reproduce
------------------
Issue openstack cli commands on VMs.

Expected Behavior
------------------
Command is executed as expected

Actual Behavior
----------------
Command returns "Unknown Error HTTP 500"

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
Two node system - AIO Duplex

Branch/Pull Time/Commit
-----------------------
/mirror/starlingx/master/debian/monolithic/20230104T070000Z/outputs/helm-charts/stx-openstack-1.0-1.stx.4-debian-stable-latest.tgz

Last Pass
---------
Never passed on Debian

Timestamp/Logs
--------------
[sysadmin@controller-0 ~(keystone_admin)]$ nova --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://keystone.openstack.svc.cluster.local/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne pause 39295d5c-1082-4456-9f14-ec6aa805e39a
[sysadmin@controller-0 ~(keystone_admin)]$ ERROR (ClientException): Unknown Error (HTTP 500)

Test Activity
-------------
Sanity

Workaround
----------
N/A

Changed in starlingx:
assignee: nobody → Thales Elero Cervi (tcervi)
Thales Elero Cervi (tcervi) wrote :

Collected a nova client live-migration call with the debug flag enabled:

DEBUG (session:944) GET call to compute for http://nova.openstack.svc.cluster.local/v2.1/xyz/servers/xyz used request id req-xyz
DEBUG (session:517) REQ: curl -g -i -X POST http://nova.openstack.svc.cluster.local/v2.1/xyz/servers/xyz/action -H "Accept: application/json" -H "Content-Type: application/json" -H "OpenStack-API-Version: compute 2.87" -H "User-Agent: python-novaclient" -H "X-Auth-Token: {SHA256}xyz" -H "X-OpenStack-Nova-API-Version: 2.87" -d '{"os-migrateLive": {"host": null, "block_migration": "auto"}}'
DEBUG (connectionpool:452) http://nova.openstack.svc.cluster.local:80 "POST /v2.1/xyz/servers/xyz/action HTTP/1.1" 500 0
DEBUG (session:548) RESP: [500] Connection: keep-alive Content-Length: 0 Content-Type: text/plain Date: Tue, 24 Jan 2023 20:33:34 GMT
DEBUG (session:580) RESP BODY: Omitted, Content-Type is set to text/plain. Only application/json responses have their bodies logged.
DEBUG (shell:821) Unknown Error (HTTP 500)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/novaclient/shell.py", line 819, in main
    OpenStackComputeShell().main(argv)
  File "/usr/lib/python3/dist-packages/novaclient/shell.py", line 741, in main
    args.func(self.cs, args)
  File "/usr/lib/python3/dist-packages/novaclient/v2/shell.py", line 3680, in do_live_migration
    _find_server(cs, args.server).live_migrate(args.host, args.block_migrate,
  File "/usr/lib/python3/dist-packages/novaclient/api_versions.py", line 393, in substitution
    return methods[-1].func(obj, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/novaclient/v2/servers.py", line 507, in live_migrate
    return self.manager.live_migrate(self, host, block_migration)
  File "/usr/lib/python3/dist-packages/novaclient/api_versions.py", line 393, in substitution
    return methods[-1].func(obj, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/novaclient/v2/servers.py", line 1892, in live_migrate
    return self._live_migrate(server, host,
  File "/usr/lib/python3/dist-packages/novaclient/v2/servers.py", line 1830, in _live_migrate
    return self._action('os-migrateLive', server, body)
  File "/usr/lib/python3/dist-packages/novaclient/v2/servers.py", line 2115, in _action
    resp, body = self._action_return_resp_and_body(action, server,
  File "/usr/lib/python3/dist-packages/novaclient/v2/servers.py", line 2127, in _action_return_resp_and_body
    return self.api.client.post(url, body=body)
  File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 401, in post
    return self.request(url, 'POST', **kwargs)
  File "/usr/lib/python3/dist-packages/novaclient/client.py", line 78, in request
    raise exceptions.from_response(resp, body, url, method)
novaclient.exceptions.ClientException: Unknown Error (HTTP 500)
ERROR (ClientException): Unknown Error (HTTP 500)

tags: added: stx.8.0 stx.distro.openstack
Changed in starlingx:
assignee: Thales Elero Cervi (tcervi) → Luan Nunes Utimura (lutimura)
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → High
Luan Nunes Utimura (lutimura) wrote :

After porting `stx-nova-api-proxy` to Debian, I noticed that this error started to appear in other subcommands that worked normally in the CentOS-based version (e.g. `nova flavor create`).

Upon further investigation, I found that the code was breaking because some request headers were being passed to Python 3's built-in `http` library with `None` as their values, when they should have been passed as strings or not passed at all:

...
'HTTP_OPENSTACK_SYSTEM_SCOPE': None,
'HTTP_X_DOMAIN_ID': None,
'HTTP_X_DOMAIN_NAME': None
...

In Python 2.7, and therefore in the CentOS-based version of `nova-api-proxy`, this wasn't a problem because the built-in `httplib` library (the equivalent of Python 3's `http`) converted header values to strings, even when they were `None`:

https://github.com/enthought/Python-2.7.3/blob/master/Lib/httplib.py#L938

However, in Python 3, and therefore in the Debian-based version of `nova-api-proxy`, an exception is thrown instead:

https://github.com/python/cpython/blob/044fb4fb53594b37de8188cb36f3ba33ce2d617e/Lib/http/client.py#L1262

Therefore, the `nova-api-proxy` source code needs some adjustments for the Python 2.7 to Python 3 migration, which I will make in this LP.
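
A minimal sketch of the failure mode described above, using only the Python 3 standard library. It is not the actual `nova-api-proxy` change: the helper name `drop_none_headers`, the header set, and the host `nova.example.invalid` are assumptions made up for illustration. `putrequest()` and `putheader()` only buffer data, so no network connection is attempted.

```python
import http.client


def drop_none_headers(headers):
    """Return a copy of `headers` without entries whose value is None.

    Hypothetical helper; the real fix may convert or filter headers differently.
    """
    return {name: value for name, value in headers.items() if value is not None}


# Illustrative headers: three unset values, as in the dump above, plus one real one.
headers = {
    "OpenStack-System-Scope": None,
    "X-Domain-Id": None,
    "X-Domain-Name": None,
    "Accept": "application/json",
}

# Placeholder host; nothing is sent because endheaders() is never called.
conn = http.client.HTTPConnection("nova.example.invalid", 80)
conn.putrequest("POST", "/v2.1/servers/xyz/action")

# Python 3's http.client does not stringify None, so this call raises.
try:
    conn.putheader("X-Domain-Id", headers["X-Domain-Id"])
    none_rejected = False
except TypeError:
    none_rejected = True

# Filtering the None values first lets the remaining headers through.
clean = drop_none_headers(headers)
for name, value in clean.items():
    conn.putheader(name, value)

conn.close()
```

Dropping the unset headers, rather than sending the literal string "None" as Python 2's `httplib` effectively did, is one possible choice; which behavior the upstream service expects is not stated in this report.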

----

Unfortunately, the original exception thrown during pause/unpause/suspend/resume still occurs. The debug logs in `nova-api-proxy` show the following stack trace:

```
2023-01-31 11:27:50,871.871 6 DEBUG nova_api_proxy.common.service [-] Traceback (most recent call last):
  File "/var/lib/openstack/lib/python3.9/site-packages/eventlet/wsgi.py", line 573, in handle_one_response
    result = self.application(self.environ, start_response)
  File "/var/lib/openstack/lib/python3.9/site-packages/webob/dec.py", line 143, in __call__
    return resp(environ, start_response)
  File "/var/lib/openstack/lib/python3.9/site-packages/routes/middleware.py", line 153, in __call__
    response = self.app(environ, start_response)
  File "/var/lib/openstack/lib/python3.9/site-packages/webob/dec.py", line 143, in __call__
    return resp(environ, start_response)
  File "/var/lib/openstack/lib/python3.9/site-packages/webob/dec.py", line 129, in __call__
    resp = self.call_func(req, *args, **kw)
  File "/var/lib/openstack/lib/python3.9/site-packages/webob/dec.py", line 193, in call_func
    return self.func(req, *args, **kwargs)
  File "/var/lib/openstack/lib/python3.9/site-packages/keystonemiddleware/auth_token/__init__.py", line 341, in __call__
    response = req.get_response(self._app)
  File "/var/lib/openstack/lib/python3.9/site-packages/webob/request.py", line 1313, in send
    status, headers, app_iter = self.call_application(
  File "/var/lib/openstack/lib/python3.9/site-packages/webob/request.py", line 1278, in call_application
    app_iter = application(self.environ, start_response)
  File "/var/lib/openstack/lib/python3.9/site-packages/webob/dec.py", line 143, in __call__
    return resp(environ, start_response)
  File "/var/lib/openstack/lib/python3.9/site-packages/routes/middleware.py", line 153, in __call__
    response = self.app(environ, start_response)
  File "/var/lib/openstack/lib/python3.9/site-packa...

```

Luan Nunes Utimura (lutimura) wrote :

Upon further investigation, I found that there are still some incompatibilities with Python 3 in the NFV code.
I'll be opening a review regarding these adjustments.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nfv (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/872411

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nfv (master)

Reviewed: https://review.opendev.org/c/starlingx/nfv/+/872411
Committed: https://opendev.org/starlingx/nfv/commit/1e475dca0c3884199b7345b719dea9a05601bc36
Submitter: "Zuul (22348)"
Branch: master

commit 1e475dca0c3884199b7345b719dea9a05601bc36
Author: Luan Nunes Utimura <email address hidden>
Date: Wed Feb 1 10:10:03 2023 -0300

    Debian: Fix nova actions

    Since the platform migration to Debian, it was observed that the
    following Nova actions stopped working:
      - pause;
      - unpause;
      - suspend;
      - resume;
      - live-migration.

    The reason behind that is that some packages related to Nova, which have
    already been migrated to Debian, still have some incompatibilities with
    Python 3. Consequently, whenever these Nova actions were executed, some
    exceptions occurred on the nova-api-proxy and NFV side, preventing them
    from working.

    Therefore, this change aims to improve this compatibility.

    Most of the changes were necessary due to the fact that in Python 3
    there is more of a distinction between `bytes` and `str`, whereas in
    Python 2 `bytes` is just an alias for `str`.

    Test Plan (on AIO-DX):
    PASS - Successfully perform a VM pause, unpause, suspend, resume.
    PASS - Successfully perform a VM live-migration.

    Closes-Bug: 2003813

    Signed-off-by: Luan Nunes Utimura <email address hidden>
    Change-Id: I918fe6e3deaa68630c797449649012e9fbf16fe4
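
The `bytes` versus `str` distinction mentioned in the commit message can be illustrated with a short sketch. It is not taken from the merged change: the helper `to_text`, the sample request body, and the variable names are assumptions chosen for illustration.

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as str, decoding it first when it is bytes.

    Hypothetical helper, not part of the nfv or nova-api-proxy code.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Illustrative data: a raw body as it might arrive from a socket, and an action name.
raw_body = b'{"pause": null}'
action = "pause"

# In Python 2 this membership test worked because bytes was an alias for str.
# In Python 3, mixing str and bytes raises TypeError.
try:
    action in raw_body
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

# Decoding explicitly at the boundary makes the comparison well defined.
found = action in to_text(raw_body)
```

Decoding once at the point where data enters the program, and working with `str` afterwards, is a common way to avoid this class of error; whether the actual fix follows that pattern is not shown in this report.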

Changed in starlingx:
status: In Progress → Fix Released