nova service disable/enable returns 500 on cell environment

Bug #1361180 reported by Rajesh Tailor on 2014-08-25
This bug affects 6 people
| Affects | Importance | Assigned to |
| OpenStack Compute (nova) | High | RedBaron |
| Icehouse | High | Rajesh Tailor |
| Juno | High | Rajesh Tailor |

Bug Description

`nova service-disable` and `nova service-enable` return HTTP 500 in a cells environment, although the actual enable/disable appears to be processed correctly.

The nova-api service also logs the following error:
ValueError: invalid literal for int() with base 10: 'region!child@5'

How to reproduce:

$ nova --os-username admin service-list

Output:
+----------------+------------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+----------------+------------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| region!child@1 | nova-conductor | region!child@ubuntu | internal | enabled | up | 2014-08-18T06:17:36.000000 | - |
| region!child@3 | nova-cells | region!child@ubuntu | internal | enabled | up | 2014-08-18T06:17:29.000000 | - |
| region!child@4 | nova-scheduler | region!child@ubuntu | internal | enabled | up | 2014-08-18T06:17:30.000000 | - |
| region!child@5 | nova-compute | region!child@ubuntu | nova | enabled | up | 2014-08-18T06:17:31.000000 | - |
| region@1 | nova-cells | region@ubuntu | internal | enabled | up | 2014-08-18T06:17:29.000000 | - |
| region@2 | nova-cert | region@ubuntu | internal | enabled | down | 2014-08-18T06:08:28.000000 | - |
| region@3 | nova-consoleauth | region@ubuntu | internal | enabled | up | 2014-08-18T06:17:37.000000 | - |
+----------------+------------------+---------------------+----------+---------+-------+----------------------------+-----------------+

$ nova --os-username admin service-disable 'region!child@ubuntu' nova-compute

The above command gives the following error:
ERROR (ClientException): Unknown Error (HTTP 500)

$ nova --os-username admin service-list

Output:
+----------------+------------------+---------------------+----------+----------+-------+----------------------------+-----------------+
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+----------------+------------------+---------------------+----------+----------+-------+----------------------------+-----------------+
| region!child@1 | nova-conductor | region!child@ubuntu | internal | enabled | up | 2014-08-18T06:19:06.000000 | - |
| region!child@3 | nova-cells | region!child@ubuntu | internal | enabled | up | 2014-08-18T06:19:09.000000 | - |
| region!child@4 | nova-scheduler | region!child@ubuntu | internal | enabled | up | 2014-08-18T06:19:10.000000 | - |
| region!child@5 | nova-compute | region!child@ubuntu | nova | disabled | up | 2014-08-18T06:19:11.000000 | - |
| region@1 | nova-cells | region@ubuntu | internal | enabled | up | 2014-08-18T06:19:09.000000 | - |
| region@2 | nova-cert | region@ubuntu | internal | enabled | down | 2014-08-18T06:08:28.000000 | - |
| region@3 | nova-consoleauth | region@ubuntu | internal | enabled | up | 2014-08-18T06:19:07.000000 | - |
+----------------+------------------+---------------------+----------+----------+-------+----------------------------+-----------------+

$ nova --os-username admin service-enable 'region!child@ubuntu' nova-compute

The above command gives the following error:
ERROR (ClientException): Unknown Error (HTTP 500)

$ nova --os-username admin service-list
+----------------+------------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+----------------+------------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| region!child@1 | nova-conductor | region!child@ubuntu | internal | enabled | up | 2014-08-18T06:52:37.000000 | - |
| region!child@3 | nova-cells | region!child@ubuntu | internal | enabled | up | 2014-08-18T06:52:40.000000 | - |
| region!child@4 | nova-scheduler | region!child@ubuntu | internal | enabled | up | 2014-08-18T06:52:41.000000 | - |
| region!child@5 | nova-compute | region!child@ubuntu | nova | enabled | up | 2014-08-18T06:52:42.000000 | - |
| region@1 | nova-cells | region@ubuntu | internal | enabled | up | 2014-08-18T06:52:40.000000 | - |
| region@2 | nova-cert | region@ubuntu | internal | enabled | down | 2014-08-18T06:08:28.000000 | - |
| region@3 | nova-consoleauth | region@ubuntu | internal | enabled | up | 2014-08-18T06:52:39.000000 | - |
+----------------+------------------+---------------------+----------+---------+-------+----------------------------+-----------------+

After each attempt to disable or enable the child cell service, nova-api.log shows the following error:

2014-08-18 15:18:23.848 DEBUG nova.api.openstack.wsgi [req-c8e2b844-6aa4-4819-b4f4-38bfe30240f0 admin demo] Action: 'update', body: {"binary": "nova-compute", "host": "region!child@ubuntu"} from (pid=11556) _process_stack /opt/stack/nova/nova/api/openstack/wsgi.py:940
2014-08-18 15:18:23.848 DEBUG nova.api.openstack.wsgi [req-c8e2b844-6aa4-4819-b4f4-38bfe30240f0 admin demo] Calling method '<bound method ServiceController.update of <nova.api.openstack.compute.contrib.services.ServiceController object at 0x7f16bd608510>>' (Content-type='application/json', Accept='application/json') from (pid=11556) _process_stack /opt/stack/nova/nova/api/openstack/wsgi.py:945
2014-08-18 15:18:23.930 ERROR object [req-c8e2b844-6aa4-4819-b4f4-38bfe30240f0 admin demo] Error setting Service.id
2014-08-18 15:18:23.930 TRACE object Traceback (most recent call last):
2014-08-18 15:18:23.930 TRACE object File "/opt/stack/nova/nova/objects/base.py", line 70, in setter
2014-08-18 15:18:23.930 TRACE object field.coerce(self, name, value))
2014-08-18 15:18:23.930 TRACE object File "/opt/stack/nova/nova/objects/fields.py", line 166, in coerce
2014-08-18 15:18:23.930 TRACE object return self._type.coerce(obj, attr, value)
2014-08-18 15:18:23.930 TRACE object File "/opt/stack/nova/nova/objects/fields.py", line 231, in coerce
2014-08-18 15:18:23.930 TRACE object return int(value)
2014-08-18 15:18:23.930 TRACE object ValueError: invalid literal for int() with base 10: 'region!child@5'
2014-08-18 15:18:23.930 TRACE object
2014-08-18 15:18:23.934 ERROR nova.api.openstack [req-c8e2b844-6aa4-4819-b4f4-38bfe30240f0 admin demo] Caught error: invalid literal for int() with base 10: 'region!child@5'
2014-08-18 15:18:23.934 TRACE nova.api.openstack Traceback (most recent call last):
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/opt/stack/nova/nova/api/openstack/__init__.py", line 125, in __call__
2014-08-18 15:18:23.934 TRACE nova.api.openstack return req.get_response(self.application)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/usr/lib/python2.7/dist-packages/webob/request.py", line 1320, in send
2014-08-18 15:18:23.934 TRACE nova.api.openstack application, catch_exc_info=False)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/usr/lib/python2.7/dist-packages/webob/request.py", line 1284, in call_application
2014-08-18 15:18:23.934 TRACE nova.api.openstack app_iter = application(self.environ, start_response)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2014-08-18 15:18:23.934 TRACE nova.api.openstack return resp(environ, start_response)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/opt/stack/python-keystoneclient/keystoneclient/middleware/auth_token.py", line 663, in __call__
2014-08-18 15:18:23.934 TRACE nova.api.openstack return self.app(env, start_response)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2014-08-18 15:18:23.934 TRACE nova.api.openstack return resp(environ, start_response)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2014-08-18 15:18:23.934 TRACE nova.api.openstack return resp(environ, start_response)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/routes/middleware.py", line 131, in __call__
2014-08-18 15:18:23.934 TRACE nova.api.openstack response = self.app(environ, start_response)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2014-08-18 15:18:23.934 TRACE nova.api.openstack return resp(environ, start_response)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 130, in __call__
2014-08-18 15:18:23.934 TRACE nova.api.openstack resp = self.call_func(req, *args, **self.kwargs)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 195, in call_func
2014-08-18 15:18:23.934 TRACE nova.api.openstack return self.func(req, *args, **kwargs)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/opt/stack/nova/nova/api/openstack/wsgi.py", line 917, in __call__
2014-08-18 15:18:23.934 TRACE nova.api.openstack content_type, body, accept)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/opt/stack/nova/nova/api/openstack/wsgi.py", line 983, in _process_stack
2014-08-18 15:18:23.934 TRACE nova.api.openstack action_result = self.dispatch(meth, request, action_args)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/opt/stack/nova/nova/api/openstack/wsgi.py", line 1070, in dispatch
2014-08-18 15:18:23.934 TRACE nova.api.openstack return method(req=request, **action_args)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/opt/stack/nova/nova/api/openstack/compute/contrib/services.py", line 206, in update
2014-08-18 15:18:23.934 TRACE nova.api.openstack self.host_api.service_update(context, host, binary, status_detail)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/opt/stack/nova/nova/compute/cells_api.py", line 579, in service_update
2014-08-18 15:18:23.934 TRACE nova.api.openstack db_service)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/opt/stack/nova/nova/objects/service.py", line 67, in _from_db_object
2014-08-18 15:18:23.934 TRACE nova.api.openstack service[key] = db_service[key]
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/opt/stack/nova/nova/objects/base.py", line 398, in __setitem__
2014-08-18 15:18:23.934 TRACE nova.api.openstack setattr(self, name, value)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/opt/stack/nova/nova/objects/base.py", line 70, in setter
2014-08-18 15:18:23.934 TRACE nova.api.openstack field.coerce(self, name, value))
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/opt/stack/nova/nova/objects/fields.py", line 166, in coerce
2014-08-18 15:18:23.934 TRACE nova.api.openstack return self._type.coerce(obj, attr, value)
2014-08-18 15:18:23.934 TRACE nova.api.openstack File "/opt/stack/nova/nova/objects/fields.py", line 231, in coerce
2014-08-18 15:18:23.934 TRACE nova.api.openstack return int(value)
2014-08-18 15:18:23.934 TRACE nova.api.openstack ValueError: invalid literal for int() with base 10: 'region!child@5'
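
The last frames of the traceback come down to calling int() on a service ID that still carries the cell path. A minimal sketch of that failure, assuming a hypothetical `IntegerType` stand-in rather than nova's actual field classes:

```python
class IntegerType:
    """Hypothetical stand-in for an integer field type that coerces with int()."""

    @staticmethod
    def coerce(value):
        # Mirrors the `return int(value)` frame at the bottom of the traceback.
        return int(value)


def try_coerce(value):
    """Return the coerced integer, or the ValueError message on failure."""
    try:
        return IntegerType.coerce(value)
    except ValueError as exc:
        return str(exc)


# A plain numeric ID coerces fine; a cell-prefixed ID does not.
print(try_coerce("5"))
print(try_coerce("region!child@5"))
```

The second call fails in the same way as the log above, which is why the request ends in HTTP 500 even though the service state has already been changed in the child cell.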

Changed in nova:
assignee: nobody → Rajesh Tailor (rajesh-tailor)

Fix proposed to branch: master
Review: https://review.openstack.org/118672

Changed in nova:
status: New → In Progress
Sean Dague (sdague) on 2014-09-19
tags: added: cells
Changed in nova:
importance: Undecided → High
RedBaron (dheeraj-gupta4) wrote :

There is a common problem with all service-related methods in HostAPI.
In `service_get_all` (which corresponds to service-list), a ServiceProxy is used because the objects returned by cells cannot be cast directly to Service objects: the IDs returned by cells contain the full cell path, which must be stripped out before the Service objects are created.

The use of this proxy needs to be implemented in other functions as well:
service_get_by_compute_host - used by evacuate
service_update - used by service-enable and service-disable
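
The proxy approach described above could be sketched roughly as follows. This is an illustration only, not nova's code: `Service`, `split_cell_and_item`, `service_from_cell`, and the `@` separator constant are hypothetical names chosen to match the IDs shown in the service-list output.

```python
CELL_ITEM_SEP = "@"  # assumed separator between cell path and item, as in 'region!child@5'


def split_cell_and_item(cell_and_item):
    """Split 'region!child@5' into ('region!child', '5')."""
    cell_path, item = cell_and_item.rsplit(CELL_ITEM_SEP, 1)
    return cell_path, item


class Service:
    """Hypothetical object whose id must be an integer."""

    def __init__(self, id, host, binary, disabled=False):
        self.id = int(id)  # raises ValueError for cell-prefixed IDs
        self.host = host
        self.binary = binary
        self.disabled = disabled


class ServiceProxy:
    """Wraps a Service and re-attaches the cell path to its id."""

    def __init__(self, obj, cell_path):
        self._obj = obj
        self._cell_path = cell_path

    @property
    def id(self):
        return f"{self._cell_path}{CELL_ITEM_SEP}{self._obj.id}"

    def __getattr__(self, name):
        # Only called for attributes not found on the proxy itself.
        return getattr(self._obj, name)


def service_from_cell(raw):
    """Strip the cell path from the id, build a Service, and wrap it in a proxy."""
    cell_path, numeric_id = split_cell_and_item(raw["id"])
    service = Service(numeric_id, raw["host"], raw["binary"], raw["disabled"])
    return ServiceProxy(service, cell_path)


proxy = service_from_cell({
    "id": "region!child@5",
    "host": "region!child@ubuntu",
    "binary": "nova-compute",
    "disabled": True,
})
print(proxy.id, proxy.binary, proxy.disabled)
```

The wrapped object keeps a numeric ID, so integer coercion succeeds, while callers of the proxy still see the full cell-qualified ID.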

Rajesh, if you are no longer working on this, I have a patch ready that I can push.

Fix proposed to branch: master
Review: https://review.openstack.org/126498

Changed in nova:
assignee: Rajesh Tailor (rajesh-tailor) → RedBaron (dheeraj-gupta4)

Change abandoned by Rajesh Tailor (<email address hidden>) on branch: master
Review: https://review.openstack.org/118672
Reason: A similar patch has been submitted to address this issue.
Please refer to: https://review.openstack.org/#/c/126498

Reviewed: https://review.openstack.org/126498
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=fcd24c6774af0add2bf20c604232e4db9747da7d
Submitter: Jenkins
Branch: master

commit fcd24c6774af0add2bf20c604232e4db9747da7d
Author: Dheeraj Gupta <email address hidden>
Date: Tue Oct 7 09:18:27 2014 +0000

    Extends use of ServiceProxy to more methods in HostAPI in cells

    Cells prepend full cell path to the service ID before returning any
    service related info. This means service ID is non numeric and can't
    be cast into Service objects. In cells, service_get_all method
    in HostAPI (which is used to display list of services) strips out
    the cell path from received IDs, creates Service objects using
    remaining numerical ID and uses a ServiceProxy to associate cell paths
    Service objects.
    However, other service related methods do not do so. They include:
     - service_update (Used for enabling/disabling services)
     - service_get_all_by_host (Used for evacuation)
    These functions try to cast received service info (with alphanumeric
    service IDs) into Service objects and fail with a ValueError.
    This leads to API cell throwing Error 500 for service-enable,
    service-disable and evacuate.
    This patch extends the ServiceProxy usage to both these methods. It
    also changes the corresponding HostAPI tests.

    Change-Id: Iff2707602d5fabfbe8438150b5ad74b3c31bb011
    Closes-Bug: 1361180

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx) on 2014-12-18
Changed in nova:
milestone: none → kilo-1
status: Fix Committed → Fix Released

Reviewed: https://review.openstack.org/138374
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2da234190346d76725017b4ee7570d0a7e89a91c
Submitter: Jenkins
Branch: stable/juno

commit 2da234190346d76725017b4ee7570d0a7e89a91c
Author: Dheeraj Gupta <email address hidden>
Date: Tue Oct 7 09:18:27 2014 +0000

    Extends use of ServiceProxy to more methods in HostAPI in cells

    Cells prepend full cell path to the service ID before returning any
    service related info. This means service ID is non numeric and can't
    be cast into Service objects. In cells, service_get_all method
    in HostAPI (which is used to display list of services) strips out
    the cell path from received IDs, creates Service objects using
    remaining numerical ID and uses a ServiceProxy to associate cell paths
    Service objects.
    However, other service related methods do not do so. They include:
     - service_update (Used for enabling/disabling services)
     - service_get_all_by_host (Used for evacuation)
    These functions try to cast received service info (with alphanumeric
    service IDs) into Service objects and fail with a ValueError.
    This leads to API cell throwing Error 500 for service-enable,
    service-disable and evacuate.
    This patch extends the ServiceProxy usage to both these methods. It
    also changes the corresponding HostAPI tests.

    Note: The required unit-tests are manually added to the below path,
    as new path for unit-tests is not present in stable/juno release.
    nova/tests/compute/test_host_api.py

    Conflicts:
            nova/tests/unit/compute/test_host_api.py

    Change-Id: Iff2707602d5fabfbe8438150b5ad74b3c31bb011
    Closes-Bug: 1361180
    (cherry picked from commit fcd24c6774af0add2bf20c604232e4db9747da7d)

tags: added: in-stable-juno
Alan Pevec (apevec) on 2015-03-04
tags: removed: in-stable-juno

Reviewed: https://review.openstack.org/138308
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c5411d22f0d1da0cb15f5d7c8511a4caec53b265
Submitter: Jenkins
Branch: stable/icehouse

commit c5411d22f0d1da0cb15f5d7c8511a4caec53b265
Author: Dheeraj Gupta <email address hidden>
Date: Tue Oct 7 09:18:27 2014 +0000

    Extends use of ServiceProxy to more methods in HostAPI in cells

    Cells prepend full cell path to the service ID before returning any
    service related info. This means service ID is non numeric and can't
    be cast into Service objects. In cells, service_get_all method
    in HostAPI (which is used to display list of services) strips out
    the cell path from received IDs, creates Service objects using
    remaining numerical ID and uses a ServiceProxy to associate cell paths
    Service objects.
    However, other service related methods do not do so. They include:
     - service_update (Used for enabling/disabling services)
     - service_get_all_by_host (Used for evacuation)
    These functions try to cast received service info (with alphanumeric
    service IDs) into Service objects and fail with a ValueError.
    This leads to API cell throwing Error 500 for service-enable,
    service-disable and evacuate.
    This patch extends the ServiceProxy usage to both these methods. It
    also changes the corresponding HostAPI tests.

    Note: The required unit-tests are manually added to the below path,
    as new path for unit-tests is not present in stable/icehouse release.
    nova/tests/compute/test_host_api.py

    Conflicts:
            nova/compute/cells_api.py
            nova/tests/unit/compute/test_host_api.py

    Closes-Bug: 1361180
    Change-Id: Iff2707602d5fabfbe8438150b5ad74b3c31bb011
    (cherry picked from commit fcd24c6774af0add2bf20c604232e4db9747da7d)

Thierry Carrez (ttx) on 2015-04-30
Changed in nova:
milestone: kilo-1 → 2015.1.0