nova service-list fails when the cells service in one or more of several child cells is down

Bug #1539056 reported by Jinquan Ni
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Won't Fix
Importance: Undecided
Assigned to: Jinquan Ni

Bug Description

1. Version
Kilo 2015.1.0

2. Relevant log

2016-01-28 20:16:27.232 26909 ERROR nova.cells.messaging [req-8daf498b-d2ae-42e6-9089-40046ec550d3 f04e318acd7e4e5093c91e6dc74a28c3 53adc6d6825b43378d6ab89fc38051da - - -] Error waiting for responses from neighbor cells
2016-01-28 20:16:27.232 26909 TRACE nova.cells.messaging Traceback (most recent call last):
2016-01-28 20:16:27.232 26909 TRACE nova.cells.messaging File "/usr/lib/python2.7/site-packages/nova/cells/messaging.py", line 549, in process
2016-01-28 20:16:27.232 26909 TRACE nova.cells.messaging num_responses=len(next_hops))
2016-01-28 20:16:27.232 26909 TRACE nova.cells.messaging File "/usr/lib/python2.7/site-packages/nova/cells/messaging.py", line 235, in _wait_for_json_responses
2016-01-28 20:16:27.232 26909 TRACE nova.cells.messaging raise exception.CellTimeout()
2016-01-28 20:16:27.232 26909 TRACE nova.cells.messaging CellTimeout: Timeout waiting for response from cell
2016-01-28 20:16:27.232 26909 TRACE nova.cells.messaging
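
The traceback suggests that the caller waits for one response per neighbor cell (num_responses=len(next_hops)) and raises CellTimeout when they do not all arrive in time. The following is a minimal sketch of that failure mode, not nova's actual code: the queue-based transport, the function wait_for_responses, the local CellTimeout class, and the 0.2 second timeout are all assumptions made for illustration, and it is written for Python 3 rather than the Python 2.7 shown in the log.

```python
import queue
import threading
import time


class CellTimeout(Exception):
    """Hypothetical stand-in for the exception named in the traceback."""


def wait_for_responses(response_queue, num_responses, timeout):
    """Collect exactly num_responses items, or raise CellTimeout."""
    deadline = time.monotonic() + timeout
    responses = []
    while len(responses) < num_responses:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise CellTimeout("Timeout waiting for response from cell")
        try:
            responses.append(response_queue.get(timeout=remaining))
        except queue.Empty:
            raise CellTimeout("Timeout waiting for response from cell") from None
    return responses


def demo():
    """Two child cells are expected, but only child-cell-02 answers."""
    q = queue.Queue()
    reply = {"cell": "child-cell-02", "services": ["nova-compute"]}
    threading.Thread(target=q.put, args=(reply,)).start()
    try:
        # child-cell-01 never replies, so the whole request fails.
        wait_for_responses(q, num_responses=2, timeout=0.2)
    except CellTimeout as exc:
        return "ERROR: %s" % exc
    return "ok"
```

In this sketch one silent cell is enough to turn the whole call into an error, even though the reply from child-cell-02 had already been received.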

3. Steps to reproduce:

3.1 Environment

I have one API cell node, api-cell,
two child cell nodes, child-cell-01 and child-cell-02,
and each child cell has one compute node.

3.2 Execute "systemctl stop openstack-nova-cells.service" on child-cell-01

openstack-nova-cells.service - OpenStack Nova Cells Server
   Loaded: loaded (/usr/lib/systemd/system/openstack-nova-cells.service; enabled)
   Active: inactive (dead) since Thu 2016-01-28 19:54:50 CST; 24min ago
  Process: 2566 ExecStart=/usr/bin/nova-cells (code=exited, status=0/SUCCESS)
 Main PID: 2566 (code=exited, status=0/SUCCESS)

3.3 Execute "nova service-list" on api-cell
Expected result: all services and their status are listed (it is acceptable for the list to omit the services on child-cell-01 and its compute node)
Actual result: ERROR; see the log in section 2

4. Conclusion
It is unreasonable for nova service-list to fail when the cells service is down in only one or a few of many child cells.

Tags: cells
Jinquan Ni (ni-jinquan)
Changed in nova:
assignee: nobody → jinquanni(ZTE) (ni-jinquan)
Jinquan Ni (ni-jinquan)
summary:
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/275583

Andrew Laski (alaski) wrote :

This is a part of the cells architecture. Because the service information is contained within a child cell and not replicated to the api cell there is a requirement that child cells be available in order to respond to this query.

The challenge with fixing this is that there is no way to indicate in the API if a partial response is being provided. So while I agree that it would be preferable to return a response rather than fail under these conditions it is not a good idea to return a partial list of service information. For now you'll need to ensure that all cells services are running properly in order to keep the API fully functional.
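
To make the trade-off described above concrete, here is a hedged sketch of what a response with an explicit partial-result indicator could look like. It is not part of nova or its API: the function gather_services, the result keys "unavailable_cells" and "partial", the host name compute-02, and the timeout value are all hypothetical, and the child cells are simulated with plain Python callables.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait


def gather_services(cell_queries, timeout):
    """Query every cell in parallel and report which cells did not answer.

    cell_queries maps a cell name to a zero-argument callable that
    returns a list of service dicts (a hypothetical interface).
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(cell_queries)))
    futures = {pool.submit(fn): name for name, fn in cell_queries.items()}
    done, _not_done = wait(futures, timeout=timeout)

    services, unavailable = [], []
    for future, name in futures.items():
        if future in done and future.exception() is None:
            services.extend(future.result())
        else:
            # Timed out or raised: record the cell instead of failing.
            unavailable.append(name)
    pool.shutdown(wait=False)
    return {
        "services": services,
        "unavailable_cells": sorted(unavailable),
        "partial": bool(unavailable),
    }


def demo():
    release = threading.Event()

    def query_child_cell_01():
        # Simulates a cell whose nova-cells service is stopped.
        release.wait(timeout=5)
        return []

    def query_child_cell_02():
        return [{"binary": "nova-compute", "host": "compute-02", "state": "up"}]

    result = gather_services(
        {"child-cell-01": query_child_cell_01,
         "child-cell-02": query_child_cell_02},
        timeout=0.2,
    )
    release.set()  # let the simulated hung query finish
    return result
```

A client receiving such a result could tell that the list is incomplete and which cells are missing. The existing API has no field of this kind, which is the obstacle the comment above points out.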

Changed in nova:
status: New → Confirmed
importance: Undecided → Wishlist
Sarafraj Singh (sarafraj-singh) wrote :

Jinquan,
Are you working on the fix? If you are, please change the status to In Progress; otherwise, change Assigned to → nobody.

Matt Riedemann (mriedem) wrote :

Cells v1 is frozen so that we can focus on cells v2. This is a latent bug that is not trivial to fix, so I am marking it as Won't Fix.

See item #2 here:

http://docs.openstack.org/developer/nova/cells.html#status

Changed in nova:
status: Confirmed → Won't Fix
importance: Wishlist → Undecided
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Jinquan Ni (<email address hidden>) on branch: master
Review: https://review.openstack.org/275583
Reason: won't FIX
