CI jobs fail due to instances in ERROR state

Bug #1850291 reported by Slawek Kaplonski
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

In various jobs, some tests are failing because an instance is in ERROR state.
After some checking, it seems to me that the issue is in the scheduler, where I see errors like:

Oct 27 12:29:31.361318 ubuntu-bionic-ovh-bhs1-0012520618 nova-scheduler[19272]: WARNING nova.context [None req-9e056bb6-787f-49fe-8896-41285d7418b0 tempest-ServersTestManualDisk-1766481780 tempest-ServersTestManualDisk-1766481780] Timed out waiting for response from cell 43118bd8-e32a-4aa4-b93a-37969e41dba6

or

Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: WARNING nova.context [None req-37a8fab9-7d64-4ef4-9464-f70e5ed35d53 tempest-ServersTestJSON-1383648109 tempest-ServersTestJSON-1383648109] Timed out waiting for response from cell: CellTimeout: Timeout waiting for response from cell
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context Traceback (most recent call last):
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context File "/opt/stack/new/nova/nova/context.py", line 443, in scatter_gather_cells
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context cell_uuid, result = queue.get()
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context File "/usr/local/lib/python2.7/dist-packages/eventlet/queue.py", line 322, in get
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context return waiter.wait()
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context File "/usr/local/lib/python2.7/dist-packages/eventlet/queue.py", line 141, in wait
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context return get_hub().switch()
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 298, in switch
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context return self.greenlet.switch()
Oct 16 21:02:27.981751 ubuntu-bionic-fortnebula-regionone-0012347429 nova-scheduler[20380]: ERROR nova.context CellTimeout: Timeout waiting for response from cell
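
For readers unfamiliar with the pattern in the traceback, here is a minimal sketch of a scatter-gather with a timeout: one worker per cell puts its result on a queue, and the caller stops waiting once a deadline passes. This is not nova's implementation (the traceback shows nova using eventlet); it uses only the standard library, and the names `scatter_gather` and `CellTimeout` below are illustrative assumptions.

```python
import queue
import threading
import time


class CellTimeout(Exception):
    """Illustrative marker for a cell that did not answer in time."""


def scatter_gather(cells, fn, timeout):
    """Run fn(cell) in one thread per cell and gather results from a queue.

    Any cell that has not answered when the deadline passes is mapped to a
    CellTimeout instance instead of a result. A worker whose fn raises never
    reports, so in this simplified sketch it also ends up as a timeout.
    """
    results_queue = queue.Queue()

    def worker(cell):
        results_queue.put((cell, fn(cell)))

    for cell in cells:
        threading.Thread(target=worker, args=(cell,), daemon=True).start()

    results = {}
    deadline = time.monotonic() + timeout
    while len(results) < len(cells):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            cell, result = results_queue.get(timeout=remaining)
        except queue.Empty:
            break
        results[cell] = result

    # Mark every cell that never reported as timed out.
    for cell in cells:
        if cell not in results:
            results[cell] = CellTimeout("Timeout waiting for response from cell")
    return results
```

In this sketch a slow cell does not block the fast ones; the caller gets partial results plus a timeout marker, which is roughly the situation the warning above describes.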

Looking at logstash it seems that this happens quite often on various jobs: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Timed%20out%20waiting%20for%20response%20from%20cell%5C%22
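
If someone wants to check a downloaded scheduler log locally instead of using logstash, a rough sketch like the one below could count these warnings per host. The regular expression is an assumption based only on the two log lines quoted above, and `count_cell_timeouts` is a hypothetical helper, not part of nova or any OpenStack tool.

```python
import re
from collections import Counter

# Pattern assumed from the WARNING lines quoted in this report.
CELL_TIMEOUT_RE = re.compile(
    r"^\w{3}\s+\d{1,2} [\d:.]+ (?P<host>\S+) nova-scheduler\[\d+\]: "
    r"WARNING nova\.context \[\S+ (?P<req_id>req-[0-9a-f-]+)[^\]]*\] "
    r"Timed out waiting for response from cell"
)


def count_cell_timeouts(lines):
    """Return a Counter mapping host name to number of cell timeout warnings."""
    counts = Counter()
    for line in lines:
        match = CELL_TIMEOUT_RE.match(line)
        if match:
            counts[match.group("host")] += 1
    return counts
```

Traceback lines are logged at ERROR level, so they are not counted; only the initial WARNING for each timeout is.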

Matt Riedemann (mriedem) wrote:

It's a known issue, see bug 1844929.
