[Ironic] Nova compute will fail to start if it can not talk to the Ironic API

Bug #1675732 reported by Lucas Alvares Gomes
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: In Progress
Importance: Medium
Assigned to: Lucas Alvares Gomes
Milestone: (none)

Bug Description

This can happen during an upgrade. The Ironic driver in nova retries the Ironic API a fixed number of times (61 attempts in the log below). If the API has not become available by then, the whole nova-compute service stops with:

4210>: Failed to establish a new connection: [Errno 111] ECONNREFUSED',)). Attempt 60 of 61 from (pid=14540) wrapper /usr/local/lib/python2.7/dist-packages/ironicclient/common/http.py:201
2017-03-24 10:28:48.703 ERROR ironicclient.common.http [req-90c91d04-d6c6-47e1-98e9-3d03a63d63ca None None] Error contacting Ironic server: Unable to establish connection to http://192.168.122.5:6385/v1/nodes/detail: HTTPConnectionPool(host='192.168.122.5', port=6385): Max retries exceeded with url: /v1/nodes/detail (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7eff4aa9bf50>: Failed to establish a new connection: [Errno 111] ECONNREFUSED',)). Attempt 61 of 61
2017-03-24 10:28:48.704 ERROR oslo_service.service [req-90c91d04-d6c6-47e1-98e9-3d03a63d63ca None None] Error starting thread.
2017-03-24 10:28:48.704 TRACE oslo_service.service Traceback (most recent call last):
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/oslo_service/service.py", line 722, in run_service
2017-03-24 10:28:48.704 TRACE oslo_service.service service.start()
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/service.py", line 162, in start
2017-03-24 10:28:48.704 TRACE oslo_service.service self.manager.pre_start_hook()
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/compute/manager.py", line 1166, in pre_start_hook
2017-03-24 10:28:48.704 TRACE oslo_service.service startup=True)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/compute/manager.py", line 6608, in update_available_resource
2017-03-24 10:28:48.704 TRACE oslo_service.service nodenames = set(self.driver.get_available_nodes())
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/virt/ironic/driver.py", line 610, in get_available_nodes
2017-03-24 10:28:48.704 TRACE oslo_service.service self._refresh_cache()
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/virt/ironic/driver.py", line 566, in _refresh_cache
2017-03-24 10:28:48.704 TRACE oslo_service.service for node in self._get_node_list(detail=True, limit=0):
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/virt/ironic/driver.py", line 485, in _get_node_list
2017-03-24 10:28:48.704 TRACE oslo_service.service node_list = self.ironicclient.call("node.list", **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/opt/stack/nova/nova/virt/ironic/client_wrapper.py", line 146, in call
2017-03-24 10:28:48.704 TRACE oslo_service.service return self._multi_getattr(client, method)(*args, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/ironicclient/v1/node.py", line 137, in list
2017-03-24 10:28:48.704 TRACE oslo_service.service limit=limit)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/ironicclient/common/base.py", line 149, in _list_pagination
2017-03-24 10:28:48.704 TRACE oslo_service.service resp, body = self.api.json_request('GET', url)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/ironicclient/common/http.py", line 552, in json_request
2017-03-24 10:28:48.704 TRACE oslo_service.service resp = self._http_request(url, method, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/ironicclient/common/http.py", line 190, in wrapper
2017-03-24 10:28:48.704 TRACE oslo_service.service return func(self, url, method, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/ironicclient/common/http.py", line 525, in _http_request
2017-03-24 10:28:48.704 TRACE oslo_service.service raise_exc=False, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/positional/__init__.py", line 101, in inner
2017-03-24 10:28:48.704 TRACE oslo_service.service return wrapped(*args, **kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py", line 616, in request
2017-03-24 10:28:48.704 TRACE oslo_service.service resp = send(**kwargs)
2017-03-24 10:28:48.704 TRACE oslo_service.service File "/usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py", line 690, in _send_request
2017-03-24 10:28:48.704 TRACE oslo_service.service raise exceptions.ConnectFailure(msg)
2017-03-24 10:28:48.704 TRACE oslo_service.service ConnectFailure: Unable to establish connection to http://192.168.122.5:6385/v1/nodes/detail: HTTPConnectionPool(host='192.168.122.5', port=6385): Max retries exceeded with url: /v1/nodes/detail (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7eff4aa9bf50>: Failed to establish a new connection: [Errno 111] ECONNREFUSED',))
2017-03-24 10:28:48.704 TRACE oslo_service.service

---

I don't believe that is the right behavior. If the Ironic nova driver tries to fetch the nodes from the Ironic service but the service is not available, I think it should log the error and just return an empty list of nodes.

This happens in the get_available_nodes() call of the driver, which runs periodically in nova, so the call will be retried later once the Ironic API is available again.
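The behavior proposed above could look roughly like the following. This is a minimal, self-contained sketch and not the actual nova patch: `FakeIronicClient`, `SketchDriver`, `list_nodes()` and the local `ConnectFailure` class are hypothetical stand-ins invented for illustration (the real exception in the traceback comes from keystoneauth1, and the real driver goes through `_refresh_cache()`).

```python
import logging

LOG = logging.getLogger(__name__)


class ConnectFailure(Exception):
    """Hypothetical stand-in for the ConnectFailure seen in the traceback."""


class FakeIronicClient(object):
    """Hypothetical client whose API can be switched between up and down."""

    def __init__(self, nodes, available=True):
        self.nodes = nodes
        self.available = available

    def list_nodes(self):
        # Simulate the connection-refused case from the bug report.
        if not self.available:
            raise ConnectFailure("Unable to establish connection")
        return list(self.nodes)


class SketchDriver(object):
    """Hypothetical driver showing only the proposed error handling."""

    def __init__(self, client):
        self.client = client

    def get_available_nodes(self):
        try:
            return self.client.list_nodes()
        except ConnectFailure as exc:
            # Log and report no nodes instead of letting the exception
            # propagate and stop the service; the caller is expected to
            # invoke this method again on its next periodic run.
            LOG.error("Ironic API unavailable, reporting no nodes: %s", exc)
            return []
```

With this shape, a call made while the API is down yields an empty list, and a later periodic call picks up the nodes once the API is reachable again.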

[UPDATE]

Apparently we had a similar bug in the past: https://bugs.launchpad.net/nova/+bug/1430616

Tags: ironic
Changed in nova:
assignee: nobody → Lucas Alvares Gomes (lucasagomes)
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/449587

Changed in nova:
status: New → In Progress
Matt Riedemann (mriedem)
tags: added: ironic
Changed in nova:
importance: Undecided → Low
importance: Low → Medium
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Lucas Alvares Gomes (<email address hidden>) on branch: master
Review: https://review.openstack.org/449587
