I have checked the failing tempest run from [1].
We see errors in the ovn metadata agent log [3] for a request from the instance (10.100.0.12):
2019-04-23 23:27:42.513 30579 DEBUG networking_ovn.agent.metadata.server [-] Request: GET /2009-04-04/meta-data/instance-id HTTP/1.0
Accept: */*
Connection: close
Content-Type: text/plain
Host: 169.254.169.254
User-Agent: curl/7.24.0 (x86_64-pc-linux-gnu) libcurl/7.24.0 OpenSSL/1.0.0j zlib/1.2.6
X-Forwarded-For: 10.100.0.12
X-Ovn-Network-Id: 6a2c1465-129d-4eba-b24d-856327a0ba47 __call__ /usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/server.py:64
2019-04-23 23:27:42.514 30579 DEBUG networking_ovn.agent.metadata.server [-] Request to Nova at overcloud.internalapi.localdomain:8775 _proxy_request /usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/server.py:101
2019-04-23 23:27:42.514 30579 DEBUG networking_ovn.agent.metadata.server [-] {'X-Forwarded-For': '10.100.0.12', 'X-Instance-ID-Signature': '3ab84648c2acfc5963a8ec7ebc9af0edddc76348de1aab25dd0db579c6f85667', 'X-Tenant-ID': '59fb6d91d57e443a8b4887d94e9653f4', 'X-Instance-ID': '521734e6-27d9-4867-bc60-ceacc55aa22b'} _proxy_request /usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/server.py:102
2019-04-23 23:27:47.107 30404 DEBUG ovsdbapp.backend.ovs_idl.event [-] SbGlobalUpdateEvent : Matched SB_Global, ('update',), None None matches /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/event.py:40
2019-04-23 23:27:47.108 30404 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): UpdateChassisExtIdsCommand(if_exists=True, external_ids={'neutron:ovn-metadata-sb-cfg': '49'}, name=893bfbb4-63c7-40b0-a236-6ebc2313f593) do_commit /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84
2019-04-23 23:27:47.449 30404 DEBUG ovsdbapp.backend.ovs_idl.event [-] SbGlobalUpdateEvent : Matched SB_Global, ('update',), None None matches /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/event.py:40
2019-04-23 23:27:47.450 30404 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): UpdateChassisExtIdsCommand(if_exists=True, external_ids={'neutron:ovn-metadata-sb-cfg': '50'}, name=893bfbb4-63c7-40b0-a236-6ebc2313f593) do_commit /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server [-] Unexpected error.: error: [Errno 110] ETIMEDOUT
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server Traceback (most recent call last):
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/server.py", line 68, in __call__
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server return self._proxy_request(instance_id, project_id, req)
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/server.py", line 119, in _proxy_request
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server body=req.body)
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server File "/usr/lib/python2.7/site-packages/httplib2/__init__.py", line 1621, in request
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server File "/usr/lib/python2.7/site-packages/httplib2/__init__.py", line 1363, in _request
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server (response, content) = self._conn_request(conn, request_uri, method, body, headers)
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server File "/usr/lib/python2.7/site-packages/httplib2/__init__.py", line 1284, in _conn_request
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server conn.connect()
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server File "/usr/lib/python2.7/site-packages/httplib2/__init__.py", line 934, in connect
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server raise socket.error, msg
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server error: [Errno 110] ETIMEDOUT
2019-04-23 23:27:47.592 30579 ERROR networking_ovn.agent.metadata.server
2019-04-23 23:27:47.596 30579 INFO eventlet.wsgi.server [-] Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/eventlet/wsgi.py", line 578, in handle_one_response
write(b''.join(towrite))
File "/usr/lib/python2.7/site-packages/eventlet/wsgi.py", line 519, in write
wfile.flush()
File "/usr/lib64/python2.7/socket.py", line 303, in flush
self._sock.sendall(view[write_offset:write_offset+buffer_size])
File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 401, in sendall
tail = self.send(data, flags)
File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 395, in send
return self._send_loop(self.fd.send, data, flags)
File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 382, in _send_loop
return send_method(data, *args)
error: [Errno 32] Broken pipe
We don't see any incoming request to the nova metadata api service on any of the controllers.
In haproxy.log on each controller, only the local metadata service backend is ever reported as up, which should still be enough to serve the request:
- controller-0:
Apr 23 23:18:07 overcloud-controller-0 haproxy[12]: Server nova_metadata/overcloud-controller-0.internalapi.localdomain is UP, reason: Layer7 check passed, code: 200, info: "OK", check duration: 2ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
- controller-1:
Apr 23 23:18:01 overcloud-controller-1 haproxy[13]: Server nova_metadata/overcloud-controller-1.internalapi.localdomain is UP, reason: Layer7 check passed, code: 200, info: "OK", check duration: 1ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
- controller-2:
Apr 23 23:18:01 overcloud-controller-2 haproxy[13]: Server nova_metadata/overcloud-controller-2.internalapi.localdomain is UP, reason: Layer7 check passed, code: 200, info: "OK", check duration: 1ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
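Since haproxy reports the backends as up while the agent gets ETIMEDOUT, a plain TCP connect from the compute node can help tell a filtered port (timeout) from a stopped service (connection refused). A minimal sketch, assuming Python 3 is available on the node; probe_tcp is a hypothetical helper name, not part of networking-ovn, and the 5 second default timeout is an arbitrary choice:

```python
import errno
import socket


def probe_tcp(host, port, timeout=5.0):
    """Classify a TCP connect attempt as 'open', 'refused', 'timeout' or 'error:<name>'."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        # No answer within the timeout: typical for packets dropped by a firewall.
        return "timeout"
    except OSError as exc:
        if exc.errno == errno.ECONNREFUSED:
            # The host answered with a reset: port reachable, nothing listening.
            return "refused"
        if exc.errno == errno.ETIMEDOUT:
            return "timeout"
        return "error:%s" % errno.errorcode.get(exc.errno, exc.errno)
    sock.close()
    return "open"
```

For example, probe_tcp("overcloud.internalapi.localdomain", 8775) run on the compute node would be expected to report a timeout if the port is dropped by iptables on the controllers.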
The issue is that the nova_metadata API iptables rules (ports 8775/13775) defined in [5] are missing, e.g. in [4]. [5] references 'tripleo::nova_placement::firewall_rules', which is wrong.
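To check other controllers for the same gap, the iptables file can be scanned for ACCEPT rules covering the two ports. This is only a rough sketch: the function names are hypothetical, it assumes the iptables-save style format used in /etc/sysconfig/iptables, and it only understands "-p tcp" rules with --dport/--dports (negated matches and other protocol spellings are ignored).

```python
import re

# Matches "--dport 8775" as well as "--dports 8775,13775" or "--dports 8000:9000".
_DPORT_RE = re.compile(r"--dports?\s+(\S+)")


def accepted_tcp_ports(rules_text):
    """Return (low, high) destination port ranges of tcp ACCEPT rules."""
    ranges = []
    for line in rules_text.splitlines():
        line = line.strip()
        # Only appended rules that accept tcp traffic are of interest here.
        if not line.startswith("-A") or "-j ACCEPT" not in line:
            continue
        if "-p tcp" not in line:
            continue
        match = _DPORT_RE.search(line)
        if not match:
            continue
        for item in match.group(1).split(","):
            low, _, high = item.partition(":")
            ranges.append((int(low), int(high or low)))
    return ranges


def missing_ports(rules_text, wanted=(8775, 13775)):
    """Return the wanted ports that no tcp ACCEPT rule covers."""
    ranges = accepted_tcp_ports(rules_text)
    return [port for port in wanted
            if not any(low <= port <= high for low, high in ranges)]
```

With the file from [4] downloaded locally, something like missing_ports(gzip.open(path, "rt").read()) should list both ports if the rules are absent.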
I'll create a new LP for this and submit a patch.
Just a side note: designate uses the same ID 139 [6].
[1] http://logs.rdoproject.org/20/655120/1/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053/37cc755/logs/tempest.html.gz
[2] http://logs.rdoproject.org/20/655120/1/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053/37cc755/logs/overcloud-novacompute-0/var/log/containers/nova/nova-compute.log.txt.gz
[3] http://logs.rdoproject.org/20/655120/1/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053/37cc755/logs/overcloud-novacompute-0/var/log/containers/neutron/ovn-metadata-agent.log.txt.gz
[4] http://logs.rdoproject.org/20/655120/1/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053/37cc755/logs/overcloud-controller-0/etc/sysconfig/iptables.gz
[5] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/nova/nova-metadata-container-puppet.yaml#L135
[6] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/experimental/designate/designate-api-container-puppet.yaml#L97-L101
Fix proposed to branch: master
Review: https://review.opendev.org/655389